Data-Driven Decisions

A company that wants to start taking advantage of existing and latent information within itself has a long way to go. It is, unfortunately, necessary to wade through a swamp of concepts and buzzwords such as “Business Intelligence,” “Data Science,” “Big Data,” and so on. Management may want to build “Data Warehouses” and store more of the data that is produced in the organization. But then we wake up from the dream.

All of this data locked into various databases just costs money if no one touches it. The database administrators might not let people interact with it because the database is not capable of executing the required queries. The lucky ones who obtain data can deliver information about the data itself, answering questions like “how many?”, “how often?” and “where?”. Using tools from the Business Intelligence domain, charts and tables are presented as reports, and the user has to gain insights by drilling through the heaps and dimensions of information. This is the lowest level of refinement.

So, when embarking on the endeavor to improve the business value of reports through analysis, companies should not start by looking at how existing data could be utilized, how to collect new data, or how to store it. These issues should, of course, be addressed later on, but they are not the first priority.

Think backward instead. Start by thinking about what the organization needs to know to work better. It is up to the domain experts of the company to find this out. They should be aware of some low-hanging fruit already. Then continue by sketching the process backward. If the management of the company needs to know something to be able to take better actions, what is required to produce that information? Is the required data available, or can it be created from other data? And so on. Think about what is already there (an inventory might be needed). What new metrics are needed, and are they measurable? Is the development of new measurement methodologies required? Does a given measurement method return what it is expected to?

There are several technical challenges on the way, which could involve the “four Vs” of Big Data: volume (how much?), velocity (how often?), veracity (can the data be trusted?) and variety (is the data heterogeneous?). There are also soft challenges that need to be solved, like tribalism in the organization, the attitude of people towards change, and the tearing down of organizational data silos.

For a data-driven system to shine, it needs to be able to deliver insights which represent deviations outside the normal range of operation. Based on the insights, the system should provide a set of recommended actions to improve the situation. The recommendations are presented together with an analysis of the impact each action would have if applied. This way, decisions are documented along with the facts available at the moment of decision.
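
As a rough illustration of the idea in the paragraph above, here is a minimal Python sketch. It assumes that the "normal range" is defined as a number of standard deviations around the historical mean, which is only one of many possible definitions. The names `Recommendation`, `PLAYBOOK`, `recommend` and the metric `daily_returns` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Optional


@dataclass
class Recommendation:
    metric: str
    observed: float
    z_score: float         # distance from the historical mean, in standard deviations
    action: str            # recommended action (from the hypothetical playbook)
    expected_impact: str   # impact estimate presented together with the action


# Hypothetical playbook: (metric, direction) -> (action, expected impact).
PLAYBOOK = {
    ("daily_returns", "high"): (
        "inspect latest shipment batch",
        "fewer returns within a week",
    ),
}


def recommend(metric, history, observed, threshold=3.0) -> Optional[Recommendation]:
    """Return a recommendation if `observed` is outside the normal range, else None."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return None  # no variation in the history, so no range to compare against
    z = (observed - mu) / sigma
    if abs(z) < threshold:
        return None  # within the normal range: nothing to report
    direction = "high" if z > 0 else "low"
    action, impact = PLAYBOOK.get(
        (metric, direction), ("investigate manually", "unknown")
    )
    return Recommendation(metric, observed, z, action, impact)


if __name__ == "__main__":
    history = [100, 102, 98, 101, 99, 100, 103, 97]
    print(recommend("daily_returns", history, 110))
```

Storing the returned record together with the decision that was finally taken is one simple way to document the facts that were available at the moment of decision.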

Basing decisions on data is nothing new, but new technology helps to make data available sooner than previously. Having an expert system recommend actions and predict the outcome can be a real time and money saver. Fully automating the decision making as well is, however, an entirely different beast. Humans like being in control, and it is very hard to verify the correctness of wholly automated decision systems. Humans can help by introducing some suspicion and common sense when recommendations do not make sense. Extra scrutiny can then be called upon to make sure there are no strange artifacts in the underlying data.

Data-driven decisions will most certainly gain traction in the future as increasingly complex decisions can be handled. Technology is seldom a stopper. The number crunching can be solved. Company culture and politics are the primary stoppers and must not be forgotten when setting up a new project.

Special Announcement: We are hiring!

image from stocksnap.io/photo/DDYC9U7O2P

Hi all! You may have been following the blog for some time, or this may be your first visit. But just the fact that you are here, taking some time to read and learn new things, is awesome. Well, guess what: we want to hire awesome people who enjoy learning new things. More specifically, about data science.

Now, what do we mean by “data science”? There are a number of tasks that every person working with data must have done: at some point you must have decided what kind of data you needed for your task; you must have had an idea for how to collect and transform that data into a form to which you can apply some nice math / statistics; and finally you must have found a way to summarize interesting aspects of the whole ordeal.

Some people find different parts of this process more comfortable than others. Some people are better at communicating and telling stories, while others are more into applying math to the data and seeing what comes out; others just love designing and optimizing experiments; others love building tools for data extraction and conditioning. And the thing is: all of this is data science. All of it. And when a company wants to hire a person in data science, it is important to specify what they want, because all these tasks require different sets of abilities which are very difficult to find in a single individual.

We are putting together a team. The engineers we are looking for thrive as team players and have a genuine interest in data analysis, statistics and mathematics. If you are one of them, you have built a sufficiently good programming base (preferably in Python and/or R) which allows you to learn and test ideas by yourself. We want you to be free to mold your mind into what you want, and have a good time.

The Team Roles: 

Data Analyst: Is concerned with looking at the data and the context around the data, and telling a story by analysing both. This engineer knows what type of visualization is best to use, for which audience, and how to build it. This engineer can also choose the best statistics to summarise a dataset, and can help organizations build suitable Key Performance Indicators.

Data Engineer: Is concerned with extracting and molding data, recommending and building data hygiene methods, and choosing computing frameworks to work with data. Data nowadays can be found in all kinds of formats and be stored in different ways. It can be streaming or offline. The needs of a client may be satisfied by optimising how programs process the data, or they may push the envelope of what can be accomplished by CPUs, possibly making this engineer look into GPUs. This engineer is responsible for providing quality data to which data analysis and data science can be applied.

Data Scientist: Is concerned with building quantitative models out of data, setting and verifying the assumptions for the models, testing and maintaining models, and choosing data collection strategies and instruments. This engineer knows that things can go south very quickly when the underlying assumptions for a model are no longer valid, and is responsible for clearly communicating to the team what those assumptions are.

Must-have qualifications:

  • M.Sc. in engineering with a focus on one or more of the following disciplines: statistics, mathematics, computer science, applied physics, mechatronics, electrical/electronic/nuclear engineering.
  • Fluent English
  • Experience in Python or R.

The Attitude:

  • You like having fun!
  • You are friendly!
  • You respect and trust your team as much as your own knowledge.
  • You shoot for the stars, yet can gracefully land on the moon if needed.
  • You learn on your own, yet you ask for help when you need to.
  • You don’t take all statements for facts: when reason exists, you verify ground truths and communicate your findings to those concerned.

Nice-to-have:

  • Experience from working in teams
  • Customer-oriented experience
  • Experience with community projects (GitHub, CRAN, StackExchange community, etc.)

If this sounds a lot like you, do not hesitate to apply with your CV and cover letter here:

https://www.linkedin.com/jobs2/view/254519957