Why data scientists are not data engineers

What types of data scientists are there?

Think of all the strengths you might need to bring as a data scientist. You can have a background in statistics, since you sometimes develop new statistical approaches for large amounts of data. You can perform statistical modeling, experimental design, data reduction, sampling, clustering, testing, and predictive modeling. You may have a solid background in math that helps you perform analytical business optimization. You may also have developed some business acumen from tackling ROI optimization and decision science topics. On top of that, you can be a data engineer who tweaks architectures and data flows and who knows their way around machine learning or production code development. And you may also be strong at visualization and deal with topics such as geospatial data and graph databases.

The "Big Three" in data science

Data science is a highly differentiated field that keeps diversifying. Depending on the taxonomy, you can count ten or more different types of data scientist today. But the classics remain, and they can be captured in the "big three" model of data science: the data analyst, the data engineer, and the data scientist.

The data analyst

A great analyst is a prerequisite for the success of all of your data activities. A solid background in statistics is what gives rigor to data-driven decision making. A data analyst examines industry data to answer business-related questions and delivers those answers to the relevant teams. Data analysts transform large data sets, form hypotheses and communicate these to company decision-makers. This requires a keen sense of the processes that take place outside of the data. Complex data analyses and insights must be conveyed clearly to an audience without prior knowledge. So the data analyst examines the data, which includes cleansing and statistical analysis, and then visualizes the data to articulate the results. As a data analyst, you stick to the facts: that means dealing with the specific task, answering business questions and gaining insights from existing data.
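
To make the "cleansing and statistical analysis" step concrete, here is a minimal sketch in Python using only the standard library. The order values and the helper name `clean` are made up for illustration; a real analysis would typically pull data from a database or a BI tool.

```python
from statistics import mean, median


def clean(values):
    """Drop missing or non-numeric entries and convert the rest to float."""
    cleaned = []
    for value in values:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            # None raises TypeError, text such as "n/a" raises ValueError
            continue
    return cleaned


# Hypothetical raw export with a missing value and a placeholder string
raw_order_values = ["120.5", None, "80", "n/a", "99.5"]

order_values = clean(raw_order_values)
summary = {
    "count": len(order_values),
    "mean": mean(order_values),
    "median": median(order_values),
}
print(summary)
```

The summary is the kind of fact an analyst would then visualize and report to decision-makers.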

The data scientist

This role can include anything from analyzing data to building machine learning models that predict future developments based on historical data. In contrast to analysts, data scientists are not limited to answering predefined questions; they have more room to develop their own ideas and to discover structures in the data. To identify these patterns, they analyze large amounts of complex structured and unstructured data. For example, a data scientist would perform complex evaluations and build and train a machine learning model. In addition to a solid background in statistics, a data scientist is often also trained in supervised and unsupervised machine learning methods.
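
As a rough illustration of "predicting future developments based on historical data", the sketch below fits a straight line to a tiny, invented sales history with ordinary least squares and extrapolates one month ahead. The function name `fit_line` and the numbers are assumptions for this example; in practice a data scientist would usually reach for a dedicated machine learning library and far more data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: month number and sales in thousands
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)

# Use the trained model to predict the next, unseen month
forecast_month_6 = slope * 6 + intercept
print(slope, intercept, forecast_month_6)
```

This is supervised learning in its simplest form: the model is trained on known input/output pairs and then applied to a new input.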

The data engineer

This is a software development-intensive role that thrives on programming skills and the ability to make data usable for data scientists. Data engineers manage large data sets and handle data cleansing, aggregation and ETL processes, but they also set up data pipelines to pass the data on to the analysts and scientists within a company. In this role, you mainly deal with data collection tasks and the batch or real-time processing of collected data. Typically, you are also responsible for developing, building, testing, and maintaining the infrastructure that enables data to be stored and accessed. You also improve data quality and reliability.
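
The ETL (extract, transform, load) idea mentioned above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the CSV content, the table name `customer_totals` and the three function names. A production pipeline would read from real sources and typically run under a scheduler or a streaming framework.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; note the missing amount for bob
RAW_CSV = """customer,amount
alice,10.50
bob,
alice,4.50
carol,7.00
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows without an amount and aggregate per customer."""
    totals = {}
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # data cleansing: drop incomplete records
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + float(amount)
    return totals


def load(totals, conn):
    """Load: write the aggregated totals into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)", totals.items()
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
load(transform(extract(RAW_CSV)), conn)

result = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(result)
```

Analysts and scientists would then query the resulting table instead of the raw files.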

Branches of data science

However, this is not the whole picture. The big three have diversified into several specialty functions, some of which are not even considered part of the field of data science as such. Let's look at some of them:

The most popular

  • Machine learning engineer. At the interface between software engineering and data science, machine learning engineers are proficient in a variety of software tools and are well versed in providing practical software solutions. A machine learning engineer takes the theoretical model proposed by the data scientist and makes it usable in a production environment. MLEs create programs that control devices and develop algorithms that help machines identify patterns in their data, understand commands, and even learn to make their own decisions.
  • Machine learning scientist. Unlike machine learning engineers, who specialize in building machine learning infrastructures, machine learning scientists focus on researching new approaches and algorithms. A machine learning scientist's outputs are typically reports and white papers.
  • Statistician. Statisticians work in theoretical and applied statistics with a view to corporate goals. Using mathematical techniques, they analyze and evaluate statistical information and draw business-relevant conclusions from the data.
  • Business intelligence developer. Using BI tools or building their own applications for BI analytics, BI developers work on strategies that help companies improve their decision-making processes.

The architects

  • Data architect. Data architects develop, construct and maintain a company's data architecture solution and ensure the high availability of company data. They create databases, develop structural and installation solutions and write design reports.
  • (Big Data / Cloud) infrastructure architect. Guided by the company's big data, cloud computing or general data strategy, the infrastructure architect translates business requirements into specific system applications or process designs for IT solutions. The infrastructure architect ensures that the business systems work, that the necessary system requirements are met and that the systems are able to support new technologies.
  • Enterprise architect. To ensure that a company uses the right technology and system architecture to successfully implement its business strategies, the enterprise architect maps out the company's strategies and processes.
  • Application architect. This role includes the design and creation of new applications, as well as monitoring the behavior of existing applications within an organization. Application architects develop product prototypes, run tests, see how their applications interact with users, and create application development manuals.

As data analysis methods become more powerful and the volume of data collected continues to rise to unprecedented levels, the number and variety of roles in data science will continue to grow rapidly.

