Module 01 - The Role of Analytics
The first module of the introduction to data engineering consists of 9 videos. During this module, we will get acquainted with the subject of study and learn about key roles in data and what they do, as well as the other names they go by.
Most importantly, we will understand how they help businesses become more efficient and earn money. We will delve into typical architectures of analytical solutions and begin with the most basic exercise: analyzing sales data in a spreadsheet.
Module 1.1 Introduction
This module builds the theoretical foundation for the rest of the course and explains the role of analytics and of the data engineer in an organization. Before learning to work with the tools, it is very important to understand how a business operates, how it uses data, and how data can benefit it. Typical architectural solutions and job vacancies for the data engineer role will be discussed.
Link to video To Be Added (TBA)
Module 1.2 The Role of Analytics in an Organization
An organization exists to create value. There are 3 key groups that a particular business can benefit:
- Business owners (shareholders);
- Employees;
- Customers.
The most important group is the customers, as many modern companies operate on the principle of “customer obsession”.
For a business to grow, it needs to create more value for each of these groups. For customers, for instance, this means enhancing the customer experience. For employees, it involves a healthy work-life balance and a competitive salary. For the owners, it means generating income.
For the business to grow and for these groups to perform their roles, decision-making is crucial. To make decisions, data is needed. Data can be in its raw form (raw data) or in the form of organized information (raw data that has been organized). One of the tasks of an analyst or engineer is to provide data to the groups described above for further decision-making. Therefore, it’s very important to understand exactly how the work done by the different data roles affects what happens in the business.
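As a minimal illustration of the difference between raw data and organized information, the sketch below turns a handful of raw sales records into totals per category. The records, the field names, and the `summarize_by_category` helper are hypothetical and exist only for this example; real data would come from a spreadsheet or a source system.

```python
from collections import defaultdict

# Hypothetical raw data: one record per order, as it might be exported from a sales system.
raw_sales = [
    {"order_id": 1, "category": "Furniture", "amount": 250.0},
    {"order_id": 2, "category": "Technology", "amount": 900.0},
    {"order_id": 3, "category": "Furniture", "amount": 150.0},
    {"order_id": 4, "category": "Office Supplies", "amount": 40.0},
]


def summarize_by_category(rows):
    """Organize raw rows into information: total sales per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(totals)


summary = summarize_by_category(raw_sales)

# Print categories from highest to lowest total, the view a decision-maker would read.
for category, total in sorted(summary.items(), key=lambda item: item[1], reverse=True):
    print(f"{category}: {total:.2f}")
```

The raw list answers "what happened in each order", while the summary answers a business question such as "which category sells the most", which is the form in which data usually reaches the people making decisions.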
Module 1.3 Analytics Tasks
Analytics is the part of the business that uses data to produce the information on which decisions about running the business effectively are based. Analytics is necessary for:
- Increasing profit. If the analytical solution helps to make money, then everything is good. If not, there’s a problem somewhere.
- Reducing costs. Monitoring expenses helps save money.
- Exploring new markets or products.
- Compliance with requirements.
- Risk avoidance.
Link to video To Be Added (TBA)
Module 1.4 Key Roles in Analytics
There are many roles in data. Let’s try to group them together:
Traditional roles:
- BI Developer - works with reports, dashboards, implementing BI solutions (Tableau, Power BI, SAP). Often includes providing business recommendations (business insights).
- ETL/ELT developer - works with data integration (“techies”).
- Report Developer - same as BI Developer.
- Data mining specialist -
- Data Analyst - a BI developer with a focus on data analysis and data exploration (deep dives into the data and the business problem).
- DW Developer/Architect - works with the solution and its architecture (how it looks and functions).
- Data Modeler - engages with business processes of the organization and creates a data model for the future data warehouse.
Data Engineering roles:
- Data Engineer (classical understanding of a data engineer).
- Big Data Engineer - works with solutions related to non-relational databases.
- Cloud DE - works with cloud-based solutions.
- Data Platform Engineer - works with data warehouse and data lake related solutions.
Specialized category (Data Science, IT):
- Software Development Engineer - strong knowledge of algorithms and data structures. Sometimes involves work with Big Data.
- Machine Learning Engineer - proficient in mathematics, programming, and libraries/frameworks for deep learning.
- Visual Engineer - works with data visualization using programming languages.
- Applied Scientist.
- Research Scientist.
Advanced analytics category (forecasting elements), Data Science roles:
- Data mining, Data Science - strong knowledge of mathematics, statistics, and programming.
- Data Analyst = Data Science.
The course almost entirely covers the traditional category and the data engineer category.
Module 1.5 Analytics and data engineering MindMap
For roles like data engineers, BI engineers, and data analysts, a comprehensive MindMap would also cover aspects related to their specific responsibilities, tools, and areas of focus.
Let’s put together the key terms in data:
- Data Modeling: Understanding and designing the structure in which data will be stored and accessed.
- Data Cleaning and Quality: Ensuring that data is accurate, reliable, and usable.
- Visualization Tools: Tools like Tableau, Power BI, Looker, etc., which are crucial for BI engineers and data analysts.
- Data Governance: Establishing rules, policies, and standards for data usage and quality.
- Reporting: Creating regular reports for business users that provide insights from data.
- Machine Learning: While traditionally the realm of data scientists, many data engineers are now required to deploy and sometimes even develop ML models.
- Data Pipelines: Automated processes that move and transform data from one system to another.
- Performance Tuning: Ensuring that queries run efficiently and databases operate at optimal speeds.
- APIs and Integrations: Especially for data engineers who might work on integrating various data sources.
- Data Security and Compliance: Ensuring that data is secure and that data handling practices comply with regulations like GDPR or CCPA.
- Collaboration Tools: Platforms like GitHub, Jira, or others that professionals use to collaborate on projects.
- Statistical Analysis: Particularly for data analysts, understanding and applying statistical methods to extract insights.
- Data Strategy: Understanding the broader strategy for how data will be used within an organization.
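Two of the terms above, data pipelines and data cleaning, can be made concrete with a small extract-transform-load sketch. Everything here is an assumption made for illustration: the CSV layout, the `extract`, `transform`, and `load` function names, the cleaning rules, and the plain Python list standing in for a warehouse table. A real pipeline would read from actual source systems and write to a database.

```python
import csv
import io

# Hypothetical CSV export from a source system (assumed layout, illustration only).
SOURCE_CSV = """order_id,customer,amount
1,Alice,100.50
2,Bob,
3,alice ,20.00
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply simple data quality rules and convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # quality rule: skip rows with a missing amount
        cleaned.append({
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().title(),  # normalize spacing and case
            "amount": float(row["amount"]),
        })
    return cleaned


def load(rows, target):
    """Load: append cleaned rows to the target and report how many were written."""
    target.extend(rows)
    return len(rows)


warehouse_table = []  # stand-in for a table in a data warehouse
loaded = load(transform(extract(SOURCE_CSV)), warehouse_table)
print(f"Loaded {loaded} rows")
```

Keeping the three steps as separate functions mirrors how the work is usually divided: each step can be tested, replaced, or scheduled on its own.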
For each of the roles:
- Data Engineer: Focus on building, maintaining, and optimizing the flow of data from diverse sources.
- BI Engineer: Concentrate on designing and implementing the systems and tools that allow end-users to analyze the data.
- Data Analyst: Aim at processing and analyzing data to extract insights, and often at visualizing or presenting it in a form that’s accessible to business stakeholders.