The Future of the Data Engineer. Is the data engineer still the "worst seat at the table"?
- by Simon Harris
- May 14, 2023
- 2 mins
A few years ago, we had our first go at developing a concrete Data Engineering strategy.
An article I read this morning was validating: it touches on many of the challenges we identified and tried to address as part of our Data “Playbook”:
- Data engineers operate on myriad fronts, so to any one partner or stakeholder it can seem like people are always working on other things.
- The Data Warehouse reflects the organisation. Chaos in, chaos out. Lack of consensus in, lack of consensus out.
- Data as a product, explicit and distributed governance and use of data, and modern tooling.
- Move away from just getting things done towards more traditional engineering practices. That takes time.
Some of the data engineer’s biggest challenges: the job was hard, the respect was minimal, and the connection between their work and the actual insights generated was obvious but rarely recognized. Being a data engineer was a thankless but increasingly important job, with teams straddling building infrastructure, running jobs, and fielding ad-hoc requests from the analytics and BI teams. As a result, being a data engineer was both a blessing and a curse. In fact, in Maxime’s opinion, the data engineer was the “worst seat at the table.”
It’s widely accepted that governance is distributed. Every team owns its own analytic domain, forcing decentralized team structures around broadly standardized definitions of what “good” data looks like.
The data warehouse is the mirror of the organization in many ways. If people don’t agree on what they call things in the data warehouse or what the definition of a metric is, then this lack of consensus will be reflected downstream.
It’s not necessarily the sole responsibility of the data team to find consensus for the business, particularly if the data is being used across the company in different ways.
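To make the point about metric definitions concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Order` type, the sample data, and the two "revenue" definitions are assumptions for illustration, not taken from the article. It only shows how two reasonable definitions of the same metric produce different numbers from the same data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    amount: float
    refunded: bool


# Hypothetical sample data, shared by both teams.
ORDERS = [
    Order(1, 100.0, False),
    Order(2, 50.0, True),
    Order(3, 25.0, False),
]


def revenue_finance(orders):
    """Assumed Finance definition: refunded orders are excluded."""
    return sum(o.amount for o in orders if not o.refunded)


def revenue_sales(orders):
    """Assumed Sales definition: every booked order counts."""
    return sum(o.amount for o in orders)


finance_total = revenue_finance(ORDERS)
sales_total = revenue_sales(ORDERS)

# Same table, same metric name, two different answers.
print(f"Finance revenue: {finance_total}")
print(f"Sales revenue:   {sales_total}")
```

Neither function is wrong. The disagreement is organisational, and the warehouse simply reflects it.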
Nowadays, data teams are increasingly relying on DevOps and software engineering best practices to build stronger tooling and cultures that prioritize communication and data reliability.
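As one small illustration of borrowing software engineering practices for data reliability, the sketch below treats a table load like code under test. The function name `check_rows`, the `loaded_at` field, and the thresholds are all assumptions made for this example. Real teams would more likely use a dedicated data quality tool, which the article does not prescribe.

```python
from datetime import datetime, timedelta, timezone


def check_rows(rows, required_fields, max_age, now=None):
    """Return a list of human-readable problems found in `rows`.

    `rows` is a list of dicts; each is assumed to carry a timezone-aware
    `loaded_at` timestamp (a hypothetical convention for this sketch).
    """
    now = now or datetime.now(timezone.utc)
    if not rows:
        return ["table is empty"]

    problems = []

    # Completeness: every required field must be present and non-null.
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                problems.append(f"row {i}: missing {field}")

    # Freshness: the newest row must be younger than `max_age`.
    loaded = [r["loaded_at"] for r in rows if r.get("loaded_at") is not None]
    if not loaded or now - max(loaded) > max_age:
        problems.append("stale or missing load timestamps")

    return problems
```

A check like this can run in CI or at the end of a pipeline, failing the job when the returned list is not empty, in the same way a failing unit test blocks a deploy.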
While data team reporting structures and operational hierarchies are becoming more and more vertical, the scope of the data engineer is becoming increasingly horizontal and focused on performance and reliability, which is ultimately a good thing.
With the rise of these new technologies and workflows, engineers also have a fantastic opportunity to own the movement towards treating data like a product.