The Future of the Data Engineer. Is the data engineer still the "worst"
- by Simon Harris
- May 14, 2023
- 2 mins
A few years ago, we had our first go at developing a concrete Data Engineering strategy.
An article I read this morning was validating, as it touches on many of the challenges we identified and tried to address as part of our Data “Playbook”:
- Data engineers operate on myriad fronts, so to any one partner or stakeholder it can seem like they are always working on other things.
- The Data Warehouse reflects the organisation. Chaos in, chaos out. Lack of consensus in, lack of consensus out.
- Data as a product, explicit and distributed governance and use of data, and modern tooling.
- Move away from just getting things done towards more traditional engineering practices. That takes time.
Some of the data engineer’s biggest challenges: the job was hard, the respect was minimal, and the connection between their work and the actual insights generated was obvious but rarely recognized. Being a data engineer was a thankless but increasingly important job, with teams straddling between building infrastructure, running jobs, and fielding ad-hoc requests from the analytics and BI teams. As a result, being a data engineer was both a blessing and a curse. In fact, in Maxime’s opinion, the data engineer was the “worst seat at the table.”
[…]
It’s widely accepted that governance is distributed. Every team has their own analytic domain they own, forcing decentralized team structures around broadly standardized definitions of what “good” data looks like.
[…]
The data warehouse is the mirror of the organization in many ways. If people don’t agree on what they call things in the data warehouse or what the definition of a metric is, then this lack of consensus will be reflected downstream.
[…]
It’s not necessarily the sole responsibility of the data team to find consensus for the business, particularly if the data is being used across the company in different ways.
[…]
Nowadays, data teams are increasingly relying on DevOps and software engineering best practices to build stronger tooling and cultures that prioritize communication and data reliability.
[…]
While data team reporting structures and operational hierarchies are becoming more and more vertical, the scope of the data engineer is becoming increasingly horizontal and focused on performance and reliability, which is ultimately a good thing.
[…]
With the rise of these new technologies and workflows, engineers also have a fantastic opportunity to own the movement towards treating data like a product.