Data-Free Data Engineering

We are working on better ways of doing data across the entire business, not just within data teams. We value:

That is, while there is value in the items on the right, we value the items on the left more.

Fundamentally, as a data team we don't actually want to work with data. The data is not ours to own, nor do we understand it as well as the producers of the data.

As a data team, we exist to build good-quality data platforms, tooling, and processes that enable the wider business to get value from its data. We are not here to manage your data, but we can provide consultancy and support for doing so.

This manifesto brings together ideas from data meshes [1], data hubs, NoOps, DataOps, and the latest generation of data platforms that we and our peers are working on. We see this rough evolution happening in many different companies, in both Europe and the USA:

What these more accessible and democratic data platforms look like is something we are still figuring out. It almost certainly involves the cloud, serverless data platforms [2], and automation [3] of these. It probably involves shared data dictionaries. It might involve some method of joining together disparate data sources, such as a federated query engine [4] [5] [6], or it might involve requiring standard data stores.
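To make "joining together disparate data sources" a little more concrete, here is a minimal sketch in Python. It is not how Drill, Presto, or Dremio work internally, and every name in it (the CSV contents, the `customers` table, the `total_spend_by_customer` helper) is a hypothetical we made up for illustration. It only shows the shape of the idea: two sources stay where their owners keep them, and the join happens at query time.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export owned by one team.
ORDERS_CSV = """order_id,customer_id,amount
1,c1,20.0
2,c2,35.5
3,c1,12.5
"""


def load_customers_db() -> sqlite3.Connection:
    """Hypothetical source 2: a relational store owned by another team."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("c1", "Ada"), ("c2", "Grace")],
    )
    return conn


def total_spend_by_customer(orders_csv: str, conn: sqlite3.Connection) -> dict:
    """Join CSV orders to the customers table at query time."""
    # Read the lookup side from the relational source.
    names = dict(conn.execute("SELECT customer_id, name FROM customers"))
    totals = {}
    # Stream the CSV source and aggregate order amounts per customer name.
    for row in csv.DictReader(io.StringIO(orders_csv)):
        name = names[row["customer_id"]]
        totals[name] = totals.get(name, 0.0) + float(row["amount"])
    return totals


totals = total_spend_by_customer(ORDERS_CSV, load_customers_db())
print(totals)
```

A real platform would push this work into a query engine rather than application code, but the ownership model is the same: neither team has to hand its data over to a central data team for the join to happen.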

---

Written by WheresAlice on 15 June 2020.

References

[1] data meshes (https://martinfowler.com/articles/data-monolith-to-mesh.html)

[2] serverless data platforms (https://aws.amazon.com/blogs/big-data/our-data-lake-story-how-woot-com-built-a-serverless-data-lake-on-aws/)

[3] automation (https://aws.amazon.com/blogs/big-data/build-and-automate-a-serverless-data-lake-using-an-aws-glue-trigger-for-the-data-catalog-and-etl-jobs/)

[4] Apache Drill (https://drill.apache.org/)

[5] Presto (https://prestodb.io/)

[6] Dremio (https://www.dremio.com/)

---