Designing Data Platforms Like BMW 🚗 Factories

Ju Data Engineering Weekly - Ep 48

Feb 07, 2024

I started my career not in data engineering but as a production engineer in the R&D department at BMW in Munich.

This was a valuable experience that let me learn a lot about one of the top production systems in the world.

It's known as one of the best because companies like BMW can make products that are customized on a large scale, which is rare.

At the time, BMW customers could customize their cars up to 15 days before production started, and BMW produces more than 2 million cars a year.

Truly remarkable.

Each BMW client can fully customize his car

While I was there, I ran simulation models of the factory's material workflows to make them more efficient.

This experience gave me a clear picture of how the system works.

In this article, we'll compare how cars are made to how data platforms are built and see what we can learn from this battle-tested production system.

How do we make cars?

The process of making cars is usually split into three main parts:

Body Shop: This is where the car's frame (chassis) is welded together.
Paint Shop: This is where the frame gets painted.
Assembly: This is where parts like seats, engines, and wheels are put together to finish the car.

These segments employ different production strategies.

Body Shop and Paint Shop follow a so-called push strategy: parts are produced ahead of demand, without any client customization. These processes use many robots and are mostly automated.

Production of the BMW X7 Body Shop — Body shop

The new BMW 8 Series Convertible in the paint shop at BMW Group Plant Dingolfing. (11/2018) — Paint shop

The assembly, on the other hand, follows a pull pattern, where each car is made for a specific order. Here, a lot of attention goes to manual quality checks, and many jobs still require people because they can't be automated.

Car manufacturers have managed to produce customized products at scale by:

Decoupling the generic workflows from the order-specific one (push vs. pull) and bridging them with central storage.
Automating the production of generic parts (body and paint shop).
Implementing strict quality assurance processes in the customization workflow (assembly).
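To make the three points above concrete, here is a minimal Python sketch of the decoupling idea. It is only an illustration under my own assumptions: the names `CentralStorage`, `PaintedBody`, `Order`, `produce_generic_bodies`, and `assemble` are hypothetical and do not come from any real factory system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaintedBody:
    body_id: int
    color: str


@dataclass(frozen=True)
class Order:
    order_id: str
    color: str
    seats: str


class CentralStorage:
    """Buffer between generic production and order-specific assembly."""

    def __init__(self):
        self._bodies = []

    def put(self, body):
        self._bodies.append(body)

    def take(self, color):
        # Hand out the oldest stored body that matches the requested color.
        for index, body in enumerate(self._bodies):
            if body.color == color:
                return self._bodies.pop(index)
        raise LookupError(f"no {color} body in storage")

    def __len__(self):
        return len(self._bodies)


def produce_generic_bodies(storage, colors):
    """Generic side: runs without knowing about any customer order."""
    for body_id, color in enumerate(colors):
        storage.put(PaintedBody(body_id, color))


def assemble(storage, order):
    """Order-specific side: picks a standard part and customizes it."""
    # Quality gate: reject options the (hypothetical) catalog does not offer.
    if order.seats not in {"fabric", "leather"}:
        raise ValueError(f"unknown seat option: {order.seats}")
    body = storage.take(order.color)
    return {
        "order_id": order.order_id,
        "body_id": body.body_id,
        "color": body.color,
        "seats": order.seats,
    }


storage = CentralStorage()
produce_generic_bodies(storage, ["blue", "red", "blue"])
car = assemble(storage, Order("A1", "red", "leather"))
```

The two functions never call each other. They only meet through the storage, which is what lets each side run at its own pace.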

Parallel With Data Platforms

Zach Wilson discussed the pull-push systems in his latest newsletter.

This immediately resonated with me and prompted me to consider this analogy.

Let’s unpack it:

The role of a data platform is to source data from various sources and create custom artifacts for use by downstream systems.

Similarly, in car manufacturing, the goal is to source raw materials and parts, transform them, and deliver custom-built cars.

In this analogy:

Raw data is like steel: the raw material.
Our ELT is equivalent to the body/paint shops.
The painted chassis represents the staging tables of our data platforms.
Custom views/dashboards are akin to customized cars.
The storage between the body shop and assembly line is similar to our data lake; it stores standard parts ready for downstream customization.

Lessons for Data Platform Design

Car making uses this design because it's likely the best method to ensure high-quality products without sacrificing too much latency.

Indeed, a full push system, like complete streaming, would make it hard to check if records are consistent across datasets.

On the other hand, a full pull system will probably not work at scale: the batches may become too big, and you will have to sacrifice too much latency.

This brings us to the Lambda vs Kappa architecture debate.

The car manufacturing industry has chosen a Kappa design where the platform is divided into two sides:

the “write” side: data are written to the data lake using streaming (push pattern)

the “read” side: data are cleaned, aggregated, and exposed using batches (pull pattern)

By having different methods for data input and output, the system finds a good balance between speed, accuracy, and scaling.
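The split between the write side and the read side can be sketched in a few lines of Python. This is a toy model under my own assumptions: a plain list stands in for the data lake, and the function names `ingest` and `build_daily_revenue`, as well as the event fields, are hypothetical.

```python
from collections import defaultdict


def ingest(landing_zone, event):
    """Write side: append each event as it arrives, with no cross-record checks."""
    landing_zone.append(event)


def build_daily_revenue(landing_zone):
    """Read side: one batch over everything landed so far."""
    # Clean: keep only the latest event per order (later events win).
    latest = {}
    for event in landing_zone:
        latest[event["order_id"]] = event
    # Aggregate: sum amounts per day.
    totals = defaultdict(float)
    for event in latest.values():
        totals[event["day"]] += event["amount"]
    return dict(totals)


landing_zone = []  # stands in for the data lake
ingest(landing_zone, {"order_id": 1, "day": "2024-02-07", "amount": 100.0})
ingest(landing_zone, {"order_id": 2, "day": "2024-02-07", "amount": 50.0})
ingest(landing_zone, {"order_id": 1, "day": "2024-02-07", "amount": 120.0})  # correction

daily_revenue = build_daily_revenue(landing_zone)
```

Ingestion stays cheap because it never looks at other records. The batch pays the cost of consistency once, when the output is actually needed.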

A push system is used for activities that need low latency and high throughput, such as data ingestion (or body/paint shop).

Low latency enables a tight connection with upstream sources and helps handle bursts or peaks in volume.

It can be easily “automated” as no orchestration is needed for streaming systems.

However, correctness and consistency are hard to achieve as records are ingested independently.

For that reason, a pull production system is used for activities where high quality is desired (data transformation or assembly).

In this case, the pull system optimizes for correctness but sacrifices latency. A pull system requires an additional orchestrator to synchronize all 'orders' and quality tests.
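Here is a minimal sketch of what such an orchestrator does, assuming each step is a plain Python function and each quality test is a predicate on that step's output. The function `run_pipeline` and the task names are hypothetical; `graphlib.TopologicalSorter` is part of the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies, checks):
    """Run tasks in dependency order and stop as soon as a quality check fails."""
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = tasks[name](results)
        check = checks.get(name)
        if check is not None and not check(results[name]):
            raise ValueError(f"quality check failed for {name!r}")
    return results


tasks = {
    "load_orders": lambda r: [
        {"order_id": 1, "customer_id": 10},
        {"order_id": 2, "customer_id": 11},
    ],
    "load_customers": lambda r: [
        {"customer_id": 10, "name": "Ada"},
        {"customer_id": 11, "name": "Bob"},
    ],
    "join": lambda r: [
        {**order, "name": customer["name"]}
        for order in r["load_orders"]
        for customer in r["load_customers"]
        if order["customer_id"] == customer["customer_id"]
    ],
}
# Each key lists the tasks that must finish before it can start.
dependencies = {"join": {"load_orders", "load_customers"}}
# The join must not drop or duplicate orders.
checks = {"join": lambda rows: len(rows) == 2}

results = run_pipeline(tasks, dependencies, checks)
```

The join only starts once both inputs are complete, which is the synchronization a streaming write path does not give you.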

Another reason car production is so efficient is how factories work closely with their suppliers.

Suppliers not only deliver parts precisely when needed (just-in-time) but also in the exact order they will be used in production (just-in-sequence).

When visiting a factory, I remember seeing batches of car seats delivered by a supplier in the exact order of cars set to be assembled just a few hours later. Truly amazing.

I believe this is a great source of inspiration for our industry.

We could become much more efficient by integrating data providers more closely into our data supply chain: better communication, enforced data contracts, increased data sharing, etc.
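As one small illustration of an enforced data contract, here is a sketch that checks incoming records against an agreed schema before they enter the platform. The contract format (a field name mapped to a Python type), the `CONTRACT` fields, and the `violations` function are assumptions of mine, not an established standard.

```python
# Hypothetical contract agreed with a data provider: field name -> expected type.
CONTRACT = {"order_id": int, "amount": float, "currency": str}


def violations(record, contract):
    """Return a list of human-readable contract violations for one record."""
    problems = []
    for field, expected in contract.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


good = violations({"order_id": 1, "amount": 9.5, "currency": "EUR"}, CONTRACT)
bad = violations({"order_id": "1", "amount": 9.5}, CONTRACT)
```

In practice the same check would ideally run on the provider's side too, so that bad parts never leave the supplier.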

Although not perfect, I believe this analogy can be helpful when making decisions about the design of a data platform.

I would appreciate your feedback on this analogy, and I would love to hear how it holds up against your field experience!

Thanks for reading,

-Ju

I would be grateful if you could help me improve this newsletter. Don’t hesitate to share what you liked or disliked and the topics you would like me to tackle.

P.S. you can reply to this email; it will get to me.
