7 Comments
Jul 5Liked by Julien Hurault

What do you think about SCD2 for the bronze layer and then SCD1 for silver and gold. This way can still get the historical data if you want to but it's easier to query.

Expand full comment

I’m working on a project where I’ll have an SCD2 as the bronze layer and then my silver and gold with have incremental materialization which is sort of the same process here.

Expand full comment
author

The incremental there can get a bit complex in case of backfilling no ?

Expand full comment
author

interesting ! Yes indeed it would simply the queries. However you will still have a complex backfill process. What happens if you receive the updated version of a record in the bronze layer ? You run full mat' in silver and gold ?

Expand full comment

Could also use SCD2 as intended by its inventor, Dr. Kimball to store the surrogate key in the Fact table. Then no complex SQL required to match correct version of Dimension to the Fact. https://www.kimballgroup.com/2008/09/slowly-changing-dimensions-part-2/

Expand full comment
author

That's exactly the point. Storing the surrogate key in the fact table can get tricky: https://en.wikipedia.org/wiki/Surrogate_key#Disassociation

Expand full comment

Yes, but that is the point of surrogate keys. If you have an OrderID from one system, and you have to switch to add in orders from a new order system with the same OrderIDs, you will have to deal with that problem also. Can keep the original key as well.

Expand full comment