Discussion about this post

Dan Goldin

Great post! I love the idea behind DuckDB + Lambda, but I'd also like to see whether anyone has used it for more complex cases. I'm thinking of joins between high-cardinality datasets and whether you can orchestrate them somehow. If you know the join key and the cardinality of the data, you can split the datasets up front (using something like "JOIN_ID % 10") and then kick off a set of parallel tasks, one per partition.
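To make the split idea concrete, here is a rough sketch in plain Python rather than DuckDB or Lambda. It assumes integer join keys and rows shaped as `(join_id, payload)` tuples. The names `split_by_bucket`, `join_bucket`, and `partitioned_join` are hypothetical, and the thread pool only stands in for whatever would actually fan out the parallel tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_BUCKETS = 10  # mirrors the "JOIN_ID % 10" split; the value is an assumption


def split_by_bucket(rows, num_buckets=NUM_BUCKETS):
    """Group (join_id, payload) rows by join_id % num_buckets."""
    buckets = defaultdict(list)
    for join_id, payload in rows:
        buckets[join_id % num_buckets].append((join_id, payload))
    return buckets


def join_bucket(left_rows, right_rows):
    """Hash join within one bucket; stands in for a single parallel task."""
    index = defaultdict(list)
    for join_id, payload in right_rows:
        index[join_id].append(payload)
    return [
        (join_id, left_payload, right_payload)
        for join_id, left_payload in left_rows
        for right_payload in index.get(join_id, [])
    ]


def partitioned_join(left, right, num_buckets=NUM_BUCKETS):
    """Inner join computed bucket by bucket, with buckets joined concurrently."""
    left_buckets = split_by_bucket(left, num_buckets)
    right_buckets = split_by_bucket(right, num_buckets)
    # Equal keys always land in the same bucket, so only buckets present
    # on both sides can produce matches.
    shared = sorted(left_buckets.keys() & right_buckets.keys())
    with ThreadPoolExecutor(max_workers=num_buckets) as pool:
        parts = pool.map(
            lambda b: join_bucket(left_buckets[b], right_buckets[b]), shared
        )
        return [row for part in parts for row in part]
```

Because matching keys always share a bucket, the union of the per-bucket joins equals the full inner join. A heavily skewed key would still overload a single bucket, so the modulus alone does not guarantee balanced tasks.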

Julian Gilyadov

Excellent post as always. Do you think it'll evolve into the multi-engine stack v1 or similar?
