I agree agents are a game changer for productivity, but they always seem to be behind the times. For example, when working on new Databricks features, I have to constantly feed them relevant docs and examples and say "No, do it this way" or "Read this."
Also, I find that unless the data engineering task is simple to moderately complex, they tend to write so-so code, which is no surprise given what they are trained on. It's like dealing with a junior to mid-level engineer, with me saying, "Are you sure you want to do it like that?"
The future is bright, I think, for those data engineers who can build AI systems and act as systems and architecture designers.
Yes, exactly. The challenge is probably being able to provide additional guidelines and best practices to level the LLM up, right? Once it has the right specs, it gets really good at implementing what you want in the format you want.
I think the documentation problem will be resolved soon. It’s mostly a matter of ensuring LLMs can reliably access the correct source of truth.
Issues I see with text-only specs:
- Humans are not very good at writing or reading a big chunk of text while keeping the full context (our brains don't have a 1M-token window). Switching from specs to code doesn't fix the cognitive overload, though: a 100k-line codebase is also too big to hold in our heads. In the end, the real question isn't text vs. code, it's: what representation lets humans reason about intent at scale?
- English is a terrible programming language (programming in the broader sense here): https://orbistertius.substack.com/p/english-is-a-terrible-programming. Code was written mainly for humans to read (our future selves and teammates). But reading code is hard because the semantics are dense (vs. a bunch of sentences whose meaning is not "compiled" the same way by every human).
I am not saying "specs are bad", but rather that we don't yet have good representations of design decisions that survive the transition to AI-generated implementation. Or we should all improve our writing and reading skills to get better at thinking, designing, and exposing our intent. (This kind of resonates with https://boz.com/articles/communication-is-the-job)
Yes, I do agree.
Maybe we should use more visuals? Add detailed diagrams to our specs, with data flows, architecture, etc.? Maybe specs should come with a predefined list of stories and tasks?