The most optimistic thing I believe about software right now is that this is one of the best times in my career to be an engineer who learned the craft the old way — before the tools got good.
I also know exactly how that sounds. A senior engineer arguing that senior engineers are the ones who'll matter is the least surprising, most self-serving position available to me. So I'll name it now: I have a direct stake in this being true, which means the argument has to be audited harder than it feels like it needs to be, not less. I'll make the case, and then I'll spend the back half trying to knock it down, because a flattering thesis you haven't stress-tested isn't a thesis, it's a comfort blanket.
Here's the case.
This abstraction jump is not like the others
The reassuring story about AI and programming goes: every generation of tools abstracted away the layer below, and engineers just moved up. Assembly programmers fretted about C; C programmers fretted about Python; everyone migrated up the stack and the work got more productive, not extinct. AI is just the next rung. Relax.
I think that analogy is broken, and the break is precise. Every prior jump preserved deterministic reasoning. Moving from C to Python, you learned new syntax, but the machine underneath still did exactly what you told it, every time. The skill transferred because the kind of thinking didn't change — same cause-and-effect reasoning, new vocabulary.
The LLM layer is the first abstraction that is non-deterministic. Same input, different output. Ask twice, get two answers. Moving "up" this stack doesn't ask you to learn new syntax over the same reasoning — it asks for a different cognitive skill: reasoning about a system that is probabilistic by construction, knowing where it will silently fail, and engineering around that. The ramp isn't steeper. It's a different shape. And a different shape is not something tooling alone teaches you, because the tooling is the thing you're supposed to be reasoning about.
That's the whole disanalogy, and everything else follows from one mechanical fact about how these models fail.
The mechanism: the error carries no signal
Here is the heart of it, and it's the part I'm most confident about — call it 8 in 10.
When a human is wrong, the error is load-bearing. You hold a hypothesis, you flag your own uncertainty ("I think it works like this, but I should check"), you act on it, reality pushes back, and your internal model updates. The next attempt is informed by the failure. Even a badly wrong human intuition usually points somewhere real: a wrong model of how a system behaves still encodes a direction, an "it's something over here." You can build on a colleague's wrong hunch, because the hunch came from a model of the world that was tracking something.
An LLM hallucination has none of that structure. The model produces a confident, wrong answer with no internal representation of uncertainty separate from the output — structurally, it doesn't know what it doesn't know. And critically, when you ask again, you don't get iteration. You get regeneration: another independent sample from the same distribution. It is not learning from the last wrong answer, because there is no persistent model of why the last one was wrong. It's randomness dressed as retry.
That's why "keep a human in the loop" is a structural claim, not a sentimental one. The human isn't there for warmth. The human is the only component in the system with an error signal — the only part that gets more right by being wrong first.
| | A human's wrong answer | An LLM's hallucination |
|---|---|---|
| Carries uncertainty? | yes — flagged, often explicitly | no — confident by default |
| Points somewhere real? | usually — a wrong model still tracks something | no — confabulation from token statistics |
| Improves on retry? | yes — the error updates the model | no — regeneration is an independent sample |
| Can you build on it? | yes | no |
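To make the "Improves on retry?" row concrete, here is a toy Python sketch on a number-guessing task. It is an analogy, not a model of how an LLM works: `regenerate` and `iterate` are hypothetical names, and the uniform random draw only stands in for "another independent sample from the same distribution."

```python
import random


def regenerate(target, low, high, attempts, rng):
    """Retry WITHOUT an error signal: every guess is an independent draw.

    Nothing learned from a miss is carried into the next attempt.
    """
    for n in range(1, attempts + 1):
        if rng.randint(low, high) == target:
            return n  # number of attempts used
    return None  # budget exhausted without converging


def iterate(target, low, high, attempts):
    """Retry WITH an error signal: each miss narrows the search range."""
    for n in range(1, attempts + 1):
        guess = (low + high) // 2
        if guess == target:
            return n
        if guess < target:
            low = guess + 1   # "too low" rules out the lower half
        else:
            high = guess - 1  # "too high" rules out the upper half
    return None


if __name__ == "__main__":
    print("iterate:   ", iterate(737, 1, 1000, 10))
    print("regenerate:", regenerate(737, 1, 1000, 10, random.Random(0)))
```

With a budget of 10 attempts over the range 1 to 1000, `iterate` always finds the target, because each miss halves what is left to search. `regenerate` succeeds only about 1 time in 100 under the same budget, because attempt ten knows nothing that attempt one didn't.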
Context is the second half of the same problem
There's a related limit that matters even more in real codebases. Each LLM call is stateless. You can hand the model context — retrieval, memory, conversation history — but it isn't maintaining that context; it's handed a fresh window each time and performs continuity. Given the same context you can still get different results, because the context isn't a state the model holds, it's an input you re-supply.
Human context management is active and dynamic, almost all of it below conscious thought: we continuously reprioritize, drop what stopped mattering, and, most importantly, flag contradictions. Say something that conflicts with a decision made twenty minutes ago and a competent engineer interrupts: "wait, that breaks what we agreed earlier." An LLM's context management is passive and brittle: a fixed window, roughly uniform attention, no automatic contradiction detection, no salience updating. A production codebase is an enormous context problem, and it's exactly where these models degrade (attention dilutes, early constraints quietly drop) and exactly where experienced humans don't.
So context management becomes an engineering problem the human has to solve externally. Which is the hinge of this whole essay, because it points straight at the cure.
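One way to picture "solved externally" is a small ledger that lives outside the model, is re-supplied on every call, and does the contradiction-flagging the model won't. This is a minimal sketch under my own assumptions: `DecisionLog`, `Contradiction`, and the exact-match conflict rule are all hypothetical, and a real system would need a much richer notion of what counts as a conflict.

```python
from dataclasses import dataclass, field


class Contradiction(Exception):
    """Raised when a new decision conflicts with one already recorded."""


@dataclass
class DecisionLog:
    # topic -> agreed choice; this is the state the model itself never holds
    decisions: dict[str, str] = field(default_factory=dict)

    def record(self, topic: str, choice: str) -> None:
        """Store a decision, refusing anything that contradicts an earlier one."""
        earlier = self.decisions.get(topic)
        if earlier is not None and earlier != choice:
            raise Contradiction(
                f"{topic!r}: {choice!r} conflicts with earlier {earlier!r}"
            )
        self.decisions[topic] = choice

    def render(self) -> str:
        """Produce the context block to prepend to every prompt."""
        lines = [f"- {t}: {c}" for t, c in sorted(self.decisions.items())]
        return "Constraints agreed so far:\n" + "\n".join(lines)


if __name__ == "__main__":
    log = DecisionLog()
    log.record("session storage", "server-side")
    print(log.render() + "\n\nTask: add a logout endpoint.")
```

The point of the design is where the state lives: the log is deterministic code the human owns, so an early constraint cannot quietly drop out, and a conflicting decision fails loudly instead of being silently absorbed.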
The turn: natural language can be made deterministic
If the disease is non-determinism, the obvious move is to ask whether non-deterministic communication can be boxed into something predictable. And the answer is yes — we have a fifty-year-old discipline for doing exactly that.
Natural language is non-deterministic; "build me a login screen" can mean a hundred things. But you can make it precise. With a spec. With clear separation of concerns. With modules that have contracts — defined inputs, specified outputs, stated invariants. The instant I wrote that sentence in my head, the punchline arrived: I wonder where we've seen that before.
We've seen it everywhere. It's the entire history of software engineering. The whole field is the project of taking messy, ambiguous human intent and constraining it — through interfaces, pre- and post-conditions, type boundaries, isolated failure modes — until it's precise enough to execute deterministically. The cure for LLM non-determinism is not a new trick. It's the discipline senior engineers already carry in their hands.
Treat a prompt system the way you'd treat an API: defined inputs, specified outputs, bounded surface, isolated blast radius. Don't let the non-determinism roam the whole program — quarantine it into small modules where a wrong answer is cheap and catchable, and put deterministic checks around it. You can't make the model deterministic. You can make the system deterministic enough to ship, by containing the part that isn't.
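Here is roughly what that quarantine can look like in Python. It is a sketch, not a recommended implementation: `classify`, `parse_label`, the label set, and the prompt wording are all made up for illustration, and `generate` is a stand-in for whatever model call you actually use.

```python
import json
from typing import Callable, Optional

# The contract: the only outputs the rest of the program will ever see.
ALLOWED_LABELS = {"bug", "feature", "question"}


def parse_label(raw: str) -> Optional[str]:
    """Deterministic check: accept only a JSON object with an allowed label."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = data.get("label")
    if isinstance(label, str) and label in ALLOWED_LABELS:
        return label
    return None


def classify(
    ticket: str,
    generate: Callable[[str], str],  # hypothetical model call: prompt -> text
    max_attempts: int = 3,
    fallback: str = "question",
) -> str:
    """Bounded surface: always returns an allowed label, whatever the model says."""
    prompt = f'Classify this ticket. Reply as JSON like {{"label": "bug"}}.\n\n{ticket}'
    for _ in range(max_attempts):
        label = parse_label(generate(prompt))
        if label is not None:
            return label
    return fallback  # a wrong answer here is cheap and visible, not fatal
```

Callers of `classify` never see raw model text. The non-determinism is confined to one function, and everything on the outside of it is ordinary, testable code with a fixed return type.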
Why the advantage is the order you learned things
Now the part that I think is genuinely non-obvious, and the one I'd build the whole argument on if I could only keep one.
The moat isn't experience. "Experience matters" is a cliché and half-wrong. The moat is the order of acquisition. The engineers who can tame these systems learned deterministic reasoning first, then picked up the non-deterministic tooling second — and that sequence can't be reversed. You can't acquire the foundation after the fact by using the tools, because the tools are precisely what hides the foundation from you.
An engineer who started with the assistant on by default can use it fluently and never built the deterministic substrate to reason about when it breaks. That's not a knock on them — it's a structural accident of timing. The people who got paid to learn the old way before the new tools existed hold an advantage that isn't about being smarter or more senior. It's about having done two things in an order that the next cohort can't replicate.
And so the work itself has moved. The hard, unsolved problems are no longer inside the model — they're in the scaffolding around it: observability, evaluation pipelines, reliability and fallback logic, prompt and model versioning, output validation, cost and latency management. None of those are solved. All of them are deep systems problems. AI created a new class of hard engineering — and it's the class AI can't yet do, because doing it requires the deterministic, contract-first reasoning the models lack. That's what I mean by the golden age: not that engineering got easier, but that it got more valuable, in exactly the place the old discipline applies.
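To give one of those scaffolding problems a shape, here is a minimal evaluation-harness sketch in Python. Everything in it is an assumption for illustration: `EvalCase`, `pass_rate`, `gate`, and the 0.95 threshold are hypothetical, and `generate` again stands in for a real model call. The one idea it encodes is that a non-deterministic component gets sampled several times per case and judged by a deterministic check.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    task_input: str
    accept: Callable[[str], bool]  # deterministic acceptance check


def pass_rate(
    generate: Callable[[str], str],  # hypothetical model call: input -> text
    cases: Sequence[EvalCase],
    samples_per_case: int = 5,
) -> float:
    """Fraction of sampled outputs that pass their case's check.

    Each case is sampled several times because one run of a
    non-deterministic component says little about the next run.
    """
    passed = 0
    total = 0
    for case in cases:
        for _ in range(samples_per_case):
            total += 1
            if case.accept(generate(case.task_input)):
                passed += 1
    return passed / total if total else 0.0


def gate(rate: float, threshold: float = 0.95) -> bool:
    """Ship/no-ship decision for a prompt or model version."""
    return rate >= threshold
```

Run the same cases against two prompt versions and you have the beginnings of prompt versioning with a regression check; the hard parts (choosing cases, writing honest acceptance checks, tracking cost) are exactly the unsolved systems work described above.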
That's the case. Here's where I try to break it.
The weak flanks
The entry-pipeline paradox — and this is the real one. Follow my own economic logic to its end and it eats itself. If the tools absorb most of the grunt work, the entry-level rung is the first to go — there's less obvious use for a junior, and worse, fewer ways in. Fine for me, awful for the field, because today's juniors are the only supply of tomorrow's seniors. The advantage I'm describing is built on an order of acquisition that the next generation may not get to follow — there may be no on-ramp to the deterministic foundation at all. Taken to its conclusion, the thesis is self-terminating: a golden age for my cohort that quietly removes the conditions that produced my cohort. I don't have a clean resolution to this. I think it's a more interesting and more important problem than the thesis it undermines, and the honest move is to sit in the discomfort rather than wave it away.
The opportunity is real but clustered. The golden age is concentrated where companies are actually rebuilding around AI primitives. Most of the market — enterprise, fintech, travel, the long tail — is still in "bolt an AI feature onto the existing product" mode, where the hard systems problems above haven't materialized yet. So the value is directional and uneven, not a uniform boom. An honest version of the claim is "this is real, and right now it's lumpy," not "everyone wins everywhere."
The forecast is a low-confidence bet, and I want to be explicit about the number. All of this leans, implicitly, on the models not rapidly steamrolling past their current limits: on the next leap requiring architectural breakthroughs rather than just more data and compute. I'd put my confidence there at maybe 4 in 10. Genuinely: I don't know, and I don't think anyone does. So the rational posture is an asymmetric bet: act as if the thesis is right, because the actions it implies (learn the fundamentals deeply, build the scaffolding skills) are valuable even if it's wrong; but don't build a twenty-year plan that requires it to be right. High confidence on the mechanism (the 8 in 10 above); humility on the forecast. Conflating the two is how smart people get blindsided.
What I'm learning from the juniors
If the essay stopped there it would still be a senior engineer reassuring himself, and I'd deserve the eye-roll. So the honest ending is the reciprocal skill gap — the things the newer cohort does better than I do, which I've had to watch and learn.
- They ship before it's perfect, and they don't lose sleep over it. Those of us trained in the deterministic era over-engineer by reflex — designing for scale that never arrives, abstracting before there are two cases to abstract, optimizing before profiling. Watching someone ship a working product in a week while I'm still drawing boxes is humbling, and correct more often than I'd like.
- They think in natural-language contracts natively. Specifying behavior in prose is intuitive to them in a way it isn't to engineers who default to code. Which is a deeply ironic gap to have, given I just spent a whole essay arguing that natural-language-as-contract is the entire game.
- They're comfortable with "good enough." I'm still learning that one.
So the real shape of it isn't "my generation is safe and theirs isn't." It's that the deterministic foundation is a genuine, hard-to-replicate advantage and it comes bundled with habits that are now liabilities — and the people without the foundation have exactly the instincts I need to borrow. The golden age, if it's real, belongs to whoever can hold both: the old discipline and the new comfort with shipping into uncertainty. I'm trying to be that person. I'm not all the way there.