Why AI Can’t Do Architecture


LLMs are extraordinary at writing code. They are genuinely bad at architecture. Most people treat this as a temporary limitation – a capability gap that will close as models improve. It won’t. And understanding why tells you everything about what software engineering actually is.

Why LLMs Are Good at Code

The answer is in the training data. LLMs were trained on an enormous corpus of code – GitHub repositories, Stack Overflow answers, open source projects, documentation, tutorials. And here is the critical detail: a large fraction of that code is good.

Syntactically correct Python is everywhere. Idiomatic TypeScript is everywhere. Well-tested React components, clean Go functions, properly structured SQL queries – the internet is full of them. The code that rises to visibility (starred repositories, accepted answers, cited libraries) is disproportionately well-written. LLMs learned from the best available examples of what code should look like at the expression level.

So when you ask an LLM to implement a function, write a test, translate code between languages, or produce a boilerplate component – it does these things well because it has seen thousands of examples of them done well. It is pattern-matching against a high-quality corpus.

Why LLMs Are Bad at Architecture

The answer, again, is in the training data. But this time the story runs in reverse.

Figure: a two-panel illustration. Code is a vast, well-lit library full of organized books; Architecture is a dark, sparse library with almost empty shelves, searched by candlelight. Caption: The training data asymmetry that explains LLM architectural blindness. Quality code is publicly visible and celebrated. Quality architecture mostly lives in private, hard-won, well-maintained systems that never get posted to GitHub.

LLMs were not trained on good architecture – because most architecture is bad.

The average codebase does not have clean component boundaries. The average GitHub repository does not demonstrate principled decomposition. The average production system – the one that actually runs real software for real organizations – is a tangle of historical decisions, accumulated coupling, and pragmatic shortcuts that made sense at the time. The architectural patterns that survive into the training corpus are mostly noise: God classes, layered monoliths where the layers have all blurred together, services that started as microservices and ended as distributed monoliths.

Good architecture is rare. It is rare in open source, rarer in enterprise software, and almost invisible in the kind of code that gets posted publicly. The training data for architectural reasoning is overwhelmingly composed of examples of what not to do.

So when you ask an LLM “should this be a separate service or stay in the monolith?”, “where does this business logic belong?”, “why is this component hard to test and what does that tell us?” – it reaches for the same pattern-matching mechanism that works for code. It finds patterns. Those patterns are mostly wrong.

The John Henry Problem

In the American folk legend, John Henry was a steel-driver on the Chesapeake & Ohio Railroad. When a steam-powered drilling machine arrived to replace him and his crew, he challenged it to a race. He won. He drove more steel than the machine in a single shift. Then he died – hammer in hand, heart given out from the effort.

Figure: John Henry driving steel on the left with sparks flying, a glowing server rack outputting green code on the right, and a railroad track running between them, in WPA poster style. Caption: John Henry won a single race. The steam drill won every race after that. The developers who are racing LLMs at writing code are John Henry – and unlike him, most of them will live long enough to see it clearly.

The story is a tragedy not because John Henry lost. He won. The tragedy is that winning was meaningless. The steam drill went back to work the next morning. And the morning after that. It did not get tired. It did not need to recover. It got better with each iteration of the design, and every improvement cost the railroad less than paying a human crew.

The developers who are racing LLMs at code generation are John Henry. Some of them are winning individual sprints – typing faster, knowing more idioms, catching subtle bugs the model misses. The race is still lost. Not because LLMs are better today at every coding task. Because the trajectory is one-directional, the improvement pace is relentless, and the cost of the machine approaches zero while the cost of the human does not.

The question is not whether you can beat the machine today. The question is whether beating the machine is the right race to be running.

John Henry’s error was not in swinging the hammer. It was in believing that swinging the hammer was the point. The point was the railroad. The point was getting people and goods from one place to another. The hammer was infrastructure. When better infrastructure arrived, the hammer became optional. The engineers who designed the route – who understood terrain, gradient, load, and the economics of where the line should go – those people were not replaced by the steam drill. The steam drill needed them more than ever, because now it could move faster than anyone had anticipated and the decisions about direction became more consequential, not less.

The Argument That AI Will Take Developer Jobs Gets This Backwards

The standard AI-displacement narrative runs roughly like this: AI can now write code; developers write code; therefore developers will be displaced. It sounds logical. It isn’t.

It conflates “writing code” with “doing software engineering.” These have never been the same thing. Writing code is the mechanism by which software engineering decisions get expressed in an executable form. The decisions themselves – what to build, how to structure it, where to draw the boundaries, how to reason about change – are not coding. They were never coding. Coding was always just the tool.

The developers whose jobs are genuinely threatened are those who were doing coding without design thinking. Following patterns mechanically. Translating requirements into code without understanding why the requirements have the shape they do. Writing boilerplate. Filling in CRUD endpoints. Producing test coverage that satisfies a metric without revealing structural truth. AI does these things better, faster, and at lower cost. That is real displacement, and it is already happening.

But the developers who understand systems – who can identify what changes and why, draw boundaries that isolate change, reason about long-term structural evolution, derive a test strategy from an architecture rather than spraying assertions at code – those developers become more valuable, not less. Because the thing they do cannot be trained into a model that learned from bad architecture.

This Is What Every Abstraction Jump Has Always Done

We have been here before. Every major increase in abstraction removed a coding bottleneck and elevated the work that remained.

Figure: The abstraction staircase of software engineering. Each step removed a mechanical bottleneck. Each step left design thinking more exposed as the primary remaining constraint.

Assembly language meant you no longer needed to hand-encode machine instructions. C meant you no longer needed to write assembly or allocate registers yourself. Java meant you no longer needed to manage memory manually. Frameworks meant you no longer needed to write infrastructure plumbing. Cloud platforms meant you no longer needed to manage physical servers. Each time, the question “will this eliminate programmers?” was asked. Each time, the answer was: it eliminates the programmers whose primary value was the lower-level task itself. It elevates the programmers who were using the lower-level task as a means to express higher-level design.

AI is the largest abstraction jump yet – it removes the bottleneck of translating design into code at the expression level. The consequence is not that design becomes irrelevant. The consequence is that design becomes the only bottleneck remaining.

The Future of Work: Designing Systems That Intelligent Tools Can Build

The developers who thrive in this environment are not the ones who ignore AI tools or resist them. They are the ones who understand that the tool has changed what the job is.

The job was never to write code. The job was always to solve problems – and code was the unfortunate residue that problem-solving left behind. When a better tool for producing that residue arrives, the solver is freed, not replaced. What matters now is the quality of the thinking that directs the tool: what to build, how to bound it, where the seams should be, how to validate that the structure is right.

Figure: The new development loop. Architectural judgment sits at the center – directing volatility analysis, defining boundaries, validating what AI generates against structural rules, and learning from how the system evolves.

This is what architectural guardrails mean in practice. You do not give an AI system a blank canvas and ask it to design your architecture. You define the volatility boundaries, establish the communication rules, specify the component roles, and then use AI to accelerate the expression of what you have already designed. The guardrails are structural: AI can generate the Engine (here, the component that holds a business rule), but the decision that this business rule is an Engine – that it belongs in that tier, that it communicates through that interface, that it is testable in isolation from its dependencies – that decision is yours. It is not in the training data. It requires judgment that no corpus of GitHub repositories can provide.
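As a rough sketch of what such a guardrail can look like in code, assume a hypothetical discount rule that has been designated an Engine. The names `DiscountEngine`, `LoyaltyLookup`, and `StubLoyalty`, and the discount rule itself, are invented for illustration and do not come from any framework named in this article. The human decisions are encoded in the shape: the Engine holds only the business rule, reaches its dependency through one narrow interface, and can be exercised without a database. Filling in the body is the part that can plausibly be handed to an AI tool.

```python
from dataclasses import dataclass
from typing import Protocol


class LoyaltyLookup(Protocol):
    """Hypothetical interface: the only way the Engine may reach customer data."""

    def years_as_customer(self, customer_id: str) -> int: ...


@dataclass(frozen=True)
class DiscountEngine:
    """Hypothetical Engine: one business rule, no I/O of its own."""

    loyalty: LoyaltyLookup

    def discount_percent(self, customer_id: str, order_total: float) -> int:
        years = self.loyalty.years_as_customer(customer_id)
        percent = min(years, 5) * 2  # 2% per year of loyalty, capped at 10%
        if order_total >= 1000:
            percent += 5  # bonus for large orders
        return percent


class StubLoyalty:
    """Test double standing in for whatever storage sits behind the interface."""

    def __init__(self, years: dict[str, int]) -> None:
        self._years = years

    def years_as_customer(self, customer_id: str) -> int:
        return self._years.get(customer_id, 0)


engine = DiscountEngine(loyalty=StubLoyalty({"alice": 7, "bob": 1}))
print(engine.discount_percent("alice", 1200.0))  # 15
print(engine.discount_percent("bob", 80.0))      # 2
```

Because the Engine depends only on the `LoyaltyLookup` protocol, swapping the stub for a real data source changes nothing inside the rule, which is the property the boundary was drawn to protect.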

The velocity gain is real and substantial. A team with clear architectural boundaries can direct AI tools at well-defined problems and get working code back in minutes rather than days. The guardrails do not slow the AI down – they direct it. They are the difference between a fast machine with nowhere to go and a fast machine on the right track.

The Valuable Skill Was Never Syntax

Here is what senior engineers have always known and junior engineers often have to learn the hard way: the hard part of software is never the code. The hard part is understanding the problem well enough to know what to build. Identifying which parts of the problem will change, and at what rate, and for what reasons. Drawing boundaries that anticipate that change rather than resist it. Deriving a structure that is coherent enough that a team of people can work within it simultaneously without producing chaos.

Coding fluency was always in service of this. You needed to be able to write code because that was how you expressed the design. But the design thinking came first. The best engineers were always primarily designers who used code as their medium – not coders who occasionally thought about design.

AI has not changed what the valuable skill is. It has made the valuable skill more visible by removing the skill that was obscuring it.

What This Means in Practice

If you are a developer thinking about how to position yourself in an AI-augmented world, the answer is not to become better at writing code. AI is better at writing code than you are, or will be soon. The answer is to become better at the thing AI cannot do: reasoning about structure, change, and design.

Concretely: learn to identify volatility axes in a problem domain. Learn to draw component boundaries that isolate change rather than bundle it. Learn to derive a test strategy from an architecture rather than writing tests to satisfy a coverage target. Learn to produce a project plan from a structural map rather than from a feature list. Learn to work with AI – directing its output, validating its suggestions against structural rules, catching the places where it has pattern-matched against bad architecture and produced something that compiles but does not hold together under change.
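One way to make "validating its suggestions against structural rules" mechanical is to write the rules down as data and check the dependencies in generated code against them. The sketch below is an illustration under stated assumptions: the role names (`client`, `manager`, `engine`, `access`), the `ALLOWED_CALLS` table, and the component names are all hypothetical, and the dependency list is typed in by hand where a real check would extract it from imports or call graphs.

```python
# Hypothetical rule table: which component role may call which other roles.
ALLOWED_CALLS = {
    "client": {"manager"},
    "manager": {"engine", "access"},
    "engine": {"access"},
    "access": set(),
}


def find_violations(roles, dependencies):
    """Return (caller, callee, reason) for every dependency that breaks the rules.

    roles maps component name -> role; dependencies is a list of (caller, callee).
    """
    violations = []
    for caller, callee in dependencies:
        caller_role = roles.get(caller)
        callee_role = roles.get(callee)
        if caller_role is None or callee_role is None:
            violations.append((caller, callee, "component has no assigned role"))
        elif callee_role not in ALLOWED_CALLS.get(caller_role, set()):
            violations.append(
                (caller, callee, f"{caller_role} may not call {callee_role}")
            )
    return violations


# Hypothetical components and the dependencies found in a generated change.
roles = {
    "CheckoutManager": "manager",
    "PricingEngine": "engine",
    "TaxEngine": "engine",
    "OrderAccess": "access",
}
dependencies = [
    ("CheckoutManager", "PricingEngine"),  # allowed: manager -> engine
    ("PricingEngine", "OrderAccess"),      # allowed: engine -> access
    ("PricingEngine", "TaxEngine"),        # engine calling engine: flagged
    ("OrderAccess", "CheckoutManager"),    # access calling upward: flagged
]

for caller, callee, reason in find_violations(roles, dependencies):
    print(f"{caller} -> {callee}: {reason}")
```

The check itself is trivial. The judgment lives in the rule table and in the assignment of roles to components, and those are the parts a person has to supply.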

The conflict between AI and developer jobs does not hinge on whether AI can write code. It already can. It hinges on whether AI can design systems. It cannot – yet, and possibly not for a long time, because the examples it would need to learn from are rare enough that they barely appear in the corpus of human-written software.

The engineers who understand this are not threatened. They are in the best position of their careers: the tool that handles expression is now cheap and fast, and the judgment that decides what to express has never been more valuable. Stop racing the steam drill. Learn to design the railroad.

This is one of the reasons Harmonic Design exists. If architectural judgment is becoming the primary differentiator, it needs to be teachable, transferable, and systematic – not just something experienced engineers carry in their heads. The framework is an attempt to make good architectural reasoning explicit enough to be learned, applied, and validated. Read more about the framework here.