This is a different post from what I normally write, in that it’s more forward-looking than usual. It’s about how AI is changing software engineering, and how it will continue to do so.
You can’t go a day without hearing about people writing code with the assistance of Large Language Models (LLMs) — be it using models like Claude or ChatGPT directly, or tools like GitHub Copilot, Cursor and Replit.
Most of these underlying models score incredibly high on the SWE-Bench benchmark and I expect the widespread adoption of Coding Copilots will only accelerate. You can see this happening across a broad range of users, from product managers and designers to seasoned software engineers.
I’ve been thinking about the implications of this adoption, especially in how it might impact software design, and why we should care.
SWE-Bench is fun, but if you look closely, it doesn’t really mirror real world product engineering
More code, faster
Models and tools like Claude, GitHub Copilot and others have made coding more accessible than ever before. Virtually anyone, technical or otherwise, can use these models to create code snippets, prototypes, and even full applications. The democratisation of software engineering is well under way, whether we are ready for it or not.
There’s no free lunch though — LLMs are trained on vast amounts of existing code from the web and open-source projects. However, the code they’ve learned from is often, well… average. Most of what’s out there isn’t exactly a masterclass in modularity, loose coupling, or maintainability. These tools are inherently limited by their training data, and while they can produce code that is functional, it’s often the kind of code that prioritises getting it done over sound design principles.
This is great for prototyping and idea validation but isn’t conducive to building robust systems.
Whatever happened to Test-Driven Development?
A natural evolution of being able to generate code is the ability to generate test code to ensure the code works as expected. On one level this makes sense — if you can automate writing tests, why not? It sounds like a win-win.
But it misses a crucial point about the role of tests, especially when written up front as in test-driven development (TDD). In TDD, you write a failing test first, before the actual implementation. This test acts as an executable specification for what the code should do. Critically, it also forces you to design your code in a way that is testable.
Testable code is modular code with clear separation of concerns. By writing the tests first, TDD guides you towards better design decisions. Tests then, are a design tool.
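To make that concrete, here’s a minimal sketch of the TDD loop in Python. The DiscountCalculator and its injected pricing rules are hypothetical names invented for this illustration, not from any real project; the point is that the failing test, written first, reads as an executable specification and nudges the design towards small, injectable collaborators.

```python
# A minimal sketch of the TDD loop. The names here (DiscountCalculator,
# pricing rules) are hypothetical and only serve to illustrate the idea.
import unittest
from unittest.mock import Mock


class DiscountCalculatorTest(unittest.TestCase):
    # Step 1: write the test first. It reads as an executable specification,
    # and because the pricing rules are injected, the design is pushed
    # towards a small, decoupled collaborator rather than a hidden dependency.
    def test_loyal_customers_get_ten_percent_off(self):
        rules = Mock()
        rules.discount_for.return_value = 0.10

        calculator = DiscountCalculator(rules)

        self.assertAlmostEqual(
            calculator.price_for(customer_id="c-42", base_price=100.0), 90.0
        )


# Step 2: the implementation is written only after watching the test fail,
# and only enough of it to make the test pass. (It lives in the same file
# here purely so the snippet runs as-is.)
class DiscountCalculator:
    def __init__(self, rules):
        self.rules = rules

    def price_for(self, customer_id, base_price):
        return base_price * (1 - self.rules.discount_for(customer_id))


if __name__ == "__main__":
    unittest.main()
```

Notice that the test never touches a database or a real pricing service; the design came out modular because the test demanded it.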
If we outsource test writing to a Coding Copilot, we risk losing that focus on design. LLMs don’t care about testability or modularity; they just mechanically produce test cases for the code. The tests become an afterthought rather than a driver of good design.
It’s worth noting that this applies mostly to unit and integration tests, which sit a lot closer to the code being verified. There’s a strong case for automating end-to-end, performance and smoke tests, as they treat the system as a black box, simply providing inputs and evaluating outputs.
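As a rough illustration, here’s what such a black-box check might look like in Python. The BASE_URL and the /health and /echo endpoints are assumptions made up for this sketch, not a real service; the test only supplies inputs over HTTP and asserts on the outputs, which is exactly why handing its generation to a Coding Copilot loses little design signal.

```python
# A minimal black-box smoke test sketch. BASE_URL and the endpoints below
# are hypothetical; the test knows nothing about the implementation.
import json
import unittest
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8080"  # assumption: the service under test runs here


class SmokeTest(unittest.TestCase):
    def test_health_endpoint_reports_ok(self):
        # Input: a plain GET. Output: an HTTP 200.
        with urlopen(f"{BASE_URL}/health", timeout=5) as response:
            self.assertEqual(response.status, 200)

    def test_echo_round_trips_a_payload(self):
        # Input: a JSON payload. Output: the same message echoed back.
        payload = json.dumps({"message": "ping"}).encode("utf-8")
        request = Request(
            f"{BASE_URL}/echo",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urlopen(request, timeout=5) as response:
            body = json.loads(response.read())
            self.assertEqual(body.get("message"), "ping")


if __name__ == "__main__":
    unittest.main()
```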
The great unlearn
This over-reliance on LLMs for code generation could lead to a gradual deskilling of software engineers, particularly when it comes to design. Less experienced engineers and students just starting their careers might be the most impacted by this trend.
If you can just ask an LLM for code whenever you need it, the incentive to learn software design principles and patterns diminishes significantly. Why spend time understanding SOLID principles when ChatGPT can give you a working solution with a few prompts, in a fraction of the time?
Over time, I worry this leads to a generation of software engineers who can do no more than stitch LLM prompts together and who lack the deeper understanding to design robust, maintainable architectures from first principles.
Does it even matter?
What I’ve written so far is only relevant under the critical assumption that code will still be written, read and maintained by humans. But what if that’s not the case? What if we're heading towards a future where more advanced agentic systems take over the task entirely?
What if these AI systems develop, deploy, and maintain code autonomously? There’s no human in the loop, evaluating design choices or refactoring messy architectures. Code is allowed to proliferate in an unconstrained manner.
Perhaps then, the quality of AI-generated code no longer matters in the same way it does now. The system isn’t built to be read or modified by humans — it’s designed solely for machines to interpret, evolve and run. Over time, codebases could evolve in increasingly convoluted ways, moving further and further from what we currently consider to be readable, well-structured code.
Imagine trying to debug a system like that. Without fundamental observability principles in place, it would be a nightmare: if something goes wrong, there’s no clear path to understanding why, because the layers of AI-generated code aren’t documented, modularised, or even comprehensible anymore. This creates a new kind of technical debt we might find hard to repay as it compounds and rots in ways we can’t predict, let alone control.
I call this the explainability debt: how can humans comprehend and trust the results and output created by machine learning algorithms? Explainable AI is an active research topic.
A dystopian look at autonomous code
A big challenge, if we believe that’s the direction we’re headed, is that these AI-generated systems wouldn’t necessarily be bound by the same design principles that guide engineers today. They wouldn’t naturally prioritise security, maintainability, or even logical coherence. These AI systems might evolve their own “design” conventions: patterns that make sense to the machine but look like a foreign language to us.
Snyk has done research on how coding copilots can amplify existing vulnerabilities in a legacy codebase. While that isn’t quite AI developing its own design conventions, it’s an example of the risk we run if we let AI take over.
In a worst-case scenario, we might find ourselves in a situation where critical infrastructure—financial systems, healthcare software, transportation networks—is governed by a network of AI-written code, built over generations of machine learning models, with no real way to audit or intervene effectively. If a vulnerability or critical error is introduced, who would have the skills or the understanding to fix it? Would we even know where to start? (some might argue we are already there in some systems such as algorithmic trading)
There’s a term for this: algorithmic opacity. As these systems become more autonomous, they also become less explainable, meaning that even the engineers who designed the original systems can’t fully understand them. This is software running itself, forever patched and modified by machines.
Are we doomed?
In short, no.
And I’m not here to say that AI and LLMs are inherently bad; far from it. They offer incredible efficiencies and have the potential to augment our work in ways we couldn’t have imagined a decade ago. But we need to think carefully about how far we want this automation to go.
For now, maintaining human oversight in code generation is essential. There’s value in humans remaining in the loop — designing, questioning, and challenging the output to ensure it’s safe, maintainable, and aligned with our broader goals as engineers. It’s also a fundamental part of teaching the younger generation of software engineers who are just getting started.
I’m an optimist. Code is a very small part of software projects. Between creating infrastructure, addressing scalability concerns and choosing architectural patterns, we are very, very far from AI taking over. But as we continue to integrate AI into the software development lifecycle, we should also be investing in ways to keep it understandable and manageable.
In a future article, I’ll explore how we need to adapt the way we hire software engineering talent by taking these new tools into account, instead of trying to shut them out. Subscribe to make sure you get it as soon as it’s published.
Over to you. How is AI changing the way you work? What systems have you put in place to ensure these advancements work for you?
I think that is the point: YOU are in the driving seat and use Copilot to assist. Absolutely agree that this might have a significant impact on the way younger software engineers will be doing software, and that is why I think understanding computer science fundamentals, along with systems design, engineering principles and best practices, is still a must, so that you can know when Copilot is BSing you.
I've been using Windows Copilot instead of (Bing) search for a while, since it can search and then summarize search results much faster than I can, and it makes it easy to ask follow-up questions shorthand, rather than formulating those as new, standalone search queries.
I'm increasingly using GitHub Copilot in VS Code as well now and it amazes me how much better it's gotten in a short space of time. I mostly find it useful as a pair programmer, making suggestions, reviewing my code, bouncing ideas off it, and exploring new tools and libraries. But I am starting to use it to make edits to my code as well now, rather than just as a glorified auto-suggest, and that's also becoming more and more useful.
I think it'll be very interesting to see how our work as software developers evolves over the next few years, as LLMs continue to improve in speed and accuracy.