A Year of AI-Assisted Development: The Honest Accounting

A year in, it's worth being precise about what changed and what didn't. The AI-assisted development story in tech media tends toward two equally wrong poles: either these tools are transformative and everything is different now, or they're overhyped and nothing works as advertised. My experience lands in neither place.

What the Year Produced

Let's start with what actually exists. Twelve months ago: Copilot in the editor, ChatGPT in a browser tab, no persistent context, full re-briefing overhead every session. Today: Copilot still in the editor; a local knowledge base with roughly 300 entries across five active projects; a retrieval proxy that enriches model requests with relevant context automatically; a CLI for fast knowledge capture; and a clearer picture than I've ever had of where AI assistance fits into a data engineering workflow and where it doesn't.

That's real infrastructure. It took real engineering time to build. It is now part of how I work, not an experiment I'm evaluating.
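For readers who haven't built one, here is a minimal sketch of what "enriching a request with relevant context" can look like. This is not the actual proxy. It assumes an in-memory list of entries and a crude word-overlap score, and every name in it (`Entry`, `retrieve`, `enrich`) is hypothetical. A real system would more likely use embeddings or a proper search index.

```python
from dataclasses import dataclass

# Characters stripped from word edges; deliberately crude tokenization.
_PUNCT = ".,:;()?!"


@dataclass
class Entry:
    project: str  # hypothetical field: which project the fact belongs to
    text: str     # the captured fact itself


def tokenize(text: str) -> set:
    """Lowercased set of words with edge punctuation removed."""
    words = (w.strip(_PUNCT).lower() for w in text.split())
    return {w for w in words if w}


def retrieve(entries, query, project, k=2):
    """Return up to k entries from `project`, ranked by word overlap with the query."""
    q = tokenize(query)
    scored = []
    for e in entries:
        if e.project != project:
            continue
        overlap = len(q & tokenize(e.text))
        if overlap > 0:
            scored.append((overlap, e))
    # Sort by score only, so entries themselves are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in scored[:k]]


def enrich(entries, query, project, k=2):
    """Prepend matching entries to the request; pass it through unchanged if none match."""
    hits = retrieve(entries, query, project, k)
    if not hits:
        return query
    context = "\n".join(f"- {e.text}" for e in hits)
    return f"Project context:\n{context}\n\nRequest:\n{query}"
```

The pass-through when nothing matches is a deliberate choice in this sketch: no context seemed safer than irrelevant context, though that is an assumption rather than something measured here.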

What Actually Improved

Boilerplate overhead: significantly reduced. Copilot handles the structural scaffolding that used to consume a real fraction of implementation time. The time savings are concentrated in the right place — the mechanical parts of the work — and the quality bar on the generated scaffolding is high enough that review is fast.

Context continuity: materially better with the retrieval system in place. Sessions that would have started from zero now start with the relevant facts already present. I have data on this: the number of times I have to correct a model because it lacked project context has dropped by roughly half compared to the early months. That's the retrieval proxy working as designed.

Reasoning quality: dependent on context. When the model has good domain context, the reasoning is genuinely useful and saves time I would have spent thinking through the same problem manually. When the context is thin — new project, poorly documented domain, novel problem type — the model's reasoning is general and often misses domain constraints I would have caught immediately. The quality ceiling is my knowledge base, not the model.

What Didn't Improve

The maintenance burden is still real. Three hundred knowledge entries across five projects require care to remain accurate. The automation I built reduces the ingestion friction, but it doesn't make decisions about what's worth capturing or when an existing entry has become stale. That judgment is still mine.
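The part of this that can be automated is surfacing candidates, not deciding. As a rough illustration, assuming each entry carries a last-reviewed date (the `KnowledgeEntry` type, the `last_reviewed` field, and the 90-day threshold are all hypothetical), a script could list the entries that are overdue for a look, oldest first:

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class KnowledgeEntry:
    title: str
    last_reviewed: date  # hypothetical field: when a human last confirmed the entry


def review_candidates(entries, today, max_age_days=90):
    """Return entries last reviewed before the cutoff, oldest first.

    This only flags entries by age. Whether an entry is actually stale
    is still a human call.
    """
    cutoff = today - timedelta(days=max_age_days)
    overdue = [e for e in entries if e.last_reviewed < cutoff]
    return sorted(overdue, key=lambda e: e.last_reviewed)
```

Age is a weak proxy for staleness: an old entry can still be correct, and a recent one can already be wrong, which is why the output is a review queue rather than a deletion list.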

Novel problem domains are still slow. When I encounter a completely new data system, a technology stack I haven't used before, or a business domain where I have no prior knowledge, AI assistance provides only marginal help. The model doesn't know the specifics, and I haven't built a knowledge base for it yet. I'm on my own until I've accumulated enough context to make retrieval useful.

Trust calibration is constant work. Using a model confidently requires knowing its failure modes well enough to catch them. That calibration differs across model versions, task types, and amounts of context provided. There is no stable calibration point: as models improve and change, the failure modes shift. You have to stay engaged with what the tool is actually doing rather than running it on autopilot.

The Honest Assessment

AI-assisted development, done deliberately, is worth the investment — for someone with a specific profile. You need enough engineering depth to evaluate AI output critically. You need enough time to build and maintain context infrastructure. You need enough tolerance for a tool that works well most of the time and fails in specific, non-obvious ways some of the time.

That's not a mass-market profile. It's the profile of a senior engineer who is willing to treat the AI toolchain as part of the engineering problem, not just part of the toolbox. The version of this that works without significant setup, without ongoing maintenance, without domain expertise in the user — that version doesn't exist yet. What exists is a set of tools that reward investment in proportion to the sophistication of that investment.

Year two will be about refining the retrieval system, extending the proxy to handle more complex injection patterns, and exploring whether the context management burden can be reduced further through smarter automation. The problems are clear. The solutions are still being built.

If you've been on a similar path this year and landed somewhere different, I'd genuinely like to hear where and why.
