slower with AI tools.
The METR study found experienced developers were 19% slower. Not a bug. A feature of knowing too much to trust the output.
Why senior developers are getting slower with AI tools, according to the METR study
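The METR study (2025) on AI coding tool productivity produced a counterintuitive result: experienced developers took 19% longer to complete tasks when they were allowed to use AI coding tools than when they were not. The study was a randomized trial on real-world development tasks, carried out by experienced engineers working in codebases they already knew well. The finding contradicts the dominant narrative that AI tools make everyone faster. They don't. The METR result covers experienced developers only, but set against the speed gains junior developers report, the pattern looks like this: the tools tend to make junior developers faster and can make senior developers slower.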
The reason isn't hard to reconstruct once you understand how senior developers work. They know enough to spot problems in AI output. And spotting problems requires reading, understanding, and often rewriting the generated code. That process takes time. For complex tasks, it takes more time than writing the solution directly.
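The trade-off described above can be made concrete with a back-of-the-envelope cost model. The sketch below is illustrative only: the function names and every number in it are assumptions chosen for the example, not figures from the METR study.

```python
def ai_assisted_minutes(prompt, review, rewrite, rewrite_probability):
    """Expected minutes when starting from AI output (all inputs are hypothetical)."""
    return prompt + review + rewrite_probability * rewrite


def ai_is_faster(direct, prompt, review, rewrite, rewrite_probability):
    """True when the expected AI-assisted time beats writing the code directly."""
    assisted = ai_assisted_minutes(prompt, review, rewrite, rewrite_probability)
    return assisted < direct


# Hypothetical boilerplate task: quick to review, rarely rewritten.
boilerplate = ai_is_faster(direct=20, prompt=2, review=3, rewrite=10,
                           rewrite_probability=0.1)

# Hypothetical concurrency fix: slow to review, often rewritten.
concurrency = ai_is_faster(direct=60, prompt=5, review=40, rewrite=45,
                           rewrite_probability=0.5)

print(boilerplate, concurrency)  # True False
```

With these made-up inputs, the review and rewrite terms dominate the complex task: the expected AI-assisted time is 67.5 minutes against 60 minutes written directly. The model is deliberately crude, but it shows why the same tool can help on one task and hurt on another.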
This doesn't make AI tools bad. It makes them context-dependent tools that senior developers need to use differently than junior developers do.
The verification tax: why senior developers can't just trust AI output
The Stack Overflow 2025 Developer Survey found that only 3.1% of developers "highly trust" AI output. That low trust baseline is rational. AI coding tools produce plausible code, not necessarily correct code. The difference matters most for complex tasks, where the errors are subtle and the consequences of missing them are significant.
Senior developers sit at both ends of this dynamic. They're the ones who understand the codebase well enough to catch subtle errors. They're also the ones assigned the complex tasks where those errors are most likely to occur. So they read the AI output carefully, run it mentally against edge cases, check it against the existing architecture, and often rewrite significant portions before they're satisfied.
I've noticed this in my own development work. For straightforward tasks, generating a function with AI and reviewing it takes about the same time as writing it. For anything involving state management, concurrency, or domain-specific business logic, the review takes longer than writing from scratch would have, because I need to understand the generated code before I can trust it.
Only 3.1% of developers highly trust AI output. Senior developers are disproportionately in the 96.9% who don't.
Why junior developers see speed gains that senior developers don't
The METR study's finding makes more sense when you consider the junior developer experience. Junior developers are often working on tasks where they're less certain of the correct approach. An AI suggestion that's 80% correct is more useful to someone who wasn't sure how to start than to someone who would have written it differently from the beginning.
Junior developers also verify less thoroughly, partly because they have less context to verify against and partly because the confidence gap between "AI said this" and "I would have written this" is smaller. When you're less certain of your own approach, the AI's approach looks more authoritative.
Senior developers have the opposite problem. They know the codebase, the domain, the architecture, and the team's conventions. Every AI suggestion gets measured against all of that context. Most suggestions survive that review. But the review itself is not free.
The METR finding suggests a counterintuitive staffing implication: AI tools might be more cost-effective when deployed to assist junior developers on well-defined tasks than when deployed to assist senior developers on complex ones. The organizations getting the best ROI from AI coding tools may be the ones that have figured out this task-to-experience matching, not the ones that have deployed the tools uniformly.
What kinds of tasks actually make senior developers faster with AI tools
The 19% slowdown is a task-weighted average. It doesn't mean senior developers are slower on every task. It means the aggregate effect across the tasks measured was negative. For specific task types, the direction reverses.
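A small worked example shows how an aggregate slowdown can coexist with speedups on individual task types. The task mix and hours below are hypothetical, constructed so the total lands on 19%; they are not METR's data and do not reproduce the study's method.

```python
# Hypothetical task mix: (category, hours without AI, hours with AI).
tasks = [
    ("boilerplate", 10, 7),
    ("tests", 10, 8),
    ("debugging", 40, 52),
    ("architecture", 40, 52),
]


def percent_change(without_ai, with_ai):
    """Positive means slower with AI, negative means faster."""
    return 100 * (with_ai - without_ai) / without_ai


# Per-category effect: the sign differs between rote and complex work.
per_category = {name: percent_change(a, b) for name, a, b in tasks}

# Aggregate effect: total hours with AI versus total hours without.
total_without = sum(a for _, a, _ in tasks)
total_with = sum(b for _, _, b in tasks)
aggregate = percent_change(total_without, total_with)

for name, pct in per_category.items():
    print(f"{name}: {pct:+.0f}%")
print(f"aggregate: {aggregate:+.0f}%")
```

In this made-up mix, boilerplate and tests come out 30% and 20% faster, yet the aggregate is 19% slower because the complex categories account for most of the hours.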
Boilerplate generation is the clearest win. Senior developers know exactly what they want, can evaluate the output instantly, and don't need to think through the structure. The AI generates it, they confirm it in five seconds, they move on. Test generation is similar. Documentation generation is similar. The AI handles the rote work. The senior developer's job is a quick quality check rather than a full composition task.
The senior developer slowdown appears on the complex tasks: system design, architectural decisions, domain-specific logic, debugging subtle concurrency issues. These are exactly the tasks where AI assistance is most appealing and most risky. They're the tasks where a plausible-looking wrong answer is the most dangerous output the AI can produce.
A look at the METR study's methodology shows that the tasks given to the experienced developers were weighted toward the complex end of the development spectrum: debugging, refactoring, and extending existing systems. These are precisely the task types where verification overhead is highest. A study weighted toward boilerplate generation would likely have produced different numbers.
How senior developers should actually use AI coding tools
The practical takeaway from the METR study isn't "senior developers shouldn't use AI tools." It's "senior developers need to be deliberate about which tasks they apply AI tools to." The tools save time on tasks where the verification cost is low: boilerplate, tests, documentation, repetitive transformations. They cost time on tasks where the verification cost is high: complex logic, system design, debugging at the architectural level.
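As a rough sketch, that rule of thumb can be written down as a lookup. The task categories come from the paragraph above; the `TASK_COST` table, the `suggest_workflow` helper, and the choice to treat unknown task types as high-cost are all assumptions made for illustration.

```python
from enum import Enum


class VerificationCost(Enum):
    LOW = "low"
    HIGH = "high"


# Categories named in the article; the mapping itself is a hypothetical helper.
TASK_COST = {
    "boilerplate": VerificationCost.LOW,
    "tests": VerificationCost.LOW,
    "documentation": VerificationCost.LOW,
    "repetitive transformation": VerificationCost.LOW,
    "complex logic": VerificationCost.HIGH,
    "system design": VerificationCost.HIGH,
    "architectural debugging": VerificationCost.HIGH,
}


def suggest_workflow(task_type: str) -> str:
    """Suggest a starting point based on how costly the output is to verify."""
    # Assumption: anything not in the table is treated as expensive to verify.
    cost = TASK_COST.get(task_type, VerificationCost.HIGH)
    if cost is VerificationCost.LOW:
        return "generate with AI, then skim"
    return "write directly"


print(suggest_workflow("boilerplate"))    # generate with AI, then skim
print(suggest_workflow("system design"))  # write directly
```

Defaulting unknown work to the high-cost path is the conservative choice: misclassifying a complex task as cheap to verify is the more expensive mistake.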
The developers who've figured this out treat AI assistance the way they treat a code review from a junior developer. You take the suggestions seriously. You verify them against your own knowledge. You accept the good ones and rewrite the bad ones. You don't let the process slow you down by reviewing the obvious ones too carefully.
GitHub's research on AI coding assistants found that developers who use AI tools for the right task types report satisfaction even when overall speed gains are modest. The satisfaction comes from reduced cognitive load on rote work, not from replacing complex thinking.
Frequently Asked Questions
Why are senior developers slower with AI coding tools according to the METR study?
The METR 2025 study found that experienced developers took 19% longer to complete tasks when AI coding tools were allowed than when they were not. The explanation is verification overhead: senior developers have enough context to catch errors in AI output, and catching errors requires reading, understanding, and often rewriting generated code. For complex tasks, that verification process takes more time than writing the solution directly would have. Junior developers, who tend to verify less thoroughly, are more likely to see speed gains instead.
Do AI coding tools make all developers slower?
No. The finding that senior developers are slower with AI tools is specific to experienced developers on complex tasks. Junior developers consistently report speed gains from AI coding assistance, likely because they verify output less thoroughly and benefit more from having a starting point to react to. Senior developers benefit from AI tools on low-verification tasks like boilerplate generation, test scaffolding, and documentation. The aggregate effect depends heavily on the task mix and the developer's experience level.
What does the 3.1% trust figure from Stack Overflow mean for AI tool adoption?
Stack Overflow's 2025 Developer Survey found that only 3.1% of developers "highly trust" AI output. For the remaining 96.9%, every AI suggestion requires some level of verification before being committed. That verification overhead is the hidden cost of AI coding tools that productivity studies sometimes undercount. For senior developers handling complex tasks, the verification cost is highest because they have the most context to verify against.
On which tasks do AI coding tools actually make senior developers faster?
Senior developers see the clearest time savings on tasks with low verification overhead: boilerplate generation, test scaffolding, repetitive code transformations, and documentation. For these tasks, the AI generates output that a senior developer can evaluate in seconds rather than minutes. The slowdown appears on complex tasks: debugging, system design, domain-specific business logic, and architectural decisions, where verification requires careful review rather than a quick scan.
Should senior developers stop using AI coding tools?
No. The METR study's 19% slowdown finding is a task-weighted average, not a universal result. Senior developers who match AI tool use to task type, applying assistance on low-verification work and writing directly for complex logic, report net positive outcomes. The mistake is treating AI assistance as uniformly applicable across all task types regardless of verification cost. Selective use beats uniform use for experienced developers on complex codebases.