The Productivity Paradox - Where Do AI Gains Go in a Time-and-Materials World?

"If you're an IT services player billing by the hour, and your clients now expect the same deliverables faster and cheaper thanks to AI, that hits your margins and your pyramid at the same time." — Everest Group, February 2026
Introduction - A Model Built for a Different Era
The time-and-materials contract is one of the oldest and most durable commercial arrangements in professional services. It is honest in its simplicity - you pay for the time of skilled people and the cost of materials they consume. Its durability comes from a shared premise - that expertise is scarce, that the work is complex, and that predicting effort upfront is genuinely hard. For decades, the software services industry built itself on exactly this premise.
Artificial intelligence is now attacking all three of those pillars simultaneously.
AI-augmented software engineering is compressing the time that tasks take, commoditising portions (not every aspect) of work that were once exclusive to skilled practitioners, and making certain kinds of scope more predictable than ever before. The productivity gains are real - even if their magnitude is disputed and their distribution is uneven. And this creates a structural puzzle for every software services company still operating on a time-and-materials basis - when AI makes your engineers faster, where does the value go?
The answer is not straightforward. It depends on who captures the efficiency, how contracts are written, how clients are educated, and - ultimately - how bold services firms are willing to be in reinventing their commercial models. In this post we examine all of those dimensions in depth.
Part I: What the Productivity Data Actually Says
Before examining the commercial implications, it is worth grounding the conversation in what the research actually shows about AI's impact on software engineering productivity - because the picture is more nuanced than the headlines suggest.
The Optimistic View
Gartner's analysis projects that teams applying AI tools comprehensively across the entire software development lifecycle - not just code generation, but planning, testing, documentation, and maintenance - will achieve 25–30% productivity gains by 2028, up from roughly 10% for code-generation-only approaches in 2024. Early studies by GitHub suggested developers could write code as much as 55% faster on certain tasks. Faros AI's analysis of over 10,000 developers found that engineers on high AI-adoption teams complete 21% more tasks and merge 98% more pull requests than their peers. These are meaningful numbers, and they represent genuine value creation.
The Paradox Beneath the Headlines
The more rigorous picture complicates the celebration. A 2025 METR randomised controlled trial - conducted with experienced open-source developers working in mature codebases they knew well - found that AI tools actually increased task completion time by 19%. Developers themselves predicted AI would make them 24% faster. The reality was the opposite. The finding does not mean AI is useless - it likely means that for complex, context-dependent work in unfamiliar or legacy environments, the cognitive overhead of directing, verifying, and correcting AI outputs can exceed the time saved.
The Faros AI Productivity Paradox report found something equally telling - while individual developers and small teams show measurable throughput gains, 75% of engineers now use AI tools, yet most organisations see no measurable performance gains in delivery. The bottleneck moves. AI accelerates code generation only to expose the constraint of code review, which in the same dataset saw PR review times increase by 91% - nearly doubling - as AI-assisted developers submitted more code, more frequently, with more issues requiring scrutiny.
The 2025 DORA report crystallises this dynamic most precisely. AI, it concludes, does not automatically improve software delivery performance. Instead, it acts as a multiplier of existing engineering conditions — amplifying the performance of teams with mature DevOps practices, and amplifying the dysfunction of teams without them.
What This Means for Services Firms
The implication for T&M services companies is significant. Productivity gains from AI are real but uneven, contextual, and conditional. They are most reliable in greenfield development, well-scoped work, and teams that have invested in AI workflow integration. They are least reliable - and can even be negative - in legacy modernisation, complex integration work, and mature codebases with deep technical debt. The irony is that complex, legacy-heavy work is precisely where enterprise clients most often hire services firms.
This creates a non-trivial asymmetry - firms that rush to promise AI-driven productivity gains may find themselves delivering exactly the kind of work where AI struggles most.
Part II: The Fundamental Structural Problem
The Efficiency Trap
The core commercial problem with AI augmentation in a T&M model can be stated simply. Under T&M, revenue is a function of two variables: rate × hours. AI increases output per hour. If the number of hours billed falls proportionately, revenue falls. The efficiency gain is passed entirely to the client, and the services firm absorbs the investment cost of AI tooling, training, and workflow transformation without a corresponding financial return.
This is not a hypothetical. Simon-Kucher's research on professional services firms finds exactly this - "If 10 hours become five hours of work, clients should expect an hourly fee at a higher rate." But in competitive markets, clients rarely accept rate increases driven by vendor-side efficiency improvements. They interpret faster delivery as a signal that the work was overpriced to begin with.
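The arithmetic of the trap can be made concrete with a back-of-the-envelope sketch - every figure below (rate, hours, efficiency gain) is hypothetical, chosen only to show the mechanics, not drawn from any cited study.

```python
# Illustrative T&M economics under AI-driven efficiency (all figures hypothetical).
RATE = 100              # blended hourly rate
BASELINE_HOURS = 1000   # hours to deliver the scope before AI augmentation
EFFICIENCY_GAIN = 0.25  # assume AI makes the team 25% more productive

# Same scope, fewer billable hours.
ai_hours = BASELINE_HOURS / (1 + EFFICIENCY_GAIN)

baseline_revenue = RATE * BASELINE_HOURS
ai_revenue = RATE * ai_hours

print(f"Baseline revenue:            {baseline_revenue:>9,.0f}")
print(f"Revenue after AI, same rate: {ai_revenue:>9,.0f}")
print(f"Efficiency passed to client: {baseline_revenue - ai_revenue:>9,.0f}")
```

With these assumptions, a 25% productivity gain quietly transfers a fifth of the engagement's revenue to the client unless rates, scope, or the commercial model change.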
The problem runs deeper still. The traditional IT services pyramid - large volumes of junior engineers performing well-defined, repeatable tasks, supervised by smaller numbers of senior staff - is precisely the structure that AI threatens most directly. As Everest Group's Abhivyakti Sengar describes it, this is structural rather than cyclical - "The biggest losers aren't countries, they're business models. Anything priced on human throughput is now competing with software that can do the throughput." Application services - custom software development, deployment, and maintenance - account for 40–70% of revenues for major IT services firms. This is the segment most exposed to AI compression.
The Pyramid Flattening
Entry-level and mid-level software engineering tasks are the most vulnerable. AI can already handle significant portions of code generation, test case creation, documentation, basic debugging, and ticket triage. These are precisely the tasks that have historically justified large junior-engineer headcounts. Indian IT majors - TCS, Infosys, Wipro - have already felt this structurally - their shares fell 6% in a single session in February 2026 as the market began to price in the reality that AI tools could allow clients to do more with fewer contracted staff.
This does not mean junior engineers become worthless. It means their role changes — from code authors to AI directors, reviewers, and integrators. But the commercial consequence is real - if ten junior engineers can now do the work of fifteen with AI assistance, and a client knows this, the pressure to reduce headcount billing is fierce.
The Attribution Problem
Perhaps the most practically frustrating challenge is one of attribution. If your team ships a feature in six days instead of twelve, how much of that gain was AI, how much was team seniority, how much was the clarity of requirements, and how much was just that this particular problem happened to be well-suited to the tools at hand? CloudZero's research found that only 51% of organisations strongly agree they can track AI ROI effectively, even though 91% claim overall confidence in evaluating it - a telling gap between perception and measurement precision.
For services firms, this attribution fog cuts both ways. It makes it harder to prove AI value to clients, but it also provides some cover when AI-assisted work does not deliver the expected gains. Neither outcome is commercially sustainable.
Part III: Who Actually Captures the Gains?
This is the question that sits at the heart of the T&M dilemma, and the honest answer is - it depends entirely on what your firm does with the time saved.
Scenario 1: The Firm Pockets the Margin (Short-Term Win, Long-Term Risk)
If an engineering team of ten can now do the work previously requiring twelve, and the client is billed for ten engineers at the same rates, the firm captures the gain in the form of improved utilisation and the ability to redeploy two engineers to other projects. This is the short-term default behaviour for most firms. It works, until it doesn't.
The risk is that the client eventually figures it out - through benchmarking, competitive bids, or their own growing AI literacy - and begins demanding rate reductions, smaller teams, or faster delivery at the same cost. When the efficiency gain becomes visible, it becomes a negotiating lever in the client's hands.
Scenario 2: The Client Gets It All (Competitive Pressure)
In highly competitive T&M markets, firms under price pressure pass AI gains directly to clients through reduced hours, smaller teams, or price reductions to win bids. The firm invests in AI tooling and training, compresses its margins to match lower-priced competitors, and ends up subsidising client-side productivity improvements from its own P&L. This is the worst commercial outcome - and it is the default trajectory for undifferentiated services firms in commoditised markets.
Scenario 3: The Gains Are Reinvested in Quality and Speed (The Right Move)
The most strategically sound path - and the one supported by Gartner's analysis - is to reinvest AI-generated time savings into activities that create client-visible value without reducing billable hours. This means using gained capacity for refactoring technical debt on the go, more thorough testing, better documentation, architecture review, security analysis, and proactive problem identification. The DORA report's insight applies here - AI amplifies the quality of the engineering system around it. A firm that uses AI to improve its engineering quality, not just its velocity, creates value that justifies its rates without surfacing the efficiency gain as a negotiating target.
Scenario 4: The Firm Transitions to a New Commercial Structure
The fourth and most forward-looking response is to use AI gains as the enabling condition for a different pricing model altogether - one where the firm charges for outcomes, not hours. This path is harder and slower, but it is the only one that makes AI augmentation permanently accretive to firm economics. We will address this in detail in Part V.
Part IV: Showing the Gains to Clients - The Demonstration Problem
Even when a services firm genuinely believes it is delivering more value through AI augmentation, communicating and proving that value to clients is a distinct and difficult challenge.
The Trust Deficit
Clients have become justifiably sceptical of vendor AI claims. The gap between vendor marketing ("10x productivity") and empirical reality (20–30% gains, often offset by new bottlenecks) has eroded credibility. When a firm says "our engineers are AI-augmented", a sophisticated client hears, "we are billing you for work that our tools are doing." The burden of proof falls entirely on the services firm.
Metrics That Matter to Clients
The traditional T&M dashboard — hours logged, tickets closed, sprints completed — is inadequate for demonstrating AI-derived value. Clients care about a different set of outcomes entirely.
Velocity metrics: How quickly are features going from specification to production? Cycle time, lead time, and deployment frequency are the DORA metrics that map most directly to business impact. If AI is being used well, these should improve. Track them, publish them, make them part of the contractual conversation.
Quality metrics: Defect escape rate, change failure rate, and mean time to recovery tell clients whether faster delivery is also better delivery. This matters especially because AI-generated code carries real quality risks - "code churn" (code discarded within two weeks of being written) is rising, and copy-pasted AI-generated code that lacks architectural cohesion is a well-documented problem. Demonstrating that your AI governance practices keep quality high is a genuine differentiator.
Value metrics: Feature business impact, user adoption of delivered functionality, reduction in client-side rework and maintenance burden. These require a deeper partnership with the client, but they are the metrics that transform a services relationship from a cost centre conversation to a value creation conversation.
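A minimal sketch of how such a client-facing view might be computed from delivery events - the event records, field names, and dates here are invented for illustration; a real pipeline would pull them from Git, CI/CD, and incident tooling.

```python
from datetime import datetime
from statistics import median

# Hypothetical delivery events: (feature, spec_agreed, deployed, caused_failure)
events = [
    ("checkout-v2",   datetime(2026, 1, 5),  datetime(2026, 1, 12), False),
    ("search-facets", datetime(2026, 1, 8),  datetime(2026, 1, 19), True),
    ("sso-login",     datetime(2026, 1, 15), datetime(2026, 1, 22), False),
    ("audit-export",  datetime(2026, 1, 20), datetime(2026, 1, 26), False),
]

# Lead time: specification to production, in days (a DORA-style measure).
lead_times = [(deployed - spec).days for _, spec, deployed, _ in events]
print("Median lead time (days):", median(lead_times))

# Deployment frequency over the observed window.
window_days = (max(e[2] for e in events) - min(e[1] for e in events)).days
print("Deployments per week:", round(len(events) / (window_days / 7), 2))

# Change failure rate: share of deployments that caused a failure.
cfr = sum(1 for e in events if e[3]) / len(events)
print(f"Change failure rate: {cfr:.0%}")
```

The point is less the code than the habit: metrics computed continuously from delivery data, not asserted in a quarterly deck.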
The Transparency Imperative
One of the most underrated tools for demonstrating AI value is radical transparency about AI usage itself. Firms that publish AI adoption rates, the specific tools in use, their internal QA processes for AI-generated code, and their governance frameworks for AI risk (hallucinations, IP exposure, security vulnerabilities) are building a different kind of trust than the opaque "we use the latest tools" positioning.
Clients are increasingly asking for this disclosure. Evolving buyer expectations in 2026 include explicit demands for real-time dashboards tracking resource utilisation and AI-assisted productivity metrics. Firms that get ahead of this, rather than waiting to be asked, position themselves as premium partners rather than commodity vendors.
The Joint Baseline Problem
One of the most practically useful things a services firm can do at the start of any engagement is establish a joint baseline - a mutually agreed measurement of current velocity, quality, and cost - so that improvements attributable to the engagement (and specifically to AI augmentation) can be demonstrated against that baseline. This requires client cooperation and some contractual scaffolding, but it is the foundation of any credible value demonstration.
Without a baseline, every claimed improvement is an assertion. With one, it is evidence.
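In code, a joint baseline is little more than agreed metrics plus percentage deltas - the metric names and values below are placeholders for whatever the two parties actually agree to measure, not a prescribed set.

```python
# Jointly agreed baseline vs. values measured later (hypothetical numbers).
baseline = {"median_cycle_time_days": 12.0, "deploys_per_month": 6, "escaped_defects": 9}
current  = {"median_cycle_time_days":  8.0, "deploys_per_month": 10, "escaped_defects": 7}

# Whether "up" or "down" counts as an improvement depends on the metric.
lower_is_better = {"median_cycle_time_days", "escaped_defects"}

for metric, base in baseline.items():
    delta = (current[metric] - base) / base * 100
    improved = delta < 0 if metric in lower_is_better else delta > 0
    print(f"{metric}: {delta:+.1f}% ({'improved' if improved else 'regressed'})")
```

The hard part is not the computation but the contractual scaffolding: both sides must sign off on the baseline figures before the engagement starts.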
Part V: The Future of T&M Engagements
The long-term pressure on pure T&M is real and structural. The question is not whether the model will evolve, but at what pace, in which direction, and with what consequences for firms that do or do not adapt.
The Models on the Horizon
Rate-Plus-Tooling (Transitional) The most immediate adaptation is to maintain T&M billing but add an explicit line item for AI tooling, infrastructure, and governance overhead. This is honest - firms incur real costs for Copilot licences, LLM API access, agentic tooling, and the time spent on AI workflow design - and it frames AI as an explicit service component rather than an invisible cost. Clients will push back, but the conversation at least makes AI costs and benefits explicit.
Velocity-Adjusted T&M (Near-Term) Rather than billing purely on hours, firms can introduce a velocity adjustment layer - a baseline expected output rate, measured against agreed metrics, with rate or team-size adjustments based on actual performance. If AI enables the team to exceed baseline velocity by 20%, the firm captures a share of that through a pre-agreed sharing mechanism. This is T&M with performance incentives layered on top - complex to administer, but increasingly contractually viable.
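One way such a sharing mechanism could be expressed in contract arithmetic - the baseline, per-unit rate, and 50/50 split below are illustrative assumptions, not a standard formula.

```python
# Velocity-adjusted T&M: the firm keeps a pre-agreed share of over-baseline output.
BASELINE_POINTS = 100    # agreed output per billing period at contract start
RATE_PER_POINT  = 1000   # implied price per unit of output at baseline
FIRM_SHARE      = 0.5    # split of the value of output delivered above baseline

def period_invoice(actual_points: float) -> float:
    """Base fee plus the firm's share of value delivered above baseline."""
    base_fee = BASELINE_POINTS * RATE_PER_POINT
    surplus = max(0.0, actual_points - BASELINE_POINTS) * RATE_PER_POINT
    return base_fee + FIRM_SHARE * surplus

print(period_invoice(100))  # on baseline: the plain T&M fee
print(period_invoice(120))  # 20% above baseline: firm and client split the extra value
```

Under these assumptions a team that beats baseline by 20% bills 110% of the base fee, while the client receives 120% of the baseline output - both sides gain from the AI uplift.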
Output-Based / Milestone / Deliverable Pricing (Emerging) Tying payments to delivered work items, project milestones or specific deliverables (features, user stories, etc.) rather than time spent is a more radical departure from T&M, but it is gaining traction. The client pays when features go live, or when they are accepted as developer-complete or tested (depending on team setup), not when hours are logged. This shifts risk to the services firm - scope must be well-defined - but it also removes the perverse incentive to slow-walk delivery. AI makes this model more viable because it makes estimation more precise for well-scoped work.
Outcome-Based Contracting (The Future) The most sophisticated evolution ties vendor revenue directly to business outcomes the client achieves - a percentage uplift in system performance, a measurable reduction in operational incidents, a quantified decrease in customer support volume following a platform migration. Firms like Zendesk and Intercom have pioneered this in product contexts (charging only for resolved tickets), and the model is spreading into services. The challenge is measurement, attribution, and the need for genuinely airtight contract definitions of what constitutes a successful outcome. McMann & Ransford's research is direct - T&M "by design, punishes efficiency unless growth or pricing evolves alongside it. Clients are increasingly questioning what they're paying for - and how."
Retained Capability Models - a.k.a. Managed Services Rather than selling project-based engagements, some firms are shifting to retained capability arrangements - a fixed monthly fee for a dedicated, AI-augmented engineering squad with agreed capacity. The client pays for assured access to a capability and a known output rate, not for individual hours. This is closer to a subscription than a T&M contract, and it makes AI productivity gains accretive to the firm's economics - the same retained fee, delivered with greater efficiency, means higher margin.
The Talent Pyramid Will Invert
The engineering talent model that underpins T&M pricing is being restructured. The traditional pyramid - many junior engineers at the base, fewer seniors at the top - will flatten and then invert. AI handles the high-volume, repeatable work that junior engineers have historically performed. Senior engineers, who possess the system thinking, client communication, architecture judgment, and problem framing skills that AI cannot replicate, become proportionally more valuable. Firms that build their commercial model around AI-augmented senior engineers, pricing for expertise and judgment rather than raw hours, will be better positioned than those that try to maintain large junior headcounts in the face of AI compression.
This has significant implications for hiring, career laddering, and the economics of offshore delivery models built on labour arbitrage. As Everest Group's Yugal Joshi summarises - "The timelines of engagements will massively shrink further, impacting billing."
The Clients Are Changing Too
It would be a mistake to analyse the future of T&M in isolation from client-side change. Enterprise clients are themselves undergoing AI transformation. Many are building internal AI capabilities, training their own engineers on AI tools, and developing AI literacy in their procurement and vendor management functions. The result is a more informed, more demanding buyer - one who knows what AI can do, has benchmarks for AI-assisted engineering velocity, and will not accept traditional T&M pricing for work that is visibly AI-accelerated.
At the same time, the most sophisticated clients are discovering that AI creates new categories of need - private LLM deployment and versioning, AI governance frameworks, data pipeline modernisation for AI readiness, model evaluation and observability - that they need trusted services partners to address. Accenture's AI bookings hit nearly $5 billion run-rate in Q1 2025 precisely because they were positioned to capture this new demand. For mid-market and specialist services firms, the equivalent opportunity is to become the trusted AI transformation partner for their client verticals - a fundamentally different positioning from commodity development labour.
Part VI: A New Playbook for Services Firms
None of this is hypothetical. The firms that will thrive through this transition are those that start moving now, before competitive pressure and client demands force a reactive rather than strategic response.
1. Invest in AI Engineering Maturity Before Selling It
The Faros AI research is instructive - AI productivity gains require five enabling factors - workflow design, governance, infrastructure, training, and cross-functional alignment. Companies that arm their engineers with AI tools without addressing these systemic enablers will see the bottlenecks shift without the gains materialising. The investment in AI maturity has to precede the commercial promise.
2. Build Measurement Infrastructure
You cannot demonstrate value you cannot measure. Instrument your delivery pipelines - DORA metrics, AI usage telemetry, code quality analytics - and create client-facing dashboards that tell the story of AI-augmented delivery. This is both a sales tool and a risk management mechanism - if AI usage is visible, so are the quality safeguards around it.
3. Redesign Commercial Conversations
Stop leading with headcount and hours. Lead with outcome commitments, velocity benchmarks, and quality guarantees. Even within a T&M structure, the narrative should be about what clients get, not how many people they are paying for. This requires retrained account managers and delivery leads - people who can speak fluently about DORA metrics, AI ROI, and value streams.
4. Pilot Alternative Models on New Engagements
Waiting for the perfect commercial framework before experimenting is a recipe for being outpaced. Pick three or four new engagements and pilot milestone-based or velocity-adjusted models on them. Learn what measurement challenges arise, what client pushback occurs, and what contract language is needed. Build organisational muscle before the whole business needs to transition.
5. Differentiate on AI Governance and Trust
Code quality risks from AI are real and growing - rising code churn, architectural incoherence, and security vulnerabilities embedded in AI-generated code. Firms that invest in rigorous AI code review standards, prompt engineering quality control, and transparent AI usage disclosure will differentiate on exactly the dimensions that nervous enterprise clients care about most. "We use AI responsibly" is not marketing - it is increasingly a procurement requirement.
6. Shift Up the Value Stack
The most durable response to AI commoditising execution is to invest in the parts of the services value chain that AI cannot (yet) replicate - ownership, the know-how to deliver and manage software successfully, system architecture, stakeholder alignment, change management, product thinking, and strategic technology advice. These are the high-value activities that justify premium rates and make a services firm genuinely hard to replace. The T&M model for these activities may persist longer than it does for execution, precisely because the value is knowledge and judgment rather than throughput.
Conclusion: The Reckoning Is Already Here
The question facing software services firms operating on time-and-materials models is not whether AI will disrupt their commercial model. It already is. The question is whether they will be the agent of that disruption - capturing AI gains through smarter commercial structures, deeper client value delivery, and reinvented service propositions - or its victim, watching margins compress as clients demand faster, cheaper delivery without any framework to capture the value of the AI investment.
The productivity paradox at the heart of this moment is real - AI is everywhere in software engineering, yet most organisations see no measurable system-level gains in delivery, or at least cannot verify them with data. The gains are being absorbed into review bottlenecks, quality remediation, and AI-induced technical debt. For services firms, this paradox contains a hidden opportunity. The client who is struggling to capture or measure AI productivity gains from their internal teams has a problem. A services firm that has solved that problem - that has built the workflow, governance, and tooling infrastructure to reliably convert AI augmentation into delivery improvement - has something genuinely worth selling.
That is the transformation available to T&M firms willing to make it. Not just deploying AI tools, but building the engineering system that makes AI tools deliver. Not just billing for hours but charging for the outcomes those hours produce. Not just surviving the disruption of AI but becoming the partner clients need to navigate it.
The billable hour is not dead. But it is no longer a safe business model for the future.
Sources
- Gartner — Don't Limit AI in Software Engineering to Coding
- METR — Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
- arXiv — METR Study Paper (2507.09089)
- DevOps.com — AI in Software Development: Productivity at the Cost of Code Quality?
- Faros AI — The AI Productivity Paradox Research Report
- Addy Osmani / Substack — The Reality of AI-Assisted Software Engineering Productivity
- iDevNews — Gartner: AI-Augmented Development Hits Radar for 50%+ of Enterprises
- InfoQ — AI Is Amplifying Software Engineering Performance, Says the 2025 DORA Report
- Index.dev — AI Coding Assistant ROI: Real Productivity Data 2025
- Simon-Kucher — Generative AI and the Price Model Revolution in Professional Services
- Simon-Kucher — Transform Your Revenue Model with Outcomes-Based Contracting
- Rest of World — Will AI Kill Indian IT? The $300B Billable Hour Reality Check
- Asia Financial — Anthropic's New Tools Show AI Risk to Indian IT Services Revenues
- CIO.com — AI Workflow Tools Could Change Work Across the Enterprise
- Dawn Capital / Medium — AI × IT Services: Europe's €200 Billion Industry
- Managed Services Journal — 3 IT Service Trends That Changed the Game in 2025
- McMann & Ransford — The Death of T&M: The Inevitable Future of Pricing
- CloudZero — The State of AI Costs in 2025
- Codebridge — Software Development Outsourcing Rates 2026: Costs and Trends
- Rezoomex — The Shift to Outcome-Based Pricing in Technology
- L.E.K. Consulting — The Rise of Outcome-Based Pricing in SaaS
- DealHub — What is Outcome-Based Pricing?
- Pragmatic Institute — Understanding Outcome-Based Pricing
- Getmonetizely — Outcome-Based Pricing: The Next Frontier in SaaS?
- EY — SaaS Transformation with GenAI: Outcome-Based Pricing
- Saigon Technology — Fixed Price Software Development (2026): Comparison With Time and Material
- Above the Law — Will AI Really Move the Needle on the Billable Hour?
- Timerewards — Billable vs Non-Billable Hours: Complete Guide 2025
- Mosaicapp — Billable Utilization Rate Statistics in Professional Services Firms
- PwC — 2026 AI Business Predictions
- Marc Nuri — Boosting My Developer Productivity with AI in 2025
- Master of Code — How Does AI Reduce Costs? Save 5–20% Across Operations
- Flexprice — Why AI Companies Have Adopted Usage-Based Pricing in 2026
- Metronome — The Next Big Billing Wars: What AI Companies Will Demand
- Metronome — AI Pricing in Practice: 2025 Field Report from Leading SaaS Teams