The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper argues that the model accounts for only about 10% of an AI coding agent’s behavior. The rest comes from the harness, context, and configuration around it, which changes where teams should invest their engineering effort.

Google’s latest whitepaper, “The New SDLC With Vibe Coding,” argues that the most significant shift in software engineering isn’t a new language or framework but a move towards expressing intent and trusting machines to generate working software. The paper emphasizes that the model itself constitutes only about 10% of the system’s behavior, with the rest driven by how it is configured and controlled.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that as of early 2026, 85% of professional developers use AI coding agents regularly, with 51% using them daily, and that roughly 41% of all new code is AI-generated. Its core claim is that most of an agent’s behavior comes not from the model but from the harness: the prompts, tools, rules, and observability layers that surround it.

At a glance
When: published March 2026
The development: Google’s new whitepaper highlights that the core of effective AI development lies in harness and context engineering, not just the AI model size, marking a shift in SDLC strategies.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified
Vibe Coding
Casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted
Detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering
Formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
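One way to read the line above is as a release gate that passes only when both layers pass. The sketch below is an illustration, not something taken from the whitepaper: the function names, the sample checks, and the 0.8 eval threshold are all assumptions.

```python
from typing import Callable, Sequence


def run_tests(tests: Sequence[Callable[[], bool]]) -> bool:
    """Deterministic layer: every test must pass."""
    return all(test() for test in tests)


def run_evals(scores: Sequence[float], threshold: float) -> bool:
    """Non-deterministic layer: the mean eval score must meet a threshold."""
    if not scores:
        return False  # no evals means nothing was verified
    return sum(scores) / len(scores) >= threshold


def ci_gate(tests, eval_scores, threshold: float = 0.8) -> bool:
    """Ship only when both the tests and the evals pass."""
    return run_tests(tests) and run_evals(eval_scores, threshold)


# Hypothetical inputs for illustration only.
tests = [lambda: 2 + 2 == 4, lambda: "abc".upper() == "ABC"]
print(ci_gate(tests, [0.9, 0.85, 0.7]))  # True: tests pass, mean score is about 0.82
print(ci_gate(tests, []))                # False: tests pass but no evals ran
```

Treating an empty eval set as a failure is a deliberate choice here: passing tests alone would otherwise let unverified behavior through the gate.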
The idea worth building your strategy around
Agent = Model + Harness
MODEL: ~10%
HARNESS: ~90% (prompts · tools · context · hooks · sandboxes · observability)
~90% IS YOUR SURFACE AREA, NOT THE PROVIDER’S
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
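The "Agent = Model + Harness" framing can be sketched as a small composition. Everything here is hypothetical: the `Harness` and `Agent` classes, their fields, and the stand-in `echo_model` are invented for illustration and do not correspond to any real SDK. The point is only that the same model sees a different prompt depending on the harness wrapped around it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Model = Callable[[str], str]  # hypothetical: takes a prompt, returns text


@dataclass
class Harness:
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    context: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)  # minimal observability

    def build_prompt(self, task: str) -> str:
        """Assemble everything the model will see for this task."""
        parts = [self.system_prompt, *self.rules, *self.context]
        if self.tools:
            parts.append("Tools: " + ", ".join(sorted(self.tools)))
        parts.append(f"Task: {task}")
        return "\n".join(parts)


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        prompt = self.harness.build_prompt(task)
        self.harness.log.append(prompt)  # record what was actually sent
        return self.model(prompt)


def echo_model(prompt: str) -> str:
    """Stand-in model: reports how many prompt lines it received."""
    return f"saw {len(prompt.splitlines())} lines"


bare = Agent(echo_model, Harness(system_prompt="You are a coding agent."))
tuned = Agent(echo_model, Harness(
    system_prompt="You are a coding agent.",
    rules=["Run the test suite before finishing."],
    context=["Repo uses pytest.", "Target Python 3.11."],
))
print(bare.run("fix the bug"))   # saw 2 lines
print(tuned.run("fix the bug"))  # saw 5 lines
```

Both agents share one model; only the harness differs, which is the part a team can own and change.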
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding
Low CapEx · High OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering
High CapEx · Low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
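The CapEx versus OpEx trade-off reduces to a break-even calculation. The sketch below uses made-up cost units chosen only to show the shape of the curve; none of the numbers come from the paper. It assumes total cost is a fixed upfront amount plus a constant cost per feature.

```python
import math


def total_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Upfront cost plus a constant running cost per shipped feature."""
    return capex + opex_per_feature * features


def break_even_features(capex_a: float, opex_a: float,
                        capex_b: float, opex_b: float) -> int:
    """Smallest feature count at which option B costs no more than option A.

    Assumes B pays more upfront and less per feature than A.
    """
    if opex_a <= opex_b:
        raise ValueError("B never catches up unless its per-feature cost is lower")
    return max(0, math.ceil((capex_b - capex_a) / (opex_a - opex_b)))


# Hypothetical cost units as (capex, opex_per_feature); not figures from the paper.
VIBE = (1.0, 5.0)
AGENTIC = (20.0, 1.0)

n = break_even_features(*VIBE, *AGENTIC)
print(n)                                              # 5
print(total_cost(*VIBE, n), total_cost(*AGENTIC, n))  # 26.0 25.0
```

With these assumed numbers the structured approach is more expensive for the first four features and cheaper from the fifth onward; real break-even points depend entirely on a team's own costs.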
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

The Shift Toward Configuration Over Model Size

If the claim holds, strategy shifts from acquiring larger or more advanced models to investing in the harness: the configuration, tooling, and context management that shape agent behavior. Organizations may get better results from improving the system around the model than from chasing each new model release. Because the paper attributes most agent failures to configuration, this focus also bears on cost, security, and operational efficiency.

Background of AI Development and Evolving Practices

Historically, AI development emphasized model size and training data as primary drivers of performance. However, recent trends show a shift toward system integration and configuration management. The whitepaper builds on earlier discussions about vibe coding and agentic engineering, emphasizing a spectrum of AI workflows from quick prompts to fully structured, verified systems. This evolution reflects a broader understanding that effective AI deployment depends more on system design than on the raw capabilities of the underlying models.

“The model is only 10% of what determines behavior; the harness is 90%. The behavior you experience is dominated by scaffolding you can build, own, and improve.”

— Addy Osmani

Unresolved Questions About Implementation and Impact

While the whitepaper emphasizes the importance of harness and context, it does not specify how organizations should best structure these components at scale or how quickly these practices can be adopted across different industries. The precise economic benefits and security improvements from shifting focus remain to be empirically validated in diverse real-world settings.

Next Steps for AI Teams and Developers

Organizations should evaluate their current AI workflows with an eye to system architecture and configuration management, and develop working practices for harness design, context engineering, and testing. Case studies will be needed to show how these shifts affect cost, security, and performance over time before they can guide industry-wide adoption.
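As a concrete starting point for context engineering, a team might rank candidate context by relevance and keep only what fits a token budget. This is a minimal sketch under stated assumptions: the relevance scores are supplied by hand, `estimate_tokens` is a crude word count rather than a real tokenizer, and the greedy strategy is one simple option among many.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def pack_context(snippets: List[Tuple[float, str]], budget: int) -> List[str]:
    """Greedily keep the highest-relevance snippets that fit within the budget."""
    chosen: List[str] = []
    used = 0
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen


# Hypothetical (relevance, text) pairs for illustration only.
snippets = [
    (0.9, "Tests live in tests/ and run with pytest"),
    (0.2, "The office coffee machine is on floor two"),
    (0.7, "Public API changes need a changelog entry"),
]
print(pack_context(snippets, budget=15))
# ['Tests live in tests/ and run with pytest', 'Public API changes need a changelog entry']
```

The low-relevance snippet is dropped because the two useful ones already fill the budget, which is the "noisy context" problem in miniature.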

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper explains that the majority of an AI system’s behavior is influenced by how the model is integrated, configured, and controlled through prompts, tools, and rules, which constitute the harness.

How does this shift affect AI development costs?

Harness and context engineering require a higher initial investment in system design, but they can lower ongoing operational costs and improve security, because configuration failures are a primary source of errors and vulnerabilities.
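Model routing, one of the cost levers the paper names, can be sketched as a small rule that sends easy work to a cheaper tier. The tier names, prices, keyword list, and file-count cutoff below are all hypothetical; a production router would more likely use a classifier or historical success rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical prices, not real vendor rates


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 2.00)

# Assumed signals that a task needs the stronger tier.
HARD_KEYWORDS = ("refactor", "architecture", "security", "migration")


def route(task: str, files_touched: int) -> ModelTier:
    """Send a task to the strong tier only if it looks hard."""
    text = task.lower()
    looks_hard = files_touched > 3 or any(k in text for k in HARD_KEYWORDS)
    return STRONG if looks_hard else CHEAP


print(route("rename a variable", 1).name)         # small-model
print(route("Refactor the auth module", 2).name)  # large-model
print(route("update docs", 12).name)              # large-model
```

The rule itself is part of the harness: it can be tuned, logged, and tested without changing either model.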

What does this mean for choosing AI models?

Model choice remains important, but the whitepaper emphasizes that the real value and performance come from how the model is embedded within a well-structured system, not just the model’s raw capabilities.

Is this approach applicable to all AI applications?

While the principles are broadly relevant, the extent of their impact depends on the specific use case, system complexity, and organizational capacity to implement advanced harness and context management.

What should organizations do now?

Start assessing existing AI workflows, invest in system architecture, and develop expertise in harness and context engineering to optimize AI performance and cost-efficiency.

Source: ThorstenMeyerAI.com
