Is Spec-Driven Development a Step Forward or Back for Product Development?
Spec-driven development looks like a step backward if you think of it as requirements theater. But the better frame is that the spec is becoming a higher-level programming language for human intent.
What is Spec-Driven Development (SDD) and How Does It Fit with Agile?
Spec-driven development looks like a step backward if you think of the spec as a document that sits between business people and developers. In that model, a detailed spec is requirements theater with more Markdown. It invites mini-waterfall behavior: write a lot, hand it off, let the builders build, and review the mess later. I think that is the wrong mental model for what is happening.
The better frame is that the spec is becoming a higher-level programming language. Humans will increasingly maintain intent, constraints, acceptance criteria, examples, context, and learning strategy at the spec layer, while AI agents do more of the lower-level coding work. That does not mean Scrum needs detailed specs in the Product Backlog, refinement, or Sprint Planning. It means the team aligns around goals and context there, then develops the spec during the Sprint as part of doing the work.
What happens to our development lifecycle when we adopt AI for coding?
I’m seeing engineering organizations take their iterative agile lifecycles (think Scrum or SAFe) in a couple of different directions as they adopt AI-assisted, spec-driven development. Most of the time, they either try to force-fit spec-driven activities into their existing Scrum process, or they throw away all of the process and start from scratch.
Case in point: a Scrum community thread recently raised a question I expect many teams are asking in different words. If AI is doing more task decomposition, and specs are detailed enough that they almost describe the code AI will generate, what happens to Product Backlog Refinement, Sprint Planning, and Scrum itself?
The concern is fair. Some teams are seeing refinement become disengaging because one person, sometimes with AI, arrives with a detailed breakdown before the conversation starts. Some teams are spending less time estimating because the work is sliced smaller or because the implementation effort feels less meaningful when AI is in the loop. Some people look at the flow and see docs first, execution second, review later, and they reasonably ask whether we have recreated waterfall with better tools.
I look at the whole spec-driven thing differently.
Spec-driven moves your development lifecycle one level up
In my view, the intent of spec-driven development is not to create a heavier ticket before coding starts. It is that the spec becomes the code layer that humans maintain most of the time. Think about the historical stack: machine code, assembly, C, higher-level languages, low-code, no-code. We moved upward because each new layer let humans express intent at a more useful level of abstraction, while still allowing lower-level work when the situation demanded it. We did not stop caring about lower layers. We just stopped asking most people to live there every day.
AI-assisted software development is pushing us through another layer shift. Much of our “code” will be managed through spec-driven interactions. There will still be lower-level coding. There will still be situations where someone has to inspect the generated code, tune the architecture, write a tricky algorithm, debug a production issue, or work close to the metal. There are still people writing C in the Linux kernel, even though most business software has moved far above that layer. But for a growing share of product work, the human programming surface is becoming the spec.
Coding the spec is not a burden
This is why I do not buy the complaint that writing the spec is just a new administrative burden. It can become one, absolutely. Bad process can turn anything into a burden. But the useful version of spec-driven development is not “write more documentation before the real work.” The useful version is “program the system at the level where human judgment matters most.”
The spec is where we express the goal, intent, context, constraints, trade-offs, acceptance criteria, examples, risks, assumptions, and the signals we will look for. That is not paperwork. That is the work. It is the programming language of human intent.
The important shift is that the spec is not a proxy for requirements being fed to passive builders. It is the place where builders, product people, designers, domain experts, and agents collaborate to shape intent into a working outcome. That is a very different thing from a business analyst writing a detailed requirements document and throwing it over the wall.
When Should We Do Spec-Driven Development? Where does it fit in an Agile/Scrum Lifecycle?
The next mistake is trying to force this new spec layer into classic Scrum artifacts at the wrong altitude.
Do we need to specify before adding something to our backlog?
A Product Backlog does not need to contain detailed specs. A Product Backlog item does not need to include the full spec that an AI agent will use to build. During refinement, and even during Sprint Planning, the useful level is usually the same level good teams already aimed for before AI: what is the goal or intent, what are the rough acceptance criteria, what context matters, what leading indicators would tell us this is worth doing, and what is the next sensible slice?
Should developers expect a ready spec before starting to build something?
The argument around “ready” is decades old at this point. One side argues that the development factory can do twice the work in half the time if you focus on ready inventory. And that is true. But do you really want twice the work? The other side argues that if you care about outcomes, not output, you should be OK with discovery that happens closer to the work, even though it creates some messiness. You might think of it as value trumping flow and waste elimination.
With that in mind, the answer in the spec-driven world depends on what you’re trying to optimize for. If you’re trying to optimize for output, maybe you should focus on ready specs before building. If you’re building in an environment of opportunity and uncertainty, you’re better off allowing the spec to emerge and evolve iteratively.
In this environment, a clearly articulated goal is enough to have a refinement conversation. It is enough to prioritize. It is enough to decide what is worth pulling further.
True, it is not enough to drive an AI coding agent all the way through a meaningful implementation, and that is fine. It was never supposed to be.
One of the healthiest things AI can do for Scrum teams is force them to revisit what refinement is actually for. Refinement is not a ceremony where the team must collectively decompose everything into tasks. It is not where the team has to manufacture detailed estimates. It is not where the entire implementation approach has to be agreed in advance.
Useful refinement creates shared understanding of the business problem, the product intent, the shape of the option, and the reasons this might or might not matter. It should help the Product Owner and Developers inspect whether the item is worth keeping, whether it is small enough to pull soon, whether the acceptance criteria are directionally clear, and whether there is enough context for a team to make progress.
If a team is using refinement to review a giant AI-generated task breakdown, I would not be surprised if engagement drops. Most of that detail can be figured out at code-writing time. Worse, that detail can crowd out the conversation humans should be having: what are we trying to achieve, what might we be wrong about, what are we choosing not to do, and what would make this a bad bet?
The Product Backlog should carry intent and context. The spec-driven loop should elaborate the work at the moment the work is being developed. These are different altitudes. Mixing them creates process mud.
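To make the two altitudes concrete, here is a minimal sketch of how they might look as data. Everything in it is my own illustration rather than a prescribed format: the class names `BacklogItem` and `Spec`, the field names, the `elaborate` method, and the checkout example are all hypothetical. The only point is that the backlog item stays small and intent-shaped, while the spec is created when the item is pulled and then grows as the work teaches you things.

```python
from dataclasses import dataclass, field


@dataclass
class BacklogItem:
    """Intent-level item (hypothetical shape): enough to refine and prioritize."""
    goal: str
    rough_acceptance_criteria: list[str] = field(default_factory=list)
    context: list[str] = field(default_factory=list)
    leading_indicators: list[str] = field(default_factory=list)


@dataclass
class Spec:
    """Execution-level spec (hypothetical shape): started when the item is pulled."""
    item: BacklogItem
    constraints: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    revision: int = 0

    def elaborate(self, section: str, entry: str) -> None:
        """Add one thing learned while building and bump the revision counter."""
        if section not in ("constraints", "examples", "assumptions"):
            raise ValueError(f"unknown spec section: {section}")
        getattr(self, section).append(entry)
        self.revision += 1


# Illustrative data only: the backlog item carries intent, nothing more.
item = BacklogItem(
    goal="Reduce checkout abandonment for returning customers",
    rough_acceptance_criteria=["Returning customers can reuse saved payment details"],
    leading_indicators=["Checkout completion rate for returning customers"],
)

# The spec starts empty when the item is pulled and evolves during the Sprint.
spec = Spec(item=item)
spec.elaborate("constraints", "Must not store raw card numbers")
spec.elaborate("examples", "A customer with an expired saved card is prompted to update it")
print(spec.revision)  # 2
```

Notice that nothing in `BacklogItem` would be enough to drive an agent through an implementation, and nothing in `Spec` needs to exist before the work starts.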
So when DO we /Specify?
The detailed spec belongs closer to execution. Once the work starts, a human team member, or a pair, or a small swarm, pulls a goal (e.g. from the Sprint Backlog or a kanban ready queue) and specifies, plans, and builds in collaboration with AI, using the spec as the evolving “code”. That is when the real spec-driven cycle goes into full gear. It may involve one person and one agent. It may involve several specialized agents. It may involve two people pairing with an agent. It may involve the whole team swarming a larger item because the risk or importance warrants it.
This is the atomic work that happens day in and day out. You don’t need a process like Scrum or Kanban to delve too deeply into these steps, just as you didn’t micromanage each test case in a TDD session, or the microtasks an engineer works through when figuring out how to build something.
What should we do in our Sprint Planning once we move to Spec-Driven Development?
Is Sprint Planning where we break down all of the specs into tasks and estimate them? NO.
The goal of Sprint Planning is not to finish all thinking before the Sprint starts. It is to decide why this Sprint is valuable, what can be done, and how the selected work will initially be approached. The “how” can be quite lightweight, especially when the team has the ability to elaborate specs rapidly during execution.
So, can we cancel our Sprint Planning sessions when shifting to spec-driven development?
No. A lightweight plan does not make planning irrelevant. If anything, AI makes focus more important. AI does not unlock the dream of doing everything at once. It makes it easier to start too many things, create too many branches of work, generate too much code, and flood the review and adoption parts of the system. A Sprint can still be very useful as a focus mechanism: this is what we are aiming at now, and that means we are not doing all the other plausible things yet.
The Sprint Backlog can contain a couple of bigger outcome-oriented items, several smaller ones, or one large bet the team intentionally swarms around. The team can decide in Sprint Planning how much collaboration each one deserves. Maybe one person takes the lead and brings others in at decision points. Maybe two people pair with an agent because the architecture is sensitive. Maybe the team divides and conquers after agreeing on boundaries. Maybe they swarm because the risk is high or the learning matters.
That is team design work. Scrum gives the container. It does not need to micromanage the spec cycle inside the container.
Inspecting and Adapting Continuously
When a human-agent pair is moving quickly, important learning can show up mid-Sprint. The agent discovers that the existing architecture does not support the intended direction. The spec exposes a hidden assumption about the workflow. A prototype makes the original idea look less valuable than a nearby option. A technical constraint suggests a different slice. Customer feedback changes the bet.
Waiting a week or two to discuss that can be silly. This is where the Daily Scrum can become useful again, not as a status meeting, but as a steering point. What did we learn? Is our plan for the Sprint Goal still coherent? Do we need a product decision, an architecture conversation, a customer touchpoint, or a team swarm? Is the work moving toward the intent, or are we just generating output?
AI makes this more important because things move faster. A bad direction can produce a lot of plausible code quickly. A good learning loop can redirect quickly, too. The Daily Scrum is one of the places where the team can keep the learning loop visible without turning every spec into a whole-team meeting.
Now, does that have to be a meeting? Do we have to wait until that specific time in the day to talk through stuff? Of course not. Some of the most effective teams I’m seeing are leveraging co-location, continuous availability, and osmosis, while doing some of the most awesome AI-native coding I’ve seen. But if that’s not your reality, a structured opportunity to check in with each other, realign, and respond to emerging learning can still be useful.
The Sprint Review matters more, not less - But it focuses on Goals and Outcomes, not details of the work done
At the end of the Sprint, the team needs to step back. That is no less important because AI helped ship more. It is more important because AI can create acceleration whiplash: the inner loop of building accelerates faster than the outer loop of deciding, adopting, measuring, and learning.
The Sprint Review is when the team and stakeholders review what was shipped, what was learned, and what might make sense next. Did anyone use it? Did it change behavior? Did it move the leading indicator we cared about? Did it reduce the constraint we were aiming at? Are we closer to the Product Goal, or did we just produce impressive activity?
This is where the “spec is the new programming language” idea has to stay connected to value. A beautiful spec that generates working software is still not the finish line. The finish line is learning whether the product, workflow, customer, or business is better off. Scrum’s review cadence is a useful forcing function for that conversation, especially when AI-assisted teams can generate more work than the organization can absorb.
In my view, shifting AI activity toward impact means going one step beyond spec-driven development, toward goal-driven development.
Continuously improving everything - How we work, How we organize, How we interact with AI, Everything
The Retrospective also remains highly relevant, but the questions have changed. What did we learn about our collaboration model? Where did AI help? Where did it create noise? Where did we over-specify? Where did we under-specify? Which agent instructions worked? Which review patterns caught problems early? Where did handoffs slow us down? Where did we create too much work in process? What should we retire, try, or tighten?
The bottleneck may no longer be coding. It might be review, product judgment, customer access, data quality, adoption, release safety, or decision-making. The Retrospective is when the team can inspect the entire operating system rather than only ask whether the Sprint felt good.
If the team discovers that specs are becoming too detailed too early, change that. If one person plus AI is doing all the breakdown and the rest of the team is losing shared understanding, change that. If every backlog item is turning into a giant spec, change that. If AI is helping the team move faster, but the organization cannot use what ships, change that.
Who should be writing these Specs? What’s the role of Product ownership in a spec-driven development world?
Specs are like code. Team members should be responsible for specifying, planning, and building with their AI. I’ve seen teams that expect their product managers/owners to write specs, but to me that makes very little sense. It’s like asking product professionals to write acceptance tests or detailed user stories. It leads to disempowered team members, product professionals stuck in tactical work, and a vacuum in strategic product leadership.
I see product roles as focused on overall product direction: where we play, how we win, what outcomes matter, what tradeoffs we are making, and where the next best bet might be. That person was never supposed to be a proxy, a firewall, or a requirements vending machine.
The builders on the team should go as close to customers and users as possible. They should understand the business context, the product strategy, the constraints, and the outcome they are trying to create. They are not fed requirements. They are not fed specs. They are aligned to an intent and surrounded by context. So are their agents. (Btw, this is where concepts like Forward-Deployed Engineering come into play: moving engineers to the front lines rather than hiding and protecting them behind the closed doors of the typical way organizations manage their product lifecycle.)
Do we need a new framework to replace Scrum/SAFe? Do we need a special version for AI-native development lifecycles?
As a member of the inner circles of both the Scrum and SAFe communities, I’m seeing a raging debate about how to adjust or rethink these frameworks for the AI-native world. I don’t know where each community will go.
Here’s my take. I do not think teams need a special version of Scrum or SAFe. What they need is to strip these frameworks back to their kernel. Empiricism. Alignment to Outcomes. Focus. Transparency. Inspection. Adaptation. Commitment to goals. Self-management.
Then look at the complementary practices around that kernel. Estimation? Optional. Story points? Optional. User stories? Optional. Detailed task breakdown in Sprint Planning? Optional. Whole-team refinement of implementation detail? Optional. If those practices help, keep them. If AI-assisted spec development has made them low-value or actively harmful, retire or reinvent them.
Kanban can help here, too. Not as Scrum’s enemy or replacement (although I’m seeing more and more teams thinking that way), but as a way to see the flow. Visualize where specs are emerging, where agent work is in progress, where review is backing up, where adoption is stalled, and where validation is missing. Limit work in process. Manage flow. Make blocked learning visible. Scrum and Kanban have always been more compatible than the framework debates made them sound. AI is doing for Scrum what Kanban did for many Scrum practitioners a decade ago. It is helping people see the essence by making the accumulated baggage more obvious.
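As a rough illustration of what limiting work in process across that flow could look like, here is a small sketch. The stage names are my shorthand for the places listed above, and the limits, item names, and the `over_limit` helper are all made up for the example. Treat it as a thinking aid, not as a tool recommendation.

```python
from collections import Counter

# Hypothetical stages based on the flow described above, with made-up WIP limits.
WIP_LIMITS = {
    "spec_emerging": 3,
    "agent_in_progress": 4,
    "review": 2,
    "adoption": 2,
    "validation": 2,
}


def over_limit(board: dict[str, str], limits: dict[str, int]) -> dict[str, int]:
    """Return {stage: excess} for every stage holding more items than its limit."""
    counts = Counter(board.values())  # missing stages count as 0
    return {
        stage: counts[stage] - limit
        for stage, limit in limits.items()
        if counts[stage] > limit
    }


# Illustrative board: item name -> current stage.
board = {
    "saved-cards": "review",
    "guest-checkout": "review",
    "promo-codes": "review",
    "address-lookup": "agent_in_progress",
    "receipt-email": "spec_emerging",
}

print(over_limit(board, WIP_LIMITS))  # {'review': 1}
```

In this made-up board the agents are not the constraint; review is. That is the kind of signal worth making visible before anyone starts another spec.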
The opportunity in unreasonable agility
The best version of this future is not one where AI writes all the code and humans review a pile of generated output with a grim sense of duty. It is also not one where humans write giant specs and agents mechanically execute them.
The best version is a team with agency. Humans and their agents aligned around outcomes. Backlogs that carry intent instead of fake certainty. Specs that are developed at the right altitude, at the right time, by the people closest to the work, assisted by their agents. Reviews that inspect whether real value was realized. Retrospectives that improve the human-agent operating model. Product leadership that provides strategic context instead of feeding requirements.
That team can move very fast without pretending certainty. It can use specs as a programming language for intent, not as a wall between thinking and doing. It can let AI accelerate the lower layers while humans get better at the higher ones: choosing, framing, steering, sensing, and learning. That is not abandoning agility. That is what agility looks like when the programming surface moves up a layer.
The spec is not the paperwork before the work. Increasingly, the spec is the work humans maintain so people and agents can build, learn, and steer together.
Yuval Yeret helps product and tech leaders move from agile theater to evidence-informed delivery.