Every morning at 9:15, your two-pizza team gathers for standup. Eight people arranged in a loose circle—three engineers, two QA, a designer, a product manager, and a tech lead. One by one, they share what they did yesterday, what they're doing today, and what's blocking them. The ritual takes twelve minutes, occasionally stretching to fifteen when someone gets into the weeds on a tricky bug.
This scene plays out in software companies around the world, largely unchanged since Jeff Bezos popularized the "two-pizza team" concept at Amazon in the early 2000s. The principle is elegant: teams small enough to be fed by two pizzas (roughly 5-8 people) stay agile, communicate effectively, and avoid the coordination overhead that bogs down larger groups. The research backs it up: Harvard studies of teams found that individual satisfaction and productivity decline as team size grows, echoing the century-old Ringelmann Effect, in which individual effort shrinks as a group expands. Martin Fowler codified it. Countless startups live by it.
But here's what's changing: when AI writes most of the code, the bottleneck shifts. And when the bottleneck shifts, the shape of the team has to shift with it.
We've constructed an entire infrastructure of processes, rituals, and feedback loops around a fundamental constraint: writing code takes time.
Two-week sprints. Planning poker. Story points. Code reviews requiring two human approvers. Daily standups. Retrospectives. These aren't arbitrary ceremonies—they emerged because coordinating human developers on complex software requires structure, and because the pace of development naturally fell into rhythms that humans could sustain and customers could absorb.
John Cutler wrote a fascinating piece years ago about one-day sprints: the idea that a team could plan, build, and demo something valuable in a single eight-hour cycle. "Have The Conversation. Do The Thing. Review. Go Home." Most practitioners dismissed it as impractical, a thought experiment at best. The overhead of the Scrum events alone would eat up half the day. When would you groom the backlog? When would QA actually test anything?
The conventional wisdom was right—for the constraints we had. When writing code is the rate-limiting step, one-day sprints don't make sense. The cost of context-switching and coordination exceeds the value of the shorter feedback loop.
But what happens when writing code stops being the rate-limiting step?
When AI generates the bulk of your code, the production bottleneck evaporates almost overnight. What remains are the things that were always constrained but hidden behind the more obvious limit of developer output:
**How fast can you define the problem?** Turning a vague customer need into a precise specification that AI can execute is surprisingly difficult. It requires understanding both the domain and the technical constraints. Most feature requests arrive as "make it do X better" or "users are confused by Y." Translating that into actionable requirements is a human skill that doesn't speed up just because code generation does.
**How fast can customers absorb change?** There's an immutable psychological reality: humans take time to learn new things. Research on habit formation shows it takes an average of 66 days—not 21, as the myth suggests—for a new behavior to become automatic, with individual variation ranging from 18 to 254 days. You can ship a new feature in an afternoon, but your users won't truly adopt it for weeks or months. Push changes faster than humans can adapt, and you create confusion, frustration, and churn.
**How fast can you get meaningful feedback?** Atlassian discovered their manual process for categorizing customer feedback took up to six weeks—by the time insights were ready, product decisions had already been made. Even with AI-powered analysis cutting that to hours, there's a floor: customers need time to actually *use* the feature before they can tell you whether it works. You can't A/B test something that shipped thirty minutes ago.
**How fast can the organization absorb change?** Change management research has a name for what happens when you push too much change too fast: *change saturation*. It's the tipping point where employees can no longer absorb any more changes. The symptoms are predictable—disengagement, burnout, rising error rates, attrition. One study found that organizations experiencing change saturation see productivity losses, quality degradation, and significantly higher turnover. You can technically ship faster, but your people can't keep up.
These aren't bottlenecks that disappear with better technology. They're built into human cognition, organizational dynamics, and the physics of feedback loops. They're immutable—or close enough to immutable that they might as well be.
Not everything is fixed, though. Some constraints that feel immutable are actually artifacts of the old system.
Code review bottlenecks. Right now, at Allstacks and many other companies, code requires two human reviewers before it can be merged. This made sense when the alternative was no review at all—and still makes sense for AI-generated code, where studies show 1.4-1.7x more critical issues than human-written code. But the shape of review is changing. AI-powered review tools can provide an initial pass, catching obvious issues before human reviewers spend time on them. The human role shifts from "check for syntax errors" to "evaluate whether this actually solves the problem it claims to solve."
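What that shift could look like as a merge policy, sketched loosely in Python: an AI pass filters the obvious issues first, but human approval is still the gate. The `PullRequest` class, `run_ai_review`, and the two-approval rule here are illustrative stand-ins, not any particular tool's API.

```python
from dataclasses import dataclass, field

@dataclass
class PullRequest:
    title: str
    ai_findings: list = field(default_factory=list)  # issues flagged by the AI pass
    human_approvals: int = 0

def run_ai_review(pr: PullRequest) -> list:
    # Placeholder for an AI review tool: returns the issues it flagged.
    return pr.ai_findings

def can_merge(pr: PullRequest, required_approvals: int = 2) -> bool:
    """The AI pass must come back clean AND humans must still sign off."""
    unresolved = [finding for finding in run_ai_review(pr) if finding]
    return not unresolved and pr.human_approvals >= required_approvals

pr = PullRequest(title="example change", human_approvals=2)
print(can_merge(pr))  # True: clean AI pass plus two human approvals
```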
Sprint ceremony overhead. When development velocity was slow, the overhead of planning and retrospective meetings was a reasonable percentage of total capacity. If sprints genuinely get shorter—whether to weekly or even daily cycles—those ceremonies need to compress or merge. The information they capture is still valuable; the format may not be.
Coordination costs. As team size grows, the number of communication links grows quadratically: with n people there are n(n-1)/2 pairwise links, so a team of 6 has 15 links to manage and a team of 12 has 66. This is why two-pizza teams work. But if AI handles much of the implementation, you need fewer people doing the implementing and more people doing the specifying, validating, and orchestrating. The communication patterns change.
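A quick sanity check of that arithmetic:

```python
# Pairwise communication links in a fully connected team: n * (n - 1) / 2.
def communication_links(team_size: int) -> int:
    return team_size * (team_size - 1) // 2

for n in (6, 8, 12):
    print(f"Team of {n:2d}: {communication_links(n):2d} links")
# Team of  6: 15 links
# Team of  8: 28 links
# Team of 12: 66 links
```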
Specialist handoffs. Traditional teams have specialists—frontend, backend, DevOps, QA—and work passes between them in a relay race. AI doesn't respect these boundaries. An agent that can write frontend, backend, and tests in a single session eliminates entire categories of handoffs. The generalist who can orchestrate AI across the stack becomes more valuable than the specialist who goes deep in one layer.
Here's what I suspect is coming—not as a prediction, but as a direction:
The two-pizza team becomes the one-pizza team. Not because small is fashionable, but because you simply don't need eight people when AI handles the bulk of code generation, test writing, and even initial code review. The core unit becomes something like: one product manager defining problems, one technical lead orchestrating AI and making architectural decisions, one quality advocate ensuring the outputs actually work.
Some developers become fleet managers. Instead of writing code, they manage a fleet of AI agents. "This agent runs continuously, fixing these types of security issues. That agent triages and reproduces customer bugs. Another handles dependency updates." The skill shifts from writing code to specifying what code should do and validating that the output is correct.
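As a purely illustrative sketch, "fleet management" looks less like writing code and more like writing a spec of what each agent does and where a human stays in the loop. The agent names, triggers, and fields below are hypothetical, not any vendor's format.

```python
# Hypothetical fleet definition: the work is specifying and validating,
# not implementing by hand.
AGENT_FLEET = [
    {"name": "security-fixer", "trigger": "continuous",
     "scope": "known vulnerability classes", "requires_human_review": True},
    {"name": "bug-triager", "trigger": "new customer bug report",
     "scope": "reproduce, label severity, attach a failing test", "requires_human_review": True},
    {"name": "dependency-updater", "trigger": "weekly",
     "scope": "minor and patch version bumps with a passing test suite", "requires_human_review": True},
]

for agent in AGENT_FLEET:
    print(f"{agent['name']} (trigger: {agent['trigger']}, "
          f"human review required: {agent['requires_human_review']})")
```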
GitHub's CEO describes this as the "strategist" stage of developer evolution. Developers "no longer write code, having delegated that task to AI agents, but focus on refining prompts and reviewing and validating the generated implementation." The work is still technical, still requires deep understanding—but the hands are on a different wheel.
The grumpy senior engineers have a point. There's a real risk in this transition. AI-generated code contains 1.7x more issues than human-written code on average. A study found that after just five iterations of AI code refinement, critical vulnerabilities increased by 37.6%. The researchers explicitly recommend *requiring human reviews between iterations*—the AI can't be trusted to review its own work.
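Concretely, that recommendation amounts to a human gate inside the refinement loop. A minimal sketch, where `refine_with_ai` and `human_approves` are hypothetical stand-ins for whatever tooling and review process a team actually uses:

```python
# Sketch of "human review between iterations": never let the AI refine its own
# output repeatedly without a person signing off on each intermediate result.
MAX_ITERATIONS = 5

def refine_with_ai(code: str) -> str:
    # Placeholder: an AI pass that proposes a refined version of the code.
    return code

def human_approves(candidate: str) -> bool:
    # Placeholder: a human reviewer accepts or rejects the proposed refinement.
    return True

def refine_with_oversight(code: str) -> str:
    for _ in range(MAX_ITERATIONS):
        candidate = refine_with_ai(code)
        if not human_approves(candidate):
            break          # stop compounding unreviewed changes
        code = candidate   # only approved refinements become the new baseline
    return code
```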
This is why the "army of juniors" metaphor keeps appearing in security research. AI agents can produce large volumes of functional code, but they need senior oversight to ensure that what works is also secure, performant, and maintainable. If you eliminate the seniors, or don't cultivate new ones, you're building on sand.
The cadence changes—but not as much as you'd think. Shorter sprints become possible, but the floor is set by human adaptation. Even if you could ship hourly, your customers can't learn that fast, your support team can't document that fast, and your organization can't absorb that much change. One observer noted "the basic human need of mental reflection/celebration-time between completing a task and starting a new one" as a limit on sprint compression.
The irony is that the technical constraint relaxes while the human constraints become more visible. We've always been limited by how fast people can think, decide, and adapt. We just couldn't see it clearly when writing code was the more obvious limit.
There's a reason code review requires two human approvers. It's not bureaucracy—it's defense in depth. One reviewer might miss something. Two reviewers together catch more. The system assumes humans make mistakes and builds in redundancy.
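A back-of-the-envelope calculation makes the redundancy concrete; the 30% miss rate and the assumption that reviewers miss defects independently are simplifications for illustration, not measurements.

```python
# Idealized "defense in depth" arithmetic.
miss_rate = 0.30                 # assumed chance that one reviewer misses a given defect

one_reviewer = miss_rate         # a single layer of review
two_reviewers = miss_rate ** 2   # both reviewers have to miss it
print(f"One reviewer misses the defect: {one_reviewer:.0%}")   # 30%
print(f"Two reviewers both miss it:     {two_reviewers:.0%}")  # 9%
```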
What happens when AI writes the code and AI does the first review pass? You've removed a layer of defense. The human reviewers are now reviewing AI-reviewed AI-generated code, which means they're partially trusting the AI's judgment about the AI's work.
One security study found that "AI lacks local business logic—models infer code patterns statistically, not semantically." The code looks right but doesn't actually understand the rules of your system. Another found that AI-generated code "often appears clean and functional but hides structural flaws that can grow into systemic security risks."
The problem isn't that AI-generated code is bad. The problem is that it's confidently wrong in ways that are hard to spot. And if you're moving faster—because you can—the errors compound faster too.
Then there's the knowledge pipeline problem. If juniors aren't writing code because AI does it cheaper, and they're not reviewing code because there's too much volume, how do they develop the intuition that makes them valuable seniors? AWS CEO Matt Garman called the trend of not hiring juniors "one of the dumbest things I've ever heard," because in ten years there will be no one who has learned anything.
I don't think teams will shrink to one person managing an army of AI bots. The human elements of software—understanding what to build, why to build it, whether it actually solves the problem—remain stubbornly human.
But I do think teams will get smaller. The two-pizza team won't disappear; it will compress. The roles will blur—less "I'm a frontend engineer" and more "I orchestrate AI to solve customer problems across the stack." The rituals will adapt—faster cycles, different review patterns, new kinds of standups focused less on "what did you code" and more on "what did you validate."
The biggest shift isn't technical. It's recognizing that the constraint was never really the code. It was always the humans—our ability to define problems clearly, to adapt to change, to catch errors, to learn from feedback. AI removes the excuse we used to hide behind. Now we have to confront the real limits.
Some of those limits are immutable. Customers need weeks to adopt new features. Organizations need time to absorb change without saturating. Feedback loops have a floor set by the physics of human attention and behavior.
Some of those limits are changeable. Review patterns can evolve. Coordination overhead can compress. Specialist handoffs can disappear. Sprint ceremonies can adapt.
The teams that thrive will be the ones that figure out which is which—and build their processes around the constraints that actually matter.
The bottleneck isn't the code anymore. The question is: what is it now?
---
Jeremy Freeman is the CTO of Allstacks. He thinks a lot about what engineering teams look like when the easy parts get automated—and what stays difficult.