Engineering Leadership

Running an AI-Native Engineering Org

What changes when agentic coding becomes the default. A walkthrough of the new processes, the killed processes, and the metrics that actually matter — based on how Anthropic's Claude Code team operates.

1
Just-in-Time Planning

The six-month roadmap is mostly dead. When prototypes can be built in an afternoon, the cost of writing a design doc that imagines what users want often exceeds the cost of just building it and putting it in front of someone.

The new ritual: just-in-time decision making. Plans shift from design docs to discussions inside PRs and prototypes.

From Fiona Fung: "Our planning ritual shifted away from design docs toward discussions in PRs or prototypes."
2
Context Gathering — Claude First

The default move, "go find the person who wrote it," is no longer optimal. The new norm: ask Claude a specific question first, and escalate to a human only when the model's answer reaches its limit.

This isn't just a speed win — it reframes what humans should be asked. Stop interrupting senior engineers with questions Claude can answer from the repo. Save their attention for context Claude doesn't have.

The deeper question: "What do you actually need to know?" Many questions humans get asked are really just requests for synthesis — and synthesis is exactly what Claude is good at.
3
Code Review — Recalibrating Human Judgment

Code review is one of the most expensive engineering rituals. Claude is now a first-pass reviewer that handles the mechanical work — style, linting, common bug patterns, draft PR feedback — leaving humans to focus on the parts where their judgment is irreplaceable.

The rule of thumb: if a review comment could have been generated from the diff alone, Claude should generate it. If it requires understanding people, product, or risk, that's a human's job.
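
A minimal sketch of how that rule of thumb could be encoded as a routing check. The label names, the `ReviewConcern` type, and the `route` function are hypothetical and chosen for illustration; they are not described in the source.

```python
from dataclasses import dataclass

# Hypothetical labels for what a review comment depends on (assumptions).
DIFF_ONLY = {"style", "lint", "common_bug_pattern"}
NEEDS_HUMAN = {"people", "product", "risk"}


@dataclass(frozen=True)
class ReviewConcern:
    summary: str
    depends_on: frozenset[str]  # context needed to raise this concern


def route(concern: ReviewConcern) -> str:
    """Return "claude" for diff-only concerns, "human" otherwise."""
    # Anything touching people, product, or risk goes to a human.
    if concern.depends_on & NEEDS_HUMAN:
        return "human"
    # Concerns derivable from the diff alone go to Claude.
    if concern.depends_on <= DIFF_ONLY:
        return "claude"
    # Unrecognized context: default to a human (a conservative assumption).
    return "human"


if __name__ == "__main__":
    print(route(ReviewConcern("unused import", frozenset({"lint"}))))
    print(route(ReviewConcern("risky rollout order", frozenset({"risk"}))))
```

Defaulting unknown cases to a human is one possible design choice; a team could just as reasonably send them to Claude first and escalate afterward.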
4
Team Composition — Blurred Roles

When generating code is no longer the bottleneck, the value of pure "coders" drops and the value of creative builders rises. Roles overlap; the silos thin out.

PMs ship code. Engineers ship copy and design. Designers prototype with working data. Hiring shifts: deemphasize raw coding throughput, emphasize product intuition and systems depth.

Hiring shift: "PMs code a lot now, and engineers take on things like design and content." The implication is that you should hire for product intuition and systems depth — coding output alone is no longer scarce.
5
Three Core Team Principles

The mechanical processes change, but they're held in place by three operating principles. Drop any one of them and the rest stop working.

Everyone uses the product daily as their default tool, not as a demo and not as a side project.
  • Every team member, including managers, drives Claude Code through real work every day.
  • "100% Claude-assisted commits" is treated as the floor, not the goal.
  • Pain points surface in your own backlog before they reach a customer.
"Relentless dogfooding" is the cheapest QA signal an AI org has.

Managers start as ICs and stay close to the work. Flatness keeps decisions fast and managers calibrated to reality.
  • Managers begin as individual contributors before taking on people leadership.
  • They keep enough hands-on time to feel the actual engineering experience.
  • Information flows in fewer hops; fewer translations mean less signal loss.
"Managers begin as individual contributors," so they understand the experience they're managing.

Every team member has explicit permission to question and kill processes that no longer serve their purpose.
  • Status meetings, ritualized planning, gatekept reviews: anything is on the table.
  • The test: "Is this still serving its purpose? Can we automate it?"
  • If the answer is no twice in a row, the process dies, with no committee needed.
"Identify your noisiest workflow and ask whether it still earns its cost."
6
Metrics & Where Bottlenecks Move

Removing the typing bottleneck doesn't remove bottlenecks — it moves them. The new metrics watch where pressure shows up: how fast new hires ship, how long PRs take to merge, and how much of the work has Claude in the loop.
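
As a rough illustration, the three measurements named above could be computed from merged-PR records along these lines. The `PullRequest` record, its field names, and the choice of the median are assumptions made for this sketch, and Claude assistance is tracked per PR here as a simplification of per-commit tracking.

```python
from dataclasses import dataclass
from datetime import date, datetime
from statistics import median


@dataclass(frozen=True)
class PullRequest:
    opened_at: datetime
    merged_at: datetime
    claude_assisted: bool  # hypothetical flag; how it is recorded is an assumption


def median_cycle_hours(prs: list[PullRequest]) -> float:
    """Median time from PR open to merge, in hours."""
    return median((pr.merged_at - pr.opened_at).total_seconds() / 3600 for pr in prs)


def claude_assisted_share(prs: list[PullRequest]) -> float:
    """Fraction of merged PRs that had Claude in the loop."""
    return sum(pr.claude_assisted for pr in prs) / len(prs)


def onboarding_ramp_days(start: date, first_merge: date) -> int:
    """Days from a new hire's start date to their first merged change."""
    return (first_merge - start).days


if __name__ == "__main__":
    prs = [
        PullRequest(datetime(2025, 1, 1, 9), datetime(2025, 1, 1, 13), True),
        PullRequest(datetime(2025, 1, 2, 9), datetime(2025, 1, 2, 19), False),
        PullRequest(datetime(2025, 1, 3, 9), datetime(2025, 1, 3, 11), True),
    ]
    print(median_cycle_hours(prs), claude_assisted_share(prs))
```

The median is used rather than the mean so that a single long-running PR does not dominate the cycle-time figure; both functions that take a list expect it to be non-empty.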

Figure: Where the new bottlenecks emerge. Relative pressure on each metric in an AI-native org: onboarding ramp, PR cycle time, CI / build infra, Claude-assisted commits.
Read it this way: Onboarding and Claude usage are healthy. PR cycle time creeps up as code volume rises. Build infrastructure is where the real strain lands.
Learning Reference · Running an AI-Native Engineering Org — Fiona Fung, Anthropic