AI Systems · Apr 2026 · 8 min

Scoping AI systems the way operators scope everything else

Most AI system scopes fail because they start from the model. We scope them from the workflow, then the model. Here's the template we use.

Every AI build we've shipped that mattered started the same way: a 90-minute call where nobody mentioned a model.

We talked about the workflow. Who touches the work today. What they spend their time on. What breaks when volume doubles. What the operator wishes could happen automatically but hasn't figured out how.

The model comes later. Sometimes it's GPT-5. Sometimes it's a Whisper + Claude pipeline. Sometimes it's a dumb regex and a cron job. The workflow tells you which. If you start with the model, you end up building something that's technically impressive and operationally useless.

Our scoping template has four sections: operator intent, workflow map, failure modes, and success state. Every AI scope we quote goes through this frame before anyone opens a laptop.

Operator intent: what does the operator actually want to happen, in plain language, without mentioning AI. 'I want every lead called within five minutes of submission.' 'I want my receptionist not to pick up the phone unless she has to.' 'I want my ops lead to stop doing the weekly report.' If you can't write this sentence, you don't have a project yet.

Workflow map: draw the current workflow as a sequence of human handoffs. Every arrow is a place where work transfers hands, context is lost, or time is spent. Circle the arrows that take the longest. AI goes on the arrows, not the boxes.

Failure modes: list five ways the AI version fails. Not hypothetically — the actual ways real operators have been burned. 'What happens when the agent gets a question it doesn't know?' 'What happens when the model is down?' 'What happens when the data it was trained on is six months stale?' The scope has to account for each.

Success state: describe the workflow after the build ships. Specifically. Quantitatively if possible. 'Ops lead does 2 hours of review per week instead of 10.' 'Voice agent handles 60% of calls without handoff.' 'Response time drops from 45 minutes to 2 minutes.' This is the thing you benchmark against 30 days after launch.
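The four sections above can be sketched as a simple data structure. This is a hypothetical illustration, not the template itself: the field names and the completeness check are assumptions layered on top of the article's four sections.

```python
from dataclasses import dataclass

# Hypothetical sketch of the four-section scoping template as a data
# structure. Field names and the is_complete() check are illustrative
# assumptions, not part of the original template.

@dataclass
class AIScope:
    operator_intent: str          # plain-language goal, no mention of AI
    workflow_arrows: list[str]    # handoffs in the current workflow (AI goes here)
    failure_modes: list[str]      # concrete ways the AI version breaks
    success_state: str            # quantitative post-launch benchmark

    def is_complete(self) -> bool:
        """A scope is ready to quote only when every section is filled in."""
        return bool(
            self.operator_intent
            and self.workflow_arrows
            and len(self.failure_modes) >= 5  # the template asks for five
            and self.success_state
        )

scope = AIScope(
    operator_intent="Every lead called within five minutes of submission.",
    workflow_arrows=["form submission -> sales inbox", "sales inbox -> first call"],
    failure_modes=[
        "Agent gets a question it doesn't know",
        "Model provider is down",
        "Training data is six months stale",
        "Lead volume doubles overnight",
        "Operator can't tell why a call was escalated",
    ],
    success_state="Response time drops from 45 minutes to 2 minutes.",
)
print(scope.is_complete())  # → True
```

The point of the structure is the same as the point of the template: if any section is empty, you don't have a project yet, and the benchmark in `success_state` is what you check against 30 days after launch.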

Nine times out of ten, this process surfaces that the real project is smaller than the operator thought. The tenth time, it reveals that the project is much bigger — and the operator should be glad they scoped it before committing six figures to the wrong thing.

— Joshua Black / Michai Media