Anthropic's advisor strategy: delete your custom model routing code

For the last few months we have been trying to answer what feels like a really basic question: how do you get frontier-model reasoning only when you need it, without paying Opus prices for every single agent turn?

We tried a bunch of things. A custom router in front of the model to classify “is this hard.” An orchestrator-worker setup where a big model plans and smaller ones execute. Manual planning steps before each run.
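To give a feel for the first attempt, here is a minimal sketch of an "is this hard" router. It is not our production code: the tier labels, the keyword list, and the length threshold are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical tier labels, not real model identifiers.
SMALL_MODEL = "small-executor"
BIG_MODEL = "frontier-model"

# Hypothetical hand-tuned heuristics.
HARD_HINTS = ("refactor", "race condition", "migrate", "design")
MAX_EASY_CHARS = 400


@dataclass
class Route:
    model: str
    reason: str


def route(task: str) -> Route:
    """Guess whether a task is hard before any model has looked at it."""
    lowered = task.lower()
    for hint in HARD_HINTS:
        if hint in lowered:
            return Route(BIG_MODEL, f"keyword: {hint}")
    if len(task) > MAX_EASY_CHARS:
        return Route(BIG_MODEL, "long prompt")
    return Route(SMALL_MODEL, "default")
```

The weakness of anything shaped like this is that it classifies the task up front, while the hard part often only shows up several turns into the run.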

All of it kind of worked, but none of it felt right. Lots of glue code, lots of edge cases, and every time we onboarded someone new to the codebase they would ask “wait, why is this so complicated?” Honestly, fair question.

What Anthropic shipped

Anthropic basically just shipped the thing we were trying to build. It is called the advisor strategy, and the summary is this: Sonnet or Haiku runs your agent loop normally, and when it hits a decision it cannot reasonably make on its own, it invokes Opus as an “advisor.” Opus reads the shared context, writes a short plan (usually 400-700 tokens), and the executor picks back up. All of this happens inside a single API request.

The executor decides when to escalate. You do not route anything yourself.
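As a mental model only, the control flow can be sketched as a toy local simulation. Nothing below calls a real model or the real API (where the escalation happens inside the request, not in your code); `run_executor`, `is_stuck`, and `stub_advisor` are hypothetical names invented for this illustration.

```python
from typing import Callable, List

# Local stand-in type: an advisor reads the shared context and returns a plan.
Advisor = Callable[[List[str]], str]


def run_executor(
    steps: List[str],
    is_stuck: Callable[[str], bool],
    advisor: Advisor,
) -> List[str]:
    """Walk through steps; when the executor judges itself stuck, ask the advisor."""
    context: List[str] = []
    for step in steps:
        if is_stuck(step):
            # The advisor sees everything so far plus the step in question.
            plan = advisor(context + [step])
            context.append(f"advisor plan: {plan}")
        context.append(f"executed: {step}")
    return context


def stub_advisor(shared_context: List[str]) -> str:
    """Fake advisor that only reports how much context it was given."""
    return f"plan after reading {len(shared_context)} context items"


transcript = run_executor(
    ["read failing test", "choose locking strategy", "apply patch"],
    is_stuck=lambda step: "choose" in step,
    advisor=stub_advisor,
)
```

The point of the shape is that the "stuck" check lives with the executor, mid-run, rather than in a classifier that sits in front of it.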

The numbers

What got me were the results. Sonnet + Opus advisor scores higher on SWE-bench Multilingual than Sonnet alone and costs about 12% less per task. Haiku + advisor more than doubles its BrowseComp score. None of that requires you to write any orchestration code. It is a single tool entry in your Messages API request.
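It can seem odd that adding a more expensive model lowers cost per task. One way that could happen is if the executor burns fewer tokens once it has a plan; that is an assumption for this sketch, not something measured here. The arithmetic looks roughly like this, with every price and token count left for you to fill in from your own logs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


@dataclass
class Price:
    # Dollars per million tokens; plug in current list prices yourself.
    input_per_mtok: float
    output_per_mtok: float


def cost(usage: Usage, price: Price) -> float:
    """Dollar cost of one model's token usage."""
    return (
        usage.input_tokens * price.input_per_mtok
        + usage.output_tokens * price.output_per_mtok
    ) / 1_000_000


def task_cost(
    executor_usage: Usage,
    executor_price: Price,
    advisor_usage: Optional[Usage] = None,
    advisor_price: Optional[Price] = None,
) -> float:
    """Total cost of a task: executor tokens plus optional advisor tokens."""
    total = cost(executor_usage, executor_price)
    if advisor_usage is not None and advisor_price is not None:
        total += cost(advisor_usage, advisor_price)
    return total
```

Comparing `task_cost` for a solo run against a run with advisor usage shows where the break-even sits: the advisor pays for itself only if the executor's savings exceed the advisor's own token bill.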

What this means practically

If you are in the same boat we were, building custom tiering logic, juggling sub-agents, trying to figure out when to call which model, you can probably delete a lot of that code now.

I would recommend running your existing eval suite against three configs (Sonnet solo, Sonnet + advisor, Opus solo) and letting the data decide. Where the cost-performance sweet spot actually lands will likely surprise you.
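A minimal sketch of that comparison, assuming your eval suite can be wrapped as a function that takes a task and returns whether it passed and what it cost. The `Runner` type, `Summary`, and `compare` are hypothetical names for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A runner takes a task and returns (passed, cost_in_dollars).
Runner = Callable[[str], Tuple[bool, float]]


@dataclass
class Summary:
    pass_rate: float
    cost_per_task: float
    cost_per_pass: float  # float("inf") when nothing passed


def compare(tasks: List[str], configs: Dict[str, Runner]) -> Dict[str, Summary]:
    """Run every config over the same tasks and summarize quality and cost."""
    if not tasks:
        raise ValueError("need at least one task")
    results: Dict[str, Summary] = {}
    for name, runner in configs.items():
        outcomes = [runner(task) for task in tasks]
        passed = sum(1 for ok, _ in outcomes if ok)
        total_cost = sum(task_cost for _, task_cost in outcomes)
        results[name] = Summary(
            pass_rate=passed / len(tasks),
            cost_per_task=total_cost / len(tasks),
            cost_per_pass=total_cost / passed if passed else float("inf"),
        )
    return results
```

You would pass one runner per config, for example keys like "sonnet-solo", "sonnet-advisor", and "opus-solo", each wrapping your own eval call. Cost per passing task is usually the number worth sorting by, since a cheap config that fails most tasks is not actually cheap.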
