How to Use Claude Sonnet 5 With Other LLMs: Model Routing, Use Cases, and Workflows

TL;DR

Claude Sonnet 5 is best understood as a practical agentic workhorse, not a miracle model. It looks strongest for medium-effort coding, research, and tool-using workflows, while Reddit feedback is more skeptical about high-effort pricing, creative tone, and whether it should replace Opus 4.8.

Sonnet 5 should be your worker, not your religion

Claude Sonnet 5 is useful, but it should not become the only model in your stack. Treat it as the worker that handles a lot of meaningful execution, then route simpler, harder, cheaper, or more specialized jobs elsewhere with intention.

Please resist the urge to route everything through it.

Claude Sonnet 5 is useful because it gives builders a strong blend of coding ability, tool use, reasoning, speed, and price. That makes it a good default for many agentic workflows.

Default does not mean only.

The best AI systems use multiple models. One model routes. Another executes. Another reviews. Another handles cheap repetitive work. Another takes the high-stakes problems. The user sees one product. Under the hood, it is a tiny committee with a token budget.

That is not overengineering. That is what happens when models have different strengths, prices, latencies, and failure modes.

What Reddit teaches us about Sonnet 5 routing

Reddit’s reaction points to a simple routing lesson: Sonnet 5 is not universally loved, but it has a practical lane. Use it where medium-effort execution matters, and avoid forcing it into creative or high-stakes roles where users already see friction.

Developers and AI builders are asking whether Sonnet 5 makes sense at high effort when Opus 4.8 may perform better. Creative users are upset that Sonnet 5 feels colder, more guarded, and less playful than older Claude models. Some power users are worried about tokenization and real job cost.

Underneath the noise, the routing lesson is clear:

Do not use Sonnet 5 for everything. Use it where it wins.

Its likely best lane is medium-effort execution: coding chores, subagents, codebase scanning, first-pass research, tool use, internal automation, and tasks where Opus would be too expensive to run every time.

That is a good lane. Most software work is medium-effort work pretending to be an emergency.

Pattern 1: cheap model first, Sonnet 5 second

A cheap model should handle the boring gatekeeping before Sonnet 5 spends premium tokens. Classification, tagging, routing, and simple extraction are perfect places to save money before the real reasoning work goes to Sonnet 5, especially in production workflows with real user volume.

Example support workflow:

User sends a support ticket.
Cheap model classifies it: billing, bug, account issue, refund, feature request.
Simple cases go to templates or rules.
Sonnet 5 handles cases that need reasoning or tool use.
Opus 4.8 reviews high-risk replies.
A human approves refunds, legal issues, and enterprise escalations.

This works because Sonnet 5 should not spend premium tokens discovering that "thanks" is not a production incident.
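The gatekeeping step above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the tier labels, the `Route` type, and `keyword_classifier` (a stand-in for the cheap model) are all hypothetical names invented for this sketch, and treating refunds as the high-risk case is an assumption drawn from the example workflow.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical tier labels for this sketch, not real API model identifiers.
TEMPLATE = "template"
SONNET = "sonnet-5"
SONNET_WITH_REVIEW = "sonnet-5 + opus-4.8 review"


@dataclass
class Route:
    handler: str
    needs_human: bool


def keyword_classifier(text: str) -> Tuple[str, bool]:
    """Stand-in for the cheap model: returns (category, is_simple)."""
    lowered = text.lower()
    if "refund" in lowered:
        return "refund", False
    if "error" in lowered or "crash" in lowered:
        return "bug", False
    return "other", True


def route_ticket(text: str, classify: Callable[[str], Tuple[str, bool]]) -> Route:
    """Pick a handler for a ticket based on the cheap classifier's verdict."""
    category, is_simple = classify(text)
    if category == "refund":
        # Assumed high-risk: reviewed reply plus human approval.
        return Route(SONNET_WITH_REVIEW, needs_human=True)
    if is_simple:
        # Simple cases never reach the premium model.
        return Route(TEMPLATE, needs_human=False)
    return Route(SONNET, needs_human=False)


print(route_ticket("thanks", keyword_classifier))
print(route_ticket("App crash on login", keyword_classifier))
```

In a real system the classifier would be a call to whichever cheap model you use. The point of the shape is that the expensive model only appears in the branches that need it.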

Use this pattern for:

Support triage
Lead scoring
Inbox routing
Feedback clustering
Moderation queues
CRM cleanup
Form intake

Cheap models are great at boring gates. Let them be boring.

Pattern 2: Sonnet 5 as coding agent, Opus 4.8 as reviewer

For coding teams, Sonnet 5 can be the implementation layer while Opus 4.8 acts as the reviewer. That split gives you speed on normal tickets and better judgment on architecture, security, and edge cases before a human signs off.

A practical flow:

Sonnet 5 reads the issue.
Sonnet 5 inspects relevant files.
Sonnet 5 writes a short plan.
Sonnet 5 edits code.
Sonnet 5 runs tests.
Opus 4.8 reviews the diff, edge cases, and architecture.
Human approves the merge.

This uses each model where it makes sense. Sonnet 5 does the work that needs repetition and tool use. Opus 4.8 handles deeper review.
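As a rough sketch, the flow above is a pipeline with three gates: tests, model review, and human sign-off. The `implement`, `review`, and `human_approves` callables below are hypothetical placeholders for the Sonnet 5 worker, the Opus 4.8 reviewer, and your merge approval step. Nothing here reflects a specific vendor API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Patch:
    diff: str
    tests_passed: bool


@dataclass
class Review:
    approved: bool
    notes: str


def run_ticket(
    issue: str,
    implement: Callable[[str], Patch],
    review: Callable[[Patch], Review],
    human_approves: Callable[[Patch, Review], bool],
) -> str:
    """Run one ticket through worker, reviewer, and human gates, in that order."""
    patch = implement(issue)  # worker: plan, edit code, run tests
    if not patch.tests_passed:
        return "rejected: tests failed"
    verdict = review(patch)  # reviewer: diff, edge cases, architecture
    if not verdict.approved:
        return "rejected: " + verdict.notes
    if not human_approves(patch, verdict):
        return "rejected: human declined"
    return "merged"


# Usage with stub callables standing in for the models and the human.
result = run_ticket(
    "Fix off-by-one in pagination",
    implement=lambda issue: Patch(diff="- i <= n\n+ i < n", tests_passed=True),
    review=lambda patch: Review(approved=True, notes="looks fine"),
    human_approves=lambda patch, verdict: True,
)
print(result)
```

Ordering the gates from cheapest to most expensive means a failing test suite never costs a reviewer call or a human's attention.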

Use Sonnet 5 for:

Small features
Bug fixes
Test generation
Docs updates
Refactors with clear scope
Dependency cleanup
Reproducing failures

Use Opus 4.8 for:

Security-sensitive code
Architecture tradeoffs
Large refactors
Unclear failures
Reviewing model-generated patches

The mistake is skipping review because the model sounds confident. Confidence is not CI.

Pattern 3: Sonnet 5 first, Fable 5 only when the task earns it

Fable 5 should enter the workflow only when the problem is hard enough to justify the price. Start with Sonnet 5, escalate through Opus when needed, and reserve Fable for long-horizon tasks where failure would cost more than the model bill.

Use Sonnet 5 first when:

The task is normal but meaningful
The scope is clear
You can verify the output
The workflow repeats often
Cost matters

Escalate to Fable 5 when:

The task spans multiple systems
A wrong answer is expensive
Sonnet and Opus keep failing
The task requires long-horizon reasoning
The output will shape a major decision
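One way to encode this escalation order is a ladder that only climbs when a verifier rejects the output. The sketch below assumes you have some `verify` function (tests, a schema check, a rubric) and treats each model as a plain callable. The tier names and the two-attempt limit are illustrative assumptions.

```python
from typing import Callable, List, Optional, Tuple

Model = Callable[[str], str]


def run_with_escalation(
    task: str,
    tiers: List[Tuple[str, Model]],
    verify: Callable[[str], bool],
    attempts_per_tier: int = 2,
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Try each tier in order; move up only after the current tier keeps failing.

    Returns (winning_tier, output, history). The first two are None
    when every tier fails verification.
    """
    history: List[str] = []
    for name, model in tiers:
        for _ in range(attempts_per_tier):
            output = model(task)
            history.append(name)
            if verify(output):
                return name, output, history
    return None, None, history


# Stub models: the first tier fails, the second succeeds.
tiers = [
    ("sonnet-5", lambda task: "incomplete plan"),
    ("opus-4.8", lambda task: "verified plan"),
    ("fable-5", lambda task: "verified plan"),
]
winner, output, history = run_with_escalation(
    "draft migration plan", tiers, verify=lambda out: out.startswith("verified")
)
print(winner, history)
```

The history list doubles as a cost signal: if most tasks end on the top tier, the routing rule is wrong, not the model.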

Example: multi-repo migration.

Sonnet 5 can inventory repos, summarize dependencies, identify obvious risks, and draft the first migration plan. Opus 4.8 can review the architecture. Fable 5 can stress-test the full strategy before the team commits weeks of engineering time.

That is a rational use of an expensive model. Using Fable 5 to rewrite a tooltip is how your API bill starts wearing a little crown.

Pattern 4: Sonnet 5 plus GPT models for polish and variation

If your stack includes GPT models, use them for the jobs where they outperform your Claude setup. Sonnet 5 can provide structure, source checking, and execution, while another model can help with hooks, tone variations, or multimodal product polish before review.

A useful content workflow might look like this:

Sonnet 5 researches the topic and builds the factual outline.
A GPT model generates headline or hook variants.
Sonnet 5 checks the variants against the source material.
Opus 4.8 reviews high-stakes positioning.
A human approves anything public.

For marketing, this matters. One model can be better at structure. Another can be better at creative variation. Neither should be trusted to publish unchecked.

This is especially true for developer content. The internet can smell synthetic enthusiasm from three tabs away.

Pattern 5: Sonnet 5 plus Gemini for long-context or multimodal tasks

Gemini can be useful when a workflow starts with huge documents, screenshots, video, or other multimodal inputs. Sonnet 5 can then turn that material into technical synthesis, implementation plans, or structured outputs that developers can actually use in their stack.

Example product research flow:

Gemini processes a large batch of screenshots, transcripts, or long documents.
Sonnet 5 turns findings into a product brief or implementation plan.
Opus 4.8 reviews the tradeoffs.
Human makes the product decision.

This is a clean split. Multimodal and long-context ingestion can happen in one lane. Technical synthesis and execution can happen in another.

Model routing is not betrayal. It is just engineering.

Pattern 6: Sonnet 5 plus open-source models

Open-source models are a cost and privacy lever, not a downgrade by default. Use them for local preprocessing, tagging, extraction, and low-risk cleanup, then send the reasoning-heavy or tool-using parts to Sonnet 5 when quality matters.

Use local or open-source models for:

Simple extraction
Text cleanup
Classification
Tagging
Data preprocessing
Private internal workflows
Offline tools
Bulk low-risk tasks

Use Sonnet 5 when the workflow needs:

Stronger reasoning
Tool use
Code edits
Multi-step execution
Better synthesis
Higher reliability

Example developer knowledge base:

Local model tags and chunks internal docs.
Retrieval finds relevant sections.
Sonnet 5 answers with citations.
Opus 4.8 reviews security or compliance-sensitive answers.

This keeps the expensive model focused on the part that needs it.
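To make the knowledge-base split concrete, here is a small sketch of the cheap half of the pipeline: chunking, retrieval, and building a prompt that asks for cited answers. The keyword-overlap retrieval is a deliberately naive stand-in for whatever local model or vector search you run, and all function names are hypothetical. The final answering call is left out because it depends on your provider.

```python
from typing import List, Tuple

Chunk = Tuple[str, str]  # (chunk_id, text)


def chunk(doc_id: str, text: str, size: int = 40) -> List[Chunk]:
    """Split a document into fixed-size word windows with stable ids."""
    words = text.split()
    return [
        (f"{doc_id}#{i // size}", " ".join(words[i:i + size]))
        for i in range(0, len(words), size)
    ]


def retrieve(question: str, chunks: List[Chunk], k: int = 2) -> List[Chunk]:
    """Rank chunks by shared words with the question (naive placeholder)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c[1].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, hits: List[Chunk]) -> str:
    """Assemble the prompt the reasoning model would receive."""
    sources = "\n".join(f"[{cid}] {text}" for cid, text in hits)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{sources}\n\nQuestion: {question}"
    )


docs = chunk("security", "Rotate API keys every ninety days using the admin console")
docs += chunk("onboarding", "New hires get a laptop on day one")
hits = retrieve("how do I rotate api keys", docs, k=1)
print(build_prompt("how do I rotate api keys", hits))
```

Only the text returned by `build_prompt` reaches the expensive model, which is the cost control the pattern is after.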

Pattern 7: Sonnet 5 agents with approval gates

Sonnet 5 is built for agentic workflows, but public or destructive actions still need approval gates. Let it inspect, draft, test, and prepare changes, but keep humans in control of sending, publishing, charging, deleting, merging, and changing production infrastructure.

A safe Sonnet 5 agent should have:

Clear scope
Tool permissions
Budget limits
Retry limits
Logs
Test commands
Stop conditions
Human approval for external side effects

Use approval gates for:

Sending emails
Posting publicly
Charging customers
Deleting data
Merging code
Changing production infrastructure
Updating customer records

The model can drive the workflow. It should not own the keys.
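A minimal sketch of such a gate, assuming the agent requests every tool call through a single wrapper. The action names, the `AgentGuard` class, and the step budget are hypothetical and chosen only to mirror the lists above. A production version would also need persistent logs and per-tool permissions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Actions with external side effects, mirroring the approval list above.
GATED_ACTIONS = {
    "send_email", "post_public", "charge_customer", "delete_data",
    "merge_code", "change_prod_infra", "update_customer_record",
}


class BudgetExceeded(Exception):
    """Raised when the agent hits its step budget (a stop condition)."""


@dataclass
class AgentGuard:
    approve: Callable[[str, Dict[str, Any]], bool]  # the human decision
    max_steps: int = 20
    steps: int = 0
    log: List[str] = field(default_factory=list)

    def execute(self, action: str, args: Dict[str, Any],
                run: Callable[[Dict[str, Any]], Any]) -> Any:
        """Run an action only if it is within budget and, when gated, approved."""
        self.steps += 1
        if self.steps > self.max_steps:
            self.log.append(f"stopped:{action}")
            raise BudgetExceeded(f"step budget of {self.max_steps} exhausted")
        if action in GATED_ACTIONS and not self.approve(action, args):
            self.log.append(f"blocked:{action}")
            return None
        self.log.append(f"ran:{action}")
        return run(args)


# Usage: a human who declines everything still lets safe actions through.
guard = AgentGuard(approve=lambda action, args: False)
guard.execute("run_tests", {}, lambda args: "42 passed")
guard.execute("delete_data", {"table": "users"}, lambda args: "deleted")
print(guard.log)
```

Keeping the gate in the wrapper rather than in the prompt means the model cannot talk its way past it.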

A practical routing table

A practical routing table keeps the model stack honest. It protects the budget, reduces unnecessary escalations, and makes it clear where human approval belongs. Start with a simple table, then replace assumptions with real eval data from production-like tasks.

Workflow	First model	Escalation	Human gate
Classify tickets	Cheap fast model	Sonnet 5	Only for high-risk cases
Draft support reply	Sonnet 5	Opus 4.8	Yes for refunds, legal, enterprise
Implement coding ticket	Sonnet 5	Opus 4.8	Yes before merge
Review architecture	Opus 4.8	Fable 5	Yes
Multi-repo migration	Opus 4.8	Fable 5	Yes
Bulk summaries	Cheap fast model	Sonnet 5	Usually no
Research brief	Sonnet 5	Opus 4.8	Yes if strategic
Creative writing	Test multiple models	Human edit	Yes if public
Public marketing copy	Sonnet 5 plus creative model	Opus 4.8	Always
Private preprocessing	Local model	Sonnet 5	Depends on output

Start here, then replace the guesses with evals.
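If you want the table to live in code rather than a wiki page, a plain dictionary is enough to start. The sketch below transcribes the rows above as-is. The `Rule` type and `pick_model` helper are hypothetical names, and the strings are labels for your own dispatcher rather than real model identifiers.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    first: str
    escalation: str
    human_gate: str


# Rows copied from the routing table above.
ROUTES = {
    "classify tickets": Rule("cheap fast model", "Sonnet 5", "only for high-risk cases"),
    "draft support reply": Rule("Sonnet 5", "Opus 4.8", "yes for refunds, legal, enterprise"),
    "implement coding ticket": Rule("Sonnet 5", "Opus 4.8", "yes before merge"),
    "review architecture": Rule("Opus 4.8", "Fable 5", "yes"),
    "multi-repo migration": Rule("Opus 4.8", "Fable 5", "yes"),
    "bulk summaries": Rule("cheap fast model", "Sonnet 5", "usually no"),
    "research brief": Rule("Sonnet 5", "Opus 4.8", "yes if strategic"),
    "creative writing": Rule("test multiple models", "human edit", "yes if public"),
    "public marketing copy": Rule("Sonnet 5 plus creative model", "Opus 4.8", "always"),
    "private preprocessing": Rule("local model", "Sonnet 5", "depends on output"),
}


def pick_model(workflow: str, escalated: bool = False) -> str:
    """Return the first-choice or escalation model for a known workflow.

    Unknown workflows raise KeyError on purpose: an unrouted task should
    be a visible failure, not a silent default to the priciest model.
    """
    rule = ROUTES[workflow.lower()]
    return rule.escalation if escalated else rule.first


print(pick_model("Implement coding ticket"))
print(pick_model("Review architecture", escalated=True))
```

Once eval data arrives, changing a route is a one-line diff that can be reviewed like any other config change.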

How to test if Sonnet 5 belongs in your stack

The only useful Sonnet 5 evaluation is one built from your own tasks. Run it on real tickets, real docs, real automations, and real creative drafts. Then compare completion rate, correction time, latency, retry rate, and cost per useful result.

Run it against real work:

Five coding tickets
Five support tickets
Five research tasks
Five internal automations
Five creative drafts, if that matters to your product

Measure:

Completion rate
Human correction time
Token cost per completed task
Latency
Retry rate
Escalation rate
User satisfaction
Safety false positives
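Several of these metrics reduce to simple arithmetic over a log of runs. The sketch below assumes you record one `Run` per task with the fields shown, which are hypothetical names. It also makes one deliberate choice: cost per completed task divides total spend, failures included, by the number of completed tasks, so failed attempts make the model look as expensive as it really is.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Run:
    completed: bool
    cost_usd: float
    latency_s: float
    retries: int
    escalated: bool
    correction_min: float  # human time spent fixing the output


def summarize(runs: List[Run]) -> Dict[str, float]:
    """Aggregate per-task logs into the routing metrics listed above."""
    if not runs:
        raise ValueError("need at least one run")
    n = len(runs)
    completed = sum(1 for r in runs if r.completed)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "completion_rate": completed / n,
        # Failed runs still cost money, so they stay in the numerator.
        "cost_per_completed_task": total_cost / completed if completed else float("inf"),
        "mean_latency_s": mean(r.latency_s for r in runs),
        "retry_rate": sum(1 for r in runs if r.retries > 0) / n,
        "escalation_rate": sum(1 for r in runs if r.escalated) / n,
        "mean_correction_min": mean(r.correction_min for r in runs),
    }


runs = [
    Run(True, 0.25, 10.0, 0, False, 2.0),
    Run(True, 0.25, 20.0, 1, False, 4.0),
    Run(False, 0.50, 30.0, 2, True, 0.0),
    Run(False, 1.00, 40.0, 0, True, 0.0),
]
report = summarize(runs)
print(report)
```

Running the same summary per model on the same task set gives you a like-for-like comparison to put next to the routing table.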

Reddit is useful for pattern recognition. Your own evals decide the routing table.

Final take

Sonnet 5 is not the emotional Claude comeback some users wanted, but it can still be operationally useful. Treat it as a workhorse in a routed model stack, then escalate to Opus 4.8 or Fable 5 when the task earns the extra cost.

It should handle the medium-effort work that fills real products: coding chores, subagents, research steps, internal automation, structured drafts, and tool-using workflows.

It should not replace every other model. Use cheaper models before it, Opus 4.8 above it, Fable 5 only when the task earns it, and local or specialized models where they make more sense.

The best LLM stack is not loyal. It is useful.

Ship the workflow. Route the models. Keep the human review where it belongs.

Frequently Asked Questions

Should Claude Sonnet 5 replace every other LLM?

No. Sonnet 5 should be one layer in a model stack. Use cheaper models for routing and extraction, Sonnet 5 for medium-effort execution, Opus 4.8 for review, and Fable 5 for the hardest high-value tasks.

What should I use Sonnet 5 for?

Use Sonnet 5 for coding tickets, codebase scanning, research synthesis, internal automation, support triage, structured drafts, and tool-using workflows. It is strongest when a task needs real reasoning but happens often enough that cost still matters.

When should I use Opus 4.8 instead of Sonnet 5?

Use Opus 4.8 when the work is complex, ambiguous, security-sensitive, architecture-heavy, or expensive to get wrong. It is a better escalation model when Sonnet 5 fails, loops, or needs stronger judgment than a default workhorse can provide.

When should I use Fable 5?

Use Fable 5 for the hardest long-horizon tasks where the added capability is worth the higher price. Good examples include multi-repo migrations, strategic technical planning, difficult debugging, and high-value decisions that cheaper models cannot handle reliably.

Can I combine Sonnet 5 with open-source models?

Yes. Open-source models are useful for private preprocessing, tagging, extraction, classification, and low-cost cleanup. Sonnet 5 can then handle the reasoning-heavy parts, such as synthesis, code changes, tool use, and final structured outputs.

About the Author

Emcy is the founder of Code Culture, a developer-native apparel brand trusted by 37K+ engineers. He writes about developer tools, AI workflows, and the strange little rituals of people who build software for a living.