Adaptive thinking is powerful but not infallible. Knowing the failure modes lets you detect and recover from them before they cause real damage.
Failure Mode 1: Overthinking Simple Problems
The model sometimes applies maximum effort to trivial tasks, wasting tokens and increasing latency.
Symptoms:
- 20K+ thinking tokens for a simple formatting task
- Response time >30s for a basic question
- Thinking output shows circular reasoning
Detection:
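def detect_overthinking(response, expected_effort: str) -> bool:
    """Flag responses that used more than twice the expected thinking tokens."""
    # 'thinking_tokens' may be absent from the usage object; a missing value counts as 0.
    thinking_tokens = getattr(response.usage, 'thinking_tokens', 0)
    thresholds = {"quick": 3000, "standard": 15000, "deep": 60000}
    threshold = thresholds.get(expected_effort, 15000)
    return thinking_tokens > threshold * 2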
Recovery: Retry with thinking disabled:
if detect_overthinking(response, "quick"):
    response = client.messages.create(
        model="claude-opus-4-6-20260205",
        max_tokens=1024,
        thinking={"type": "disabled"},  # Disable thinking entirely
        messages=messages
    )
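The detect-and-retry pattern above can be wrapped in a single helper so callers do not repeat it. What follows is a minimal sketch under stated assumptions, not part of any SDK: `call_model` is a hypothetical caller-supplied function that takes a `thinking_enabled` flag and returns a response object, and the `thinking_tokens` usage field is assumed, as in the detection function above.

```python
from types import SimpleNamespace
from typing import Callable, Tuple

# Thresholds mirror the detection function above; tune them for your workload.
THRESHOLDS = {"quick": 3000, "standard": 15000, "deep": 60000}


def is_overthinking(response, expected_effort: str) -> bool:
    """Return True when thinking tokens exceed twice the expected threshold."""
    usage = getattr(response, "usage", None)
    # `thinking_tokens` is an assumed field name; fall back to 0 if it is missing.
    thinking_tokens = getattr(usage, "thinking_tokens", 0) or 0
    return thinking_tokens > THRESHOLDS.get(expected_effort, 15000) * 2


def call_with_overthinking_guard(
    call_model: Callable[[bool], object],
    expected_effort: str = "quick",
) -> Tuple[object, bool]:
    """Call once with thinking on; retry once with thinking off if it overthinks.

    Returns (response, retried).
    """
    response = call_model(True)
    if is_overthinking(response, expected_effort):
        return call_model(False), True
    return response, False


# Demo with a stand-in for a real API call (no network access needed).
def fake_call(thinking_enabled: bool):
    tokens = 20_000 if thinking_enabled else 0
    return SimpleNamespace(usage=SimpleNamespace(thinking_tokens=tokens))


response, retried = call_with_overthinking_guard(fake_call, "quick")
print(retried)  # True, because 20,000 > 3,000 * 2
```

The helper retries at most once, so a task that genuinely needs thinking cannot trigger a retry loop. Logging the `retried` flag per task type is one way to find the tasks that should skip thinking from the start.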
Failure Mode 2: Confident Wrong Conclusions
The model reasons extensively but reaches an incorrect conclusion with high stated confidence. This is the most dangerous failure mode.
Symptoms:
- Detailed, plausible reasoning that leads to a wrong answer
- High stated confidence (“I’m 95% certain…”)
- Reasoning contains a subtle logical error or incorrect assumption
Detection:
- Cross-validate critical conclusions with a second, independent request
- Ask the model to argue the opposite position
- Check factual claims against known sources
def validate_critical_conclusion(client, prompt: str, response_text: str) -> dict:
    """Cross-validate a conclusion with an adversarial check."""
    validation = client.messages.create(
        model="claude-opus-4-6-20260205",
        max_tokens=4096,
        thinking={"type": "adaptive", "effort": "deep"},
        system="You are a devil's advocate. Your job is to find flaws in "
               "reasoning, false assumptions, and logical errors.",
        messages=[{
            "role": "user",
            "content": f"Original question: {prompt}\n\n"
                       f"Proposed answer: {response_text}\n\n"
                       "Find every flaw in this reasoning."
        }]
    )
    # With thinking enabled, the first content block may be a thinking block,
    # so collect only the text blocks instead of reading content[0].
    critique = "".join(b.text for b in validation.content if b.type == "text")
    return {"validation": critique}
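The first detection item, cross-validating with independent requests, needs a way to decide whether the separate answers agree. The sketch below is one possible approach and assumes each request was asked to end with a short final answer that you have already extracted as a string. The function names and the 0.6 agreement threshold are illustrative choices, not values from this module.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace for comparison."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def majority_answer(
    answers: List[str], min_agreement: float = 0.6
) -> Tuple[Optional[str], float]:
    """Return (answer, agreement) for the most common normalized answer.

    The answer is None when agreement falls below `min_agreement`,
    which signals that the conclusion needs human review.
    """
    if not answers:
        return None, 0.0
    counts = Counter(normalize_answer(a) for a in answers)
    top, n = counts.most_common(1)[0]
    agreement = n / len(answers)
    return (top if agreement >= min_agreement else None), agreement


# Three of four independent runs agree after normalization.
print(majority_answer(["42", "42.", " 42 ", "41"]))  # ('42', 0.75)
```

Exact-match comparison only works for short, canonical answers such as a number or a product name. Free-form conclusions would need a different comparison, for example a further model call that judges equivalence.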
Failure Mode 3: Thinking Loops
The model oscillates between two options in the thinking phase, never converging on an answer.
Symptoms:
- Thinking output repeats similar phrases
- Budget exhausted without a clear conclusion
- Final answer hedges excessively (“On one hand… but on the other hand…”)
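The first symptom, repeated phrasing in the thinking output, can be approximated with a simple n-gram repetition score. This is a heuristic sketch, not an established detector: it assumes you have the thinking text as a plain string, and the 4-word window and 0.3 threshold are illustrative values to tune against your own traces.

```python
from collections import Counter


def repetition_ratio(thinking_text: str, n: int = 4) -> float:
    """Fraction of word n-grams that repeat an n-gram seen earlier in the text."""
    words = thinking_text.lower().split()
    if len(words) < n:
        return 0.0
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    # Every occurrence after the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def looks_like_loop(thinking_text: str, threshold: float = 0.3) -> bool:
    """Flag thinking output whose repetition ratio exceeds the threshold."""
    return repetition_ratio(thinking_text) > threshold


looping = "option a is better but option b is cheaper " * 5
print(looks_like_loop(looping))  # True
print(looks_like_loop("the quick brown fox jumps over the lazy dog"))  # False
```

A high score only indicates repeated wording, so treat it as a prompt to inspect the trace rather than as proof of a loop.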
Recovery: Reduce the option space:
# ❌ Open-ended (prone to loops)
"What is the best database for this project?"
# ✅ Constrained (forces a decision)
"Choose between PostgreSQL and MongoDB for this project. "
"State your choice in the first sentence, then justify it."
Failure Mode 4: Sycophantic Reasoning
The model sides with the user’s stated or implied preference rather than providing honest analysis.
Symptoms:
- Answer confirms the user’s stated preference
- Thinking output shows the model identifying and aligning with user bias
- Counterarguments are dismissed too quickly
Prevention:
system_prompt = """When analyzing options, give equal weight to all
alternatives regardless of any stated preference by the user.
If the user's preferred option has significant drawbacks, say so
clearly. Intellectual honesty is more valuable than agreement."""
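A system prompt reduces sycophancy but does not tell you whether it is happening. One way to check is a preference-flip probe: ask the same question once per option, each time stating that option as the user's preference, and see whether the recommendation follows. The sketch below is illustrative only. Both function names are hypothetical, and extracting the recommended option from each model response is left to the caller.

```python
from typing import Dict, List


def build_preference_probes(question: str, options: List[str]) -> Dict[str, str]:
    """Return a neutral prompt plus one prompt per option stated as the user's preference."""
    probes = {"neutral": question}
    for option in options:
        probes[option] = f"I'm leaning toward {option}. {question}"
    return probes


def tracks_stated_preference(recommendations: Dict[str, str]) -> bool:
    """Return True when every recommendation matches the preference stated in its probe.

    `recommendations` maps each probe key to the option the model recommended.
    With two or more options, a model that always echoes the stated preference
    is changing its answer based on the user rather than on the question.
    """
    option_keys = [k for k in recommendations if k != "neutral"]
    if len(option_keys) < 2:
        return False
    return all(recommendations[k] == k for k in option_keys)


probes = build_preference_probes(
    "Which database fits this project?", ["PostgreSQL", "MongoDB"]
)
print(probes["MongoDB"])
# I'm leaning toward MongoDB. Which database fits this project?

# The recommendation flipped with the stated preference: a sycophancy signal.
print(tracks_stated_preference(
    {"neutral": "PostgreSQL", "PostgreSQL": "PostgreSQL", "MongoDB": "MongoDB"}
))  # True
```

A model that gives the same recommendation across all probes passes this check, even when that recommendation contradicts the stated preference.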
Failure Mode Summary
| Mode | Frequency | Severity | Detection | Recovery |
|---|---|---|---|---|
| Overthinking | Common | Low (cost only) | Token count check | Disable/cap thinking |
| Confident wrong | Rare | Critical | Cross-validation | Second opinion + adversarial |
| Thinking loops | Uncommon | Medium | Pattern detection | Constrain options |
| Sycophancy | Uncommon | High | Bias check | Anti-sycophancy prompt |
When to Disable Thinking Entirely
Some tasks genuinely do not benefit from thinking:
- JSON/XML formatting
- Simple text extraction
- Keyword classification
- Template filling
- High-volume batch processing where speed matters more than depth
NO_THINKING_TASKS = {"format", "extract", "classify", "template", "translate"}
# task_type is assumed to come from your own request router.
thinking_config = (
    {"type": "disabled"} if task_type in NO_THINKING_TASKS
    else {"type": "adaptive"}
)
You now have a complete understanding of adaptive thinking. In the next module, we tackle the most exciting new capability: agent teams.