Token Optimization
Maximize Claude’s effectiveness while minimizing token usage. Learn strategies to get more value from every token.
Understanding Token Economics
Token Basics
- ~4 characters = 1 token (English text)
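- Code: varies; punctuation-heavy syntax and unusual identifiers tend to use more tokens per character than plain-English comments
- Special characters: common punctuation is usually 1 token each; emoji and rare symbols can take several
- Whitespace: counts too (indentation and blank lines add up)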
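The 4-characters-per-token rule of thumb can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: `estimateTokens` is a hypothetical helper, and the divisor of 4 is only the rough English-text average from the list above.

```javascript
// Rough heuristic only: assumes ~4 characters per token (English text).
// A real tokenizer will give different counts, especially for code.
function estimateTokens(text) {
  if (text.length === 0) return 0;
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens("Hello, World!")); // 13 characters -> 4
console.log(estimateTokens("// Comment"));    // 10 characters -> 3
console.log(estimateTokens("const x = 5;"));  // 12 characters -> 3
```

The estimate for `const x = 5;` comes out lower than a tokenizer is likely to report, which illustrates why code is listed as variable: each symbol tends to become its own token.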
Example (approximate counts; exact values depend on the tokenizer):
"Hello, World!" = ~4 tokens
"const x = 5;" = ~5 tokens
"// Comment" = ~3 tokens
Token-Saving Strategies
1. Skip Unnecessary Pleasantries
❌ Token-Wasteful (15 tokens):
Hello! I hope you're having a great day!
Could you please help me with something?
✅ Token-Efficient (5 tokens):
Help me debug this code:
Savings: 10 tokens (67%)
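The savings figures in this guide are simple arithmetic: tokens saved divided by the original count. A minimal sketch, where `tokenSavings` is a hypothetical helper name and the inputs are whatever before/after counts you have measured or estimated:

```javascript
// Returns absolute and percentage savings between two token counts.
function tokenSavings(before, after) {
  const saved = before - after;
  // Guard against division by zero for an empty original prompt.
  const percent = before === 0 ? 0 : Math.round((saved / before) * 100);
  return { saved, percent };
}

console.log(tokenSavings(15, 5)); // { saved: 10, percent: 67 }
```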
2. Use Concise Context
❌ Verbose (50 tokens):
I am currently working on a project where I need to implement
a feature that allows users to upload files to the server.
The project uses Node.js and Express framework. I have been
trying to figure out the best way to do this...
✅ Concise (15 tokens):
Implementing file upload in Express.
Need guidance on best approach.
Savings: 35 tokens (70%)
3. Minimize Repeated Context
❌ Repeated Context:
Q1: In my React app using TypeScript with Next.js 14...
Q2: In my React app using TypeScript with Next.js 14...
Q3: In my React app using TypeScript with Next.js 14...
✅ Context Once:
Context: React app, TypeScript, Next.js 14
Q1: How to...
Q2: What about...
Q3: And for...
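If you assemble prompts in code, the "context once" pattern above can be sketched as follows. `buildPrompt` is a hypothetical helper, and the `Context:` / `Q1:` labels simply mirror the example; nothing requires that exact format.

```javascript
// States the shared context a single time, then numbers the questions.
function buildPrompt(context, questions) {
  const lines = [`Context: ${context.join(", ")}`];
  questions.forEach((question, i) => {
    lines.push(`Q${i + 1}: ${question}`);
  });
  return lines.join("\n");
}

console.log(
  buildPrompt(
    ["React app", "TypeScript", "Next.js 14"],
    ["How to...", "What about...", "And for..."]
  )
);
```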
4. Use Abbreviations Where Clear
Common safe abbreviations:
- API (not Application Programming Interface)
- JWT (not JSON Web Token)
- DB (not Database)
- Auth (not Authentication)
- Config (not Configuration)
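Use sparingly and only when unambiguous. The savings are largest for multi-word expansions (API, JWT); a common single word such as "Database" may already be a single token, so abbreviating it gains little.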
Code Optimization
Send Only Relevant Code
❌ Full File (200 tokens):
import express from 'express';
import bodyParser from 'body-parser';
// ... 50 more lines of irrelevant code
const problematicFunction = () => {
// The actual issue is here
};
// ... 100 more lines
✅ Relevant Snippet (20 tokens):
// Issue in this function:
const problematicFunction = () => {
// The actual issue
};
Savings: 180 tokens (90%)
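When the file is already loaded as a string, trimming it to the relevant lines can be automated. This is a sketch under assumptions: `extractSnippet` is a hypothetical helper, and you are expected to know the line range of the problematic function yourself.

```javascript
// Returns lines startLine..endLine (1-indexed, inclusive) of `source`,
// prefixed with a one-line comment that tells the reader where to look.
function extractSnippet(source, startLine, endLine, note) {
  const lines = source.split("\n").slice(startLine - 1, endLine);
  return [`// ${note}`, ...lines].join("\n");
}

const file = [
  "import express from 'express';",
  "const problematicFunction = () => {",
  "  // The actual issue",
  "};",
  "export default problematicFunction;",
].join("\n");

console.log(extractSnippet(file, 2, 4, "Issue in this function:"));
```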
Use Minimal Reproducible Examples
Instead of the full application:
// Minimal example showing the bug
const data = [1, 2, 3];
const result = data.map(x => x * 2); // Issue here
Prompt Optimization
Structure for Efficiency
❌ Unstructured (100 tokens):
So I have this problem where my database queries are really slow
and I'm not sure why. I'm using PostgreSQL and I have a users table
with about 10,000 rows and when I try to search for users by email
it takes like 5 seconds which is way too long and I need to make
it faster but I don't know how...
✅ Structured (40 tokens):
DB: PostgreSQL, 10k rows
Issue: Email search taking 5s
Query: SELECT * FROM users WHERE email = $1
Goal: <1s response
Savings: 60 tokens (60%)
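The structured layout above is just labeled fields, one per line, so it is easy to generate. A minimal sketch, assuming a hypothetical `structuredPrompt` helper and field labels of your own choosing:

```javascript
// Turns an object of labeled facts into "Label: value" lines,
// skipping any field that was left empty.
function structuredPrompt(fields) {
  return Object.entries(fields)
    .filter(([, value]) => value !== undefined && value !== "")
    .map(([label, value]) => `${label}: ${value}`)
    .join("\n");
}

console.log(
  structuredPrompt({
    DB: "PostgreSQL, 10k rows",
    Issue: "Email search taking 5s",
    Query: "SELECT * FROM users WHERE email = $1",
    Goal: "<1s response",
  })
);
```

Dropping empty fields keeps the prompt from carrying labels that add tokens but no information.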
Use Bullet Points
❌ Prose (50 tokens):
I need to implement authentication and it should support
email/password login and also OAuth with Google and
Facebook and it needs JWT tokens and refresh tokens.
✅ Bullets (25 tokens):
Need auth with:
- Email/password login
- OAuth (Google, Facebook)
- JWT + refresh tokens
Savings: 25 tokens (50%)
Response Optimization
Request Concise Responses
Explain [concept] in under 100 words.
Ask for Code Only When Needed
Just explain the approach, no code needed.
Or:
Show only the changed lines, not the entire file.
Progressive Disclosure
Start simple, add complexity only if needed:
1. "Briefly explain the concept" (get overview)
2. "Show a simple example" (if needed)
3. "Now the complex cases" (if needed)
Advanced Optimization
Reference Previous Messages
❌ Repeat Everything:
Earlier I asked about the authentication system you designed
with JWT tokens and refresh tokens and email/password login...
✅ Reference (works only within the same conversation):
Using the auth system from earlier, add 2FA.
Batch Related Questions
❌ Multiple Prompts:
Prompt 1: How do I handle errors in Express?
Prompt 2: How do I log errors in Express?
Prompt 3: How do I send error responses in Express?
✅ Single Batched Prompt:
Express error handling:
1. How to catch errors?
2. How to log them?
3. How to send responses?
Use Code Diffs
For modifications:
❌ Send Entire File:
// 200 lines of code with one small change
✅ Send Diff:
// Line 45, change:
- const timeout = 3000;
+ const timeout = 5000;
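A diff in the style shown above can be produced with a naive line-by-line comparison. This sketch assumes lines were edited in place rather than inserted or removed; `changedLines` is a hypothetical helper, and a proper diff tool is the better choice for anything larger.

```javascript
// Naive comparison: reports each line number whose text differs.
// Inserted or deleted lines will shift everything after them, so this
// is only suitable for small in-place edits.
function changedLines(before, after) {
  const a = before.split("\n");
  const b = after.split("\n");
  const out = [];
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] === b[i]) continue;
    out.push(`// Line ${i + 1}, change:`);
    if (a[i] !== undefined) out.push(`- ${a[i]}`);
    if (b[i] !== undefined) out.push(`+ ${b[i]}`);
  }
  return out.join("\n");
}

console.log(
  changedLines(
    "const retries = 3;\nconst timeout = 3000;",
    "const retries = 3;\nconst timeout = 5000;"
  )
);
```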
Practical Examples
Example 1: Debugging
Inefficient (150 tokens):
Hi! I'm working on a React application and I'm having some trouble
with a component. The component is supposed to fetch user data from
an API and display it, but it's not working correctly. When the page
loads, I see an error in the console that says "Cannot read property
'map' of undefined". I'm not sure what's causing this error. Here's
my complete component code: [100 lines of code]...
Efficient (40 tokens):
React component error: "Cannot read property 'map' of undefined"
const [users, setUsers] = useState();
return <ul>{users.map(...)}</ul>;
Why?
Savings: 110 tokens (73%)
Example 2: Implementation Request
Inefficient (80 tokens):
I would like to create a function that can take a user's email
address and validate whether it's in the correct format or not.
The function should return true if the email is valid and false
if it's not valid. It should handle common cases...
Efficient (20 tokens):
Create email validation function:
- Input: string
- Output: boolean
- Handle common formats
Savings: 60 tokens (75%)
Token Checklist
Before sending a prompt, optimize:
- Remove pleasantries
- Use concise language
- Include only relevant code
- Use bullet points
- Leverage previous context
- Batch related questions
- Request appropriate detail level
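Two of the checklist items lend themselves to a quick automated check. The sketch below is illustrative only: `checkPrompt` is a hypothetical helper, the greeting pattern is a small hand-picked list, and the length check reuses the rough 4-characters-per-token estimate rather than a real tokenizer.

```javascript
// Flags obvious pleasantries and prompts that exceed a token budget.
function checkPrompt(prompt, maxTokens = 100) {
  const warnings = [];
  // Small, hand-picked list of greeting phrases; extend as needed.
  if (/\b(hello|hi|hey|thanks|thank you)\b|hope you/i.test(prompt)) {
    warnings.push("Remove pleasantries");
  }
  // Rough estimate: ~4 characters per token.
  if (Math.ceil(prompt.length / 4) > maxTokens) {
    warnings.push("Use concise language or trim code");
  }
  return warnings;
}

console.log(checkPrompt("Hello! I hope you're having a great day!"));
console.log(checkPrompt("Help me debug this code:"));
```

The remaining items (relevance of code, appropriate detail level) depend on judgment and are better reviewed by hand.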
Balancing Act
Don’t Over-Optimize:
❌ Too terse:
API slow. Fix.
✅ Optimized but clear:
API response time: 5s
Goal: <500ms
Bottleneck: DB queries
Help optimize
When to Use More Tokens
Use more tokens when:
- Complex context is essential
- Ambiguity would cause errors
- Examples clarify requirements
- A terse prompt would likely need several follow-up attempts, costing more tokens overall
Token ROI (Return on Investment)
High ROI patterns:
- Clear, specific questions (low tokens, high value)
- Relevant code only (focused debugging)
- Batched related questions (efficiency)
Low ROI patterns:
- Vague questions requiring clarification
- Unnecessary full code listings
- Repeated context in every turn
- Social pleasantries
Key Takeaways
- Skip pleasantries - Be direct
- Minimize context - Include only what’s needed
- Use bullets - More efficient than prose
- Send relevant code only - Not entire files
- Batch questions - Group related asks
- Reference previous - Don’t repeat context
- Request appropriate depth - Not always max detail