The most powerful application of Opus 4.6 is not a developer typing prompts in a terminal. It is an automated pipeline that reviews every pull request, generates missing tests, updates documentation, and scans for vulnerabilities — without any human initiating the process.
Pipeline Architecture
graph TD
A[Developer Opens PR] --> B[GitHub Actions Triggered]
B --> C[Checkout Code + Compute Diff]
C --> D[Parallel Agent Jobs]
D --> E[Code Review Agent]
D --> F[Test Generation Agent]
D --> G[Security Audit Agent]
D --> H[Documentation Agent]
E --> I[PR Comment: Review Findings]
F --> J[Commit: Generated Tests]
G --> K[PR Comment: Security Report]
H --> L[Commit: Updated Docs]
I --> M{All Checks Pass?}
J --> M
K --> M
L --> M
M -->|Yes| N[Ready for Human Review]
M -->|No| O[Block Merge + Notify]
Each agent runs as a separate GitHub Actions job, in parallel. The pipeline completes in 2–5 minutes for a typical PR.
GitHub Actions Workflow
# .github/workflows/opus-agent-pipeline.yml
name: Opus 4.6 Agent Pipeline
on:
  pull_request:
    branches: [main, develop]
    types: [opened, synchronize]
permissions:
  contents: write
  pull-requests: write
  issues: write
env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  MODEL: claude-opus-4-6-20260205
jobs:
  code-review:
    name: AI Code Review
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get PR diff
        id: diff
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > pr_diff.patch
          echo "diff_size=$(wc -c < pr_diff.patch)" >> $GITHUB_OUTPUT
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: pip install anthropic
      - name: Run AI code review
        run: python .github/scripts/code-review.py
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  test-generation:
    name: AI Test Generation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.head_ref }}
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install dependencies
        run: npm ci
      - name: Get changed source files
        id: changed
        run: |
          FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD \
            | grep -E '\.(ts|tsx)$' \
            | grep -v '\.test\.' \
            | grep -v '\.spec\.' \
            | head -20)
          # A multi-line value must use the delimiter form of GITHUB_OUTPUT
          {
            echo "files<<EOF"
            echo "$FILES"
            echo "EOF"
          } >> "$GITHUB_OUTPUT"
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install Anthropic SDK
        run: pip install anthropic
      - name: Generate missing tests
        run: python .github/scripts/generate-tests.py
        env:
          CHANGED_FILES: ${{ steps.changed.outputs.files }}
      - name: Run generated tests
        run: npm run test -- --passWithNoTests
      - name: Commit generated tests
        run: |
          git config user.name "opus-agent[bot]"
          git config user.email "opus-agent@users.noreply.github.com"
          git add '*.test.ts' '*.spec.ts'
          git diff --cached --quiet || git commit -m "test: add AI-generated tests for changed files"
          git push
  security-audit:
    name: AI Security Audit
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: pip install anthropic
      - name: Run security audit
        run: python .github/scripts/security-audit.py
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  docs-update:
    name: AI Documentation Update
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.head_ref }}
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: pip install anthropic
      - name: Get changed files
        id: changed
        run: |
          # env values are not shell-evaluated, so compute the list in a step
          {
            echo "files<<EOF"
            git diff --name-only origin/${{ github.base_ref }}...HEAD
            echo "EOF"
          } >> "$GITHUB_OUTPUT"
      - name: Update documentation
        run: python .github/scripts/update-docs.py
        env:
          CHANGED_FILES: ${{ steps.changed.outputs.files }}
      - name: Commit documentation updates
        run: |
          git config user.name "opus-agent[bot]"
          git config user.email "opus-agent@users.noreply.github.com"
          git add 'docs/**' 'README.md' '*.md'
          git diff --cached --quiet || git commit -m "docs: auto-update documentation for changed code"
          git push
Agent Scripts
Code Review Agent
# .github/scripts/code-review.py
"""AI code review agent: posts findings as PR comments."""
import json
import os
import subprocess

from anthropic import Anthropic


def get_pr_diff() -> str:
    """Return the diff between the PR base branch and HEAD."""
    result = subprocess.run(
        ["git", "diff", f"origin/{os.environ.get('GITHUB_BASE_REF', 'main')}...HEAD"],
        capture_output=True, text=True
    )
    return result.stdout


def get_file_content(filepath: str) -> str:
    try:
        with open(filepath) as f:
            return f.read()
    except FileNotFoundError:
        return ""


def post_pr_comment(body: str):
    """Post a comment on the PR via GitHub API."""
    pr_number = os.environ["PR_NUMBER"]
    repo = os.environ["GITHUB_REPOSITORY"]
    token = os.environ["GITHUB_TOKEN"]
    subprocess.run([
        "curl", "-s", "-X", "POST",
        f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
        "-H", f"Authorization: token {token}",
        "-H", "Content-Type: application/json",
        "-d", json.dumps({"body": body})
    ])


def main():
    client = Anthropic()
    diff = get_pr_diff()
    if len(diff.strip()) == 0:
        print("No changes to review.")
        return
    response = client.messages.create(
        model="claude-opus-4-6-20260205",
        max_tokens=4096,
        # Adaptive thinking lets the model decide how much to reason;
        # it is configured without a budget_tokens value.
        thinking={"type": "adaptive"},
        system="""You are a principal engineer performing a code review.
Rules:
- Only flag issues that matter: bugs, security problems, performance issues
- Do NOT comment on formatting, naming conventions, or style
- For each issue, provide the file, line, severity, and a concrete fix
- If the code is good, say so briefly — do not invent issues
- Use markdown formatting for the PR comment""",
        messages=[{
            "role": "user",
            "content": f"Review this pull request diff:\n\n```diff\n{diff}\n```"
        }]
    )
    # The final content block carries the review text
    review = response.content[-1].text
    # Blank lines keep the rule below from turning the review's last line into a heading
    comment = f"""## 🤖 AI Code Review (Opus 4.6)

{review}

---
<sub>Automated review by Claude Opus 4.6 ·
[Pipeline docs](../docs/ai-pipeline.md)</sub>"""
    post_pr_comment(comment)
    print("Review posted to PR.")


if __name__ == "__main__":
    main()
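Because the review call enables thinking, the response's content list can hold thinking blocks alongside text blocks, and indexing the last element assumes the text block always comes last. A slightly more defensive sketch, assuming each block exposes a `type` attribute and text blocks expose `.text`, joins every text block instead. The helper name `extract_text` is hypothetical, and the sample objects are stand-ins rather than real API output.

```python
from types import SimpleNamespace
from typing import Iterable


def extract_text(blocks: Iterable) -> str:
    """Join the text of every block whose type is "text", skipping other block types."""
    return "\n".join(
        block.text for block in blocks
        if getattr(block, "type", None) == "text"
    ).strip()


# Stand-in objects that mimic the shape of response.content (not real API output)
sample = [
    SimpleNamespace(type="thinking", thinking="internal reasoning"),
    SimpleNamespace(type="text", text="No blocking issues found."),
]
print(extract_text(sample))
```

In the script above, this would replace the `response.content[-1].text` lookup with `extract_text(response.content)`.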
Test Generation Agent
# .github/scripts/generate-tests.py
"""AI test generation agent: creates tests for changed files."""
import os
from pathlib import Path

from anthropic import Anthropic


def main():
    client = Anthropic()
    changed_files = os.environ.get("CHANGED_FILES", "").strip().split('\n')
    changed_files = [f for f in changed_files if f.strip()]
    if not changed_files:
        print("No source files changed.")
        return
    for filepath in changed_files:
        path = Path(filepath)
        if not path.exists():
            continue
        # foo.ts -> foo.test.ts, foo.tsx -> foo.test.tsx
        test_path = path.with_suffix('.test' + path.suffix)
        if test_path.exists():
            print(f"Test already exists: {test_path}")
            continue
        source_code = path.read_text()
        response = client.messages.create(
            model="claude-opus-4-6-20260205",
            max_tokens=4096,
            system="""You are a senior test engineer. Generate comprehensive
unit tests for the provided source code.
Rules:
- Use Vitest syntax (describe, it, expect)
- Test happy paths, error cases, and edge cases
- Mock external dependencies
- Do NOT test implementation details — test behavior
- Output ONLY the test file content, no explanations""",
            messages=[{
                "role": "user",
                "content": f"Generate tests for:\n\nFile: {filepath}\n\n"
                           f"```\n{source_code}\n```"
            }]
        )
        # Strip surrounding whitespace so a trailing newline does not hide the closing fence
        test_code = response.content[-1].text.strip()
        # Strip markdown code fences if present
        if test_code.startswith('```'):
            lines = test_code.split('\n')
            # Drop the closing fence only if it is actually there
            end = -1 if lines[-1].strip() == '```' else None
            test_code = '\n'.join(lines[1:end])
        test_path.write_text(test_code + '\n')
        print(f"Generated: {test_path}")


if __name__ == "__main__":
    main()
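The fence-stripping step is the part of this script most likely to misbehave, since model output may or may not be fenced, may carry a language tag, and may end with a trailing newline. A minimal standalone sketch of that step, with a hypothetical helper name `strip_code_fences`, makes the cases easy to check in isolation:

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def strip_code_fences(text: str) -> str:
    """Return the body of a single fenced block, or the stripped text if unfenced."""
    stripped = text.strip()
    if not stripped.startswith(FENCE):
        return stripped
    # Drop the opening fence line, which may carry a language tag such as "ts"
    lines = stripped.split("\n")[1:]
    # Drop the closing fence only if it is actually present
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines)


fenced = FENCE + "ts\nconst a = 1;\n" + FENCE + "\n"
print(strip_code_fences(fenced))
```

Handling the missing closing fence matters when a response is cut off by `max_tokens`: the last line of real code is kept rather than silently dropped.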
Cost Control
An uncontrolled agent pipeline can burn through API credits quickly. Implement guardrails:
# .github/scripts/cost-guard.py
"""Estimate and cap pipeline costs before running agents."""
import os
import subprocess
import sys

MAX_COST_PER_PR = 5.00  # USD
INPUT_COST_PER_1M = 5.0  # Opus 4.6
OUTPUT_COST_PER_1M = 25.0  # Opus 4.6


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1_000_000 * INPUT_COST_PER_1M +
            output_tokens / 1_000_000 * OUTPUT_COST_PER_1M)


def check_diff_size():
    """Skip AI review for trivially small or excessively large diffs."""
    base = os.environ.get("GITHUB_BASE_REF", "main")
    # --numstat prints "added<TAB>deleted<TAB>path"; binary files show "-" for both counts
    result = subprocess.run(
        ["git", "diff", "--numstat", f"origin/{base}...HEAD"],
        capture_output=True, text=True
    )
    lines_changed = 0
    for row in result.stdout.splitlines():
        added, deleted, *_ = row.split('\t')
        if added.isdigit() and deleted.isdigit():
            lines_changed += int(added) + int(deleted)
    if lines_changed < 5:
        print("Diff too small for AI review. Skipping.")
        sys.exit(0)
    if lines_changed > 5000:
        print(f"Diff too large ({lines_changed} lines) for automated review. Skipping.")
        sys.exit(0)
    estimated = estimate_cost(
        input_tokens=lines_changed * 10,  # rough estimate
        output_tokens=2000
    )
    print(f"Estimated cost: ${estimated:.2f}")
    if estimated > MAX_COST_PER_PR:
        print(f"Exceeds cap of ${MAX_COST_PER_PR}. Skipping.")
        sys.exit(0)


if __name__ == "__main__":
    check_diff_size()
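One caveat: a guard step that exits with status 0 only ends that step, and later steps in the job still run. A common way to make the decision stick is to write a step output and gate later steps on it with an `if:` condition. The sketch below is a hypothetical addition, not part of the script above; it appends `skip=true` or `skip=false` to the file named by the `GITHUB_OUTPUT` environment variable. The output name `skip`, the helper names, and the thresholds are illustrative assumptions.

```python
import os
import tempfile


def should_skip(lines_changed: int, min_lines: int = 5, max_lines: int = 5000) -> bool:
    """True when the diff is too small or too large to be worth an AI review."""
    return lines_changed < min_lines or lines_changed > max_lines


def write_step_output(name: str, value: str) -> None:
    """Append name=value to the GITHUB_OUTPUT file; do nothing outside GitHub Actions."""
    output_path = os.environ.get("GITHUB_OUTPUT")
    if not output_path:
        return
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


def record_decision(lines_changed: int) -> bool:
    """Compute the skip decision and publish it as the step output 'skip'."""
    skip = should_skip(lines_changed)
    write_step_output("skip", "true" if skip else "false")
    return skip
```

A later step could then carry `if: steps.guard.outputs.skip != 'true'`, assuming the guard step is given `id: guard`.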
Cost Per PR — Realistic Estimates
| PR Size | Input Tokens | Output Tokens | Estimated Cost |
|---|---|---|---|
| Small (< 100 lines) | ~20,000 | ~2,000 | ~$0.15 |
| Medium (100-500 lines) | ~80,000 | ~5,000 | ~$0.53 |
| Large (500-2000 lines) | ~250,000 | ~10,000 | ~$1.50 |
| Very Large (2000+ lines) | ~500,000 | ~15,000 | ~$2.88 |
With 50 PRs per week at medium size, the monthly cost is approximately $106 (50 PRs × 4 weeks × $0.53). Compare this to the cost of a single missed bug reaching production.
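The table and the monthly figure follow directly from the per-token rates. A small sketch that reproduces the arithmetic, using the token counts from the table and assuming a four-week month (the function names are illustrative):

```python
INPUT_COST_PER_1M = 5.0    # USD per 1M input tokens, the Opus 4.6 rate used in this section
OUTPUT_COST_PER_1M = 25.0  # USD per 1M output tokens, the Opus 4.6 rate used in this section

# (input_tokens, output_tokens) per PR size, taken from the table above
PR_SIZES = {
    "small": (20_000, 2_000),
    "medium": (80_000, 5_000),
    "large": (250_000, 10_000),
    "very_large": (500_000, 15_000),
}


def pr_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one pipeline run with the given token counts."""
    return (input_tokens / 1_000_000 * INPUT_COST_PER_1M
            + output_tokens / 1_000_000 * OUTPUT_COST_PER_1M)


def monthly_cost(prs_per_week: int, cost_per_pr: float, weeks_per_month: int = 4) -> float:
    """Monthly spend for a steady weekly PR volume."""
    return prs_per_week * weeks_per_month * cost_per_pr


for size, (tokens_in, tokens_out) in PR_SIZES.items():
    print(f"{size}: ${pr_cost(tokens_in, tokens_out):.3f}")
print(f"monthly (50 medium PRs/week): ${monthly_cost(50, pr_cost(*PR_SIZES['medium'])):.2f}")
```

The unrounded medium cost is $0.525 per PR, which gives $105 per month; the $106 figure comes from multiplying the rounded $0.53.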
Model Selection for Pipeline Jobs
Not every pipeline job needs Opus 4.6. Use the right model for each task:
graph TD
A[Pipeline Job] --> B{Complexity?}
B -->|High: Security, Architecture| C[Opus 4.6]
B -->|Medium: Code Review, Tests| D[Sonnet 4.5]
B -->|Low: Formatting, Linting| E[Haiku 4.5]
C --> F["$5/$25 per 1M tokens"]
D --> G["$3/$15 per 1M tokens"]
E --> H["$1/$5 per 1M tokens"]
# In your workflow, select model per job
env:
  REVIEW_MODEL: claude-sonnet-4-5-20250929   # Good enough for reviews
  SECURITY_MODEL: claude-opus-4-6-20260205   # Need the best for security
  DOCS_MODEL: claude-haiku-4-5-20251001      # Fast and cheap for docs
  TEST_MODEL: claude-sonnet-4-5-20250929     # Sonnet handles test gen well
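For these variables to take effect, each agent script has to read its model from the environment instead of hard-coding one. A minimal sketch, assuming the variable names from the snippet above and a hypothetical helper `model_for`; falling back to the workflow-level `MODEL` variable and then to the Opus ID used earlier in this section is an illustrative choice:

```python
import os
from typing import Mapping, Optional

DEFAULT_MODEL = "claude-opus-4-6-20260205"  # model ID used elsewhere in this section

# Which environment variable configures which job (names from the workflow snippet)
JOB_ENV_VARS = {
    "review": "REVIEW_MODEL",
    "security": "SECURITY_MODEL",
    "docs": "DOCS_MODEL",
    "tests": "TEST_MODEL",
}


def model_for(job: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the model for a job: job-specific variable, then MODEL, then the default."""
    env = os.environ if env is None else env
    var = JOB_ENV_VARS.get(job)
    if var and env.get(var):
        return env[var]
    return env.get("MODEL") or DEFAULT_MODEL
```

A script would then pass `model=model_for("review")` to `client.messages.create(...)`, so changing a job's model becomes a one-line workflow edit.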
Preventing Agent Hallucinations in CI
The biggest risk of AI in CI/CD is an agent hallucinating a fix and committing broken code. Safeguards:
1. Never Auto-Merge
# The pipeline creates commits but never merges
# A human must approve the final merge
- name: Commit generated tests
  run: |
    git add '*.test.ts'
    git diff --cached --quiet || git commit -m "test: AI-generated"
    git push
# NO auto-merge. Human reviews the AI's commits.
2. Verify Before Committing
# Always run tests on AI-generated code before committing
- name: Generate tests
  run: python .github/scripts/generate-tests.py
- name: Verify generated tests compile
  run: npx tsc --noEmit
- name: Verify generated tests pass
  run: npm run test -- --passWithNoTests
# Only then commit
- name: Commit if tests pass
  run: |
    git add '*.test.ts'
    git diff --cached --quiet || git commit -m "test: AI-generated (verified)"
    git push
3. Rate Limit Agent Comments
# Prevent comment spam on PRs with many pushes
import os, json, subprocess
def count_existing_bot_comments() -> int:
pr_number = os.environ["PR_NUMBER"]
repo = os.environ["GITHUB_REPOSITORY"]
token = os.environ["GITHUB_TOKEN"]
result = subprocess.run([
"curl", "-s",
f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
"-H", f"Authorization: token {token}",
], capture_output=True, text=True)
comments = json.loads(result.stdout)
return sum(1 for c in comments
if c.get("user", {}).get("login") == "opus-agent[bot]")
MAX_COMMENTS = 3
if count_existing_bot_comments() >= MAX_COMMENTS:
print("Comment limit reached. Skipping to avoid spam.")
exit(0)
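The counting logic is easier to test when it is separated from the HTTP call. The sketch below assumes each comment is a dict shaped like the GitHub REST API's issue-comment objects (with a `body` field). Matching on the heading text the review agent writes into its comments, rather than on the author login, is an illustrative choice; the function names and sample data are made up.

```python
from typing import Iterable, Mapping

# Heading text the code review agent puts at the top of its comments
MARKER = "AI Code Review (Opus 4.6)"


def count_agent_comments(comments: Iterable[Mapping], marker: str = MARKER) -> int:
    """Count comments whose body contains the agent's heading marker."""
    return sum(1 for c in comments if marker in (c.get("body") or ""))


def should_post(comments: Iterable[Mapping], max_comments: int = 3) -> bool:
    """True while the agent is still under its per-PR comment limit."""
    return count_agent_comments(comments) < max_comments


# Made-up sample payload: two agent comments and one human comment
sample_comments = [
    {"user": {"login": "github-actions[bot]"}, "body": "## 🤖 AI Code Review (Opus 4.6)\nLooks fine."},
    {"user": {"login": "alice"}, "body": "Thanks, addressed in the latest push."},
    {"user": {"login": "github-actions[bot]"}, "body": "## 🤖 AI Code Review (Opus 4.6)\nOne issue."},
]
print(should_post(sample_comments))
```

Filtering on the marker keeps comments from other workflows that also post with the default token from counting against the limit.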
Production Checklist
Before deploying the pipeline to your team:
- Store ANTHROPIC_API_KEY as a GitHub Actions secret
- Set a monthly spending cap on your Anthropic account
- Configure cost guard to skip large diffs
- Verify the pipeline on a test repository first
- Set branch protection rules requiring human approval
- Add the bot user to your .gitignore and CODEOWNERS
- Monitor token usage weekly for the first month
- Create a runbook for when the pipeline fails
This completes Module 6. You now have the tools and patterns to build autonomous coding workflows — from setup through refactoring, security auditing, and full CI/CD integration — all powered by Opus 4.6.