Lesson 39 of 46 ~25 min

Building a CI/CD Pipeline Powered by Opus 4.6

Automate code review, test generation, documentation updates, and security auditing in your CI/CD pipeline — with a complete GitHub Actions integration example.

The most powerful application of Opus 4.6 is not a developer typing prompts in a terminal. It is an automated pipeline that reviews every pull request, generates missing tests, updates documentation, and scans for vulnerabilities — without any human initiating the process.

Pipeline Architecture

graph TD
    A[Developer Opens PR] --> B[GitHub Actions Triggered]
    B --> C[Checkout Code + Compute Diff]
    C --> D[Parallel Agent Jobs]
    
    D --> E[Code Review Agent]
    D --> F[Test Generation Agent]
    D --> G[Security Audit Agent]
    D --> H[Documentation Agent]
    
    E --> I[PR Comment: Review Findings]
    F --> J[Commit: Generated Tests]
    G --> K[PR Comment: Security Report]
    H --> L[Commit: Updated Docs]
    
    I --> M{All Checks Pass?}
    J --> M
    K --> M
    L --> M
    
    M -->|Yes| N[Ready for Human Review]
    M -->|No| O[Block Merge + Notify]

Each agent runs as a separate GitHub Actions job, in parallel. The pipeline completes in 2–5 minutes for a typical PR.

GitHub Actions Workflow

# .github/workflows/opus-agent-pipeline.yml
name: Opus 4.6 Agent Pipeline

on:
  pull_request:
    branches: [main, develop]
    types: [opened, synchronize]

permissions:
  contents: write
  pull-requests: write
  issues: write

env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  MODEL: claude-opus-4-6-20260205

jobs:
  code-review:
    name: AI Code Review
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR diff
        id: diff
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > pr_diff.patch
          echo "diff_size=$(wc -c < pr_diff.patch)" >> $GITHUB_OUTPUT

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install anthropic

      - name: Run AI code review
        run: python .github/scripts/code-review.py
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  test-generation:
    name: AI Test Generation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.head_ref }}
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm ci

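      - name: Get changed source files
        id: changed
        run: |
          # grep exits 1 when nothing matches; "|| true" keeps the step from failing
          FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD \
            | grep -E '\.(ts|tsx)$' \
            | grep -v '\.test\.' \
            | grep -v '\.spec\.' \
            | head -20 || true)
          # Multi-line outputs need the delimiter form, not name=value
          {
            echo "files<<EOF"
            echo "$FILES"
            echo "EOF"
          } >> "$GITHUB_OUTPUT"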

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install Anthropic SDK
        run: pip install anthropic

      - name: Generate missing tests
        run: python .github/scripts/generate-tests.py
        env:
          CHANGED_FILES: ${{ steps.changed.outputs.files }}

      - name: Run generated tests
        run: npm run test -- --passWithNoTests

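      - name: Commit generated tests
        run: |
          git config user.name "opus-agent[bot]"
          git config user.email "opus-agent@users.noreply.github.com"
          # Stage new or modified test files; .tsx sources produce .test.tsx files.
          # ls-files avoids the "pathspec did not match" error git add raises on an unmatched pattern.
          git ls-files --others --modified --exclude-standard -z -- \
            '*.test.ts' '*.test.tsx' '*.spec.ts' '*.spec.tsx' | xargs -0 -r git add --
          git diff --cached --quiet || git commit -m "test: add AI-generated tests for changed files"
          # The docs job pushes to the same branch in parallel, so rebase before pushing
          git pull --rebase --autostash
          git push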

  security-audit:
    name: AI Security Audit
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install anthropic

      - name: Run security audit
        run: python .github/scripts/security-audit.py
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  docs-update:
    name: AI Documentation Update
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.head_ref }}
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install anthropic

      - name: Update documentation
        run: |
          # env: values are not shell-expanded, so compute the file list inside the step
          export CHANGED_FILES="$(git diff --name-only origin/${{ github.base_ref }}...HEAD)"
          python .github/scripts/update-docs.py

      - name: Commit documentation updates
        run: |
          git config user.name "opus-agent[bot]"
          git config user.email "opus-agent@users.noreply.github.com"
          git add 'docs/**' 'README.md' '*.md'
          git diff --cached --quiet || git commit -m "docs: auto-update documentation for changed code"
          # The test-generation job pushes to the same branch in parallel, so rebase before pushing
          git pull --rebase --autostash
          git push
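
The `files` output in the test-generation job holds one path per line. GitHub Actions accepts multi-line output values only in the delimiter form (`name<<DELIMITER`, the value, then the delimiter again), so a Python script that sets such an output itself could use a helper along these lines. This is a minimal sketch: `write_output` is a hypothetical name, and the random delimiter is one way to avoid colliding with the value.

```python
import uuid


def write_output(name: str, value: str, output_path: str) -> None:
    """Append one (possibly multi-line) output to the file behind $GITHUB_OUTPUT."""
    # A random delimiter cannot accidentally appear inside the value.
    delimiter = f"EOF_{uuid.uuid4().hex}"
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}<<{delimiter}\n{value}\n{delimiter}\n")
```

Inside a workflow step the call would look like `write_output("files", "\n".join(paths), os.environ["GITHUB_OUTPUT"])`.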

Agent Scripts

Code Review Agent

# .github/scripts/code-review.py
"""AI code review agent — posts findings as PR comments."""
import json
import os
import subprocess
from anthropic import Anthropic

def get_pr_diff() -> str:
    result = subprocess.run(
        ["git", "diff", f"origin/{os.environ.get('GITHUB_BASE_REF', 'main')}...HEAD"],
        capture_output=True, text=True
    )
    return result.stdout

def get_file_content(filepath: str) -> str:
    try:
        with open(filepath) as f:
            return f.read()
    except FileNotFoundError:
        return ""

def post_pr_comment(body: str):
    """Post a comment on the PR via GitHub API."""
    pr_number = os.environ["PR_NUMBER"]
    repo = os.environ["GITHUB_REPOSITORY"]
    token = os.environ["GITHUB_TOKEN"]
    
    # --fail makes curl exit non-zero on HTTP errors; check=True surfaces that here
    subprocess.run([
        "curl", "-sS", "--fail", "-X", "POST",
        f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
        "-H", f"Authorization: token {token}",
        "-H", "Content-Type: application/json",
        "-d", json.dumps({"body": body})
    ], check=True)

def main():
    client = Anthropic()
    diff = get_pr_diff()
    
    if len(diff.strip()) == 0:
        print("No changes to review.")
        return
    
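    response = client.messages.create(
        # Honor the workflow-level MODEL variable, falling back to the pinned ID
        model=os.environ.get("MODEL", "claude-opus-4-6-20260205"),
        # max_tokens has to leave room for both thinking and the final answer
        max_tokens=16000,
        # Assumption: adaptive thinking takes no budget_tokens; a fixed budget
        # belongs to {"type": "enabled", ...} and must stay below max_tokens
        thinking={"type": "adaptive"},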
        system="""You are a principal engineer performing a code review.

        Rules:
        - Only flag issues that matter: bugs, security problems, performance issues
        - Do NOT comment on formatting, naming conventions, or style
        - For each issue, provide the file, line, severity, and a concrete fix
        - If the code is good, say so briefly — do not invent issues
        - Use markdown formatting for the PR comment""",
        messages=[{
            "role": "user",
            "content": f"Review this pull request diff:\n\n```diff\n{diff}\n```"
        }]
    )
    
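    # With thinking on, the response mixes thinking and text blocks; keep only the text
    review = "".join(block.text for block in response.content if block.type == "text")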
    
    comment = f"""## 🤖 AI Code Review (Opus 4.6)

{review}

---
<sub>Automated review by Claude Opus 4.6 · 
[Pipeline docs](../docs/ai-pipeline.md)</sub>"""
    
    post_pr_comment(comment)
    print("Review posted to PR.")

if __name__ == "__main__":
    main()
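
The review script sends the whole diff in one request, so a very large PR can exceed what is sensible to send. One possible mitigation, sketched below under the assumption that the diff is standard `git diff` output, is to split it at the `diff --git` headers and keep whole per-file chunks up to a character budget. `split_diff_by_file` and `fit_to_budget` are hypothetical helpers, not part of the scripts above, and the budget is a value you would tune.

```python
def split_diff_by_file(diff: str) -> list[str]:
    """Split unified git diff text into one chunk per file."""
    chunks: list[str] = []
    current: list[str] = []
    for line in diff.splitlines(keepends=True):
        # Each file section of a git diff starts with a "diff --git" header.
        if line.startswith("diff --git ") and current:
            chunks.append("".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("".join(current))
    return chunks


def fit_to_budget(diff: str, max_chars: int) -> tuple[str, int]:
    """Keep whole per-file chunks, in order, until max_chars would be exceeded.

    Returns the trimmed diff and the number of file chunks left out.
    """
    chunks = split_diff_by_file(diff)
    kept: list[str] = []
    used = 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break
        kept.append(chunk)
        used += len(chunk)
    return "".join(kept), len(chunks) - len(kept)
```

In `main()`, the number of skipped files could be mentioned in the PR comment so reviewers know the AI review was partial.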

Test Generation Agent

# .github/scripts/generate-tests.py
"""AI test generation agent — creates tests for changed files."""
import os
from pathlib import Path
from anthropic import Anthropic

def main():
    client = Anthropic()
    changed_files = os.environ.get("CHANGED_FILES", "").strip().split('\n')
    changed_files = [f for f in changed_files if f.strip()]
    
    if not changed_files:
        print("No source files changed.")
        return
    
    for filepath in changed_files:
        path = Path(filepath)
        if not path.exists():
            continue
        
        test_path = path.with_suffix('.test' + path.suffix)
        if test_path.exists():
            print(f"Test already exists: {test_path}")
            continue
        
        source_code = path.read_text()
        
        response = client.messages.create(
            model="claude-opus-4-6-20260205",
            max_tokens=4096,
            system="""You are a senior test engineer. Generate comprehensive 
            unit tests for the provided source code.
            
            Rules:
            - Use Vitest syntax (describe, it, expect)
            - Test happy paths, error cases, and edge cases
            - Mock external dependencies
            - Do NOT test implementation details — test behavior
            - Output ONLY the test file content, no explanations""",
            messages=[{
                "role": "user",
                "content": f"Generate tests for:\n\nFile: {filepath}\n\n"
                           f"```\n{source_code}\n```"
            }]
        )
        
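        test_code = response.content[-1].text.strip()
        # Strip a wrapping markdown code fence if the model added one
        if test_code.startswith('```'):
            lines = test_code.split('\n')
            # Drop the last line only when it really is the closing fence
            end = -1 if lines[-1].strip().startswith('```') else len(lines)
            test_code = '\n'.join(lines[1:end]) + '\n'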
        
        test_path.write_text(test_code)
        print(f"Generated: {test_path}")

if __name__ == "__main__":
    main()
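
Two rules decide which files get a generated test: the workflow keeps only `.ts`/`.tsx` sources that are not already tests, and the script derives the test path by inserting `.test` before the extension. The sketch below restates both rules as pure functions so they can be checked without a repository. The function names are hypothetical, and unlike the `grep` filter it inspects only the file name, not the directory part of the path.

```python
from pathlib import Path

SOURCE_SUFFIXES = {".ts", ".tsx"}


def is_test_file(path: Path) -> bool:
    """True for names such as foo.test.ts or foo.spec.tsx."""
    # path.suffixes is e.g. ['.test', '.ts']; ignore the final extension.
    return any(s in (".test", ".spec") for s in path.suffixes[:-1])


def derive_test_path(path: Path) -> Path:
    """Map src/foo.ts to src/foo.test.ts, keeping directory and extension."""
    return path.with_suffix(".test" + path.suffix)


def needs_generated_test(path: Path) -> bool:
    """A TypeScript source file that is not itself a test."""
    return path.suffix in SOURCE_SUFFIXES and not is_test_file(path)
```

Note that a `.tsx` source yields a `.test.tsx` file, so the commit step has to stage that pattern as well.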

Cost Control

An uncontrolled agent pipeline can burn through API credits quickly. Implement guardrails:

# .github/scripts/cost-guard.py
"""Estimate and cap pipeline costs before running agents."""
import os
import subprocess
import sys

MAX_COST_PER_PR = 5.00  # USD
INPUT_COST_PER_1M = 5.0   # Opus 4.6
OUTPUT_COST_PER_1M = 25.0  # Opus 4.6

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1_000_000 * INPUT_COST_PER_1M +
            output_tokens / 1_000_000 * OUTPUT_COST_PER_1M)

def count_changed_lines(base: str) -> int:
    """Sum added and deleted lines; --numstat prints added<TAB>deleted<TAB>path."""
    result = subprocess.run(
        ["git", "diff", "--numstat", f"origin/{base}...HEAD"],
        capture_output=True, text=True, check=True
    )
    total = 0
    for line in result.stdout.splitlines():
        added, deleted, _path = line.split('\t', 2)
        if added.isdigit() and deleted.isdigit():  # binary files show "-"
            total += int(added) + int(deleted)
    return total

def check_diff_size():
    """Skip AI review for trivially small or excessively large diffs.

    Exiting with 0 only ends this script; later steps must be gated separately.
    """
    lines_changed = count_changed_lines(os.environ.get("GITHUB_BASE_REF", "main"))
    
    if lines_changed < 5:
        print("Diff too small for AI review. Skipping.")
        sys.exit(0)
    
    if lines_changed > 5000:
        print(f"Diff too large ({lines_changed} lines). Skipping.")
        sys.exit(0)
    
    estimated = estimate_cost(
        input_tokens=lines_changed * 10,  # rough estimate
        output_tokens=2000
    )
    print(f"Estimated cost: ${estimated:.2f}")
    if estimated > MAX_COST_PER_PR:
        print(f"Exceeds cap of ${MAX_COST_PER_PR:.2f}. Skipping.")
        sys.exit(0)

if __name__ == "__main__":
    check_diff_size()
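
The guard above works from an estimate made before any request is sent. Each Messages API response also reports the tokens actually consumed in `response.usage.input_tokens` and `response.usage.output_tokens`, so a script that makes several calls could track real spend against the cap. The sketch below is one way to do that. `BudgetTracker` and `Usage` are hypothetical names, `Usage` is only a stand-in so the example runs without an API call, and the rates are copied from the cost guard.

```python
from dataclasses import dataclass

INPUT_COST_PER_1M = 5.0    # USD, same rates as the cost guard
OUTPUT_COST_PER_1M = 25.0


@dataclass
class Usage:
    """Stand-in for response.usage so the sketch runs offline."""
    input_tokens: int
    output_tokens: int


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * INPUT_COST_PER_1M
            + output_tokens * OUTPUT_COST_PER_1M) / 1_000_000


class BudgetTracker:
    """Accumulates actual spend and reports whether another call fits the cap."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def record(self, usage) -> float:
        """Add one response's usage and return the cost of that call."""
        cost = cost_usd(usage.input_tokens, usage.output_tokens)
        self.spent_usd += cost
        return cost

    def can_afford(self, est_input_tokens: int, est_output_tokens: int) -> bool:
        """True if spend so far plus the estimated next call stays within the cap."""
        projected = self.spent_usd + cost_usd(est_input_tokens, est_output_tokens)
        return projected <= self.cap_usd
```

A tracker like this lives inside a single process, so it caps the calls made by one script. It does not coordinate spend across separate workflow jobs.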

Cost Per PR — Realistic Estimates

PR Size                    Input Tokens   Output Tokens   Estimated Cost
Small (< 100 lines)        ~20,000        ~2,000          ~$0.15
Medium (100-500 lines)     ~80,000        ~5,000          ~$0.53
Large (500-2000 lines)     ~250,000       ~10,000         ~$1.50
Very Large (2000+ lines)   ~500,000       ~15,000         ~$2.88

With 50 medium-sized PRs per week, the monthly cost is approximately $106 (50 × $0.53 × 4 weeks). Compare this to the cost of a single missed bug reaching production.

Model Selection for Pipeline Jobs

Not every pipeline job needs Opus 4.6. Use the right model for each task:

graph TD
    A[Pipeline Job] --> B{Complexity?}
    B -->|High: Security, Architecture| C[Opus 4.6]
    B -->|Medium: Code Review, Tests| D[Sonnet 4.5]
    B -->|Low: Formatting, Linting| E[Haiku 4.5]
    
    C --> F["$5/$25 per 1M tokens"]
    D --> G["$3/$15 per 1M tokens"]
    E --> H["$1/$5 per 1M tokens"]

# In your workflow, select model per job
env:
  REVIEW_MODEL: claude-sonnet-4-5-20250929     # Good enough for reviews
  SECURITY_MODEL: claude-opus-4-6-20260205     # Need the best for security
  DOCS_MODEL: claude-haiku-4-5-20251001        # Fast and cheap for docs
  TEST_MODEL: claude-sonnet-4-5-20250929       # Sonnet handles test gen well
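
The agent scripts shown earlier hardcode their model ID, so the per-job variables only take effect if each script reads them. A minimal sketch of that lookup follows. `model_for` and the job keys are hypothetical, and the fallback is the Opus ID used elsewhere in this lesson.

```python
import os
from typing import Mapping

DEFAULT_MODEL = "claude-opus-4-6-20260205"  # ID used elsewhere in this lesson

# Hypothetical job keys mapped to the env variable names from the workflow
JOB_ENV_VARS = {
    "review": "REVIEW_MODEL",
    "security": "SECURITY_MODEL",
    "docs": "DOCS_MODEL",
    "tests": "TEST_MODEL",
}


def model_for(job: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the model configured for a job, or the default if unset or empty."""
    if job not in JOB_ENV_VARS:
        raise ValueError(f"unknown job: {job!r}")
    return env.get(JOB_ENV_VARS[job]) or DEFAULT_MODEL
```

A script would then pass `model=model_for("review")` to `client.messages.create` instead of a literal string.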

Preventing Agent Hallucinations in CI

The biggest risk of AI in CI/CD is an agent hallucinating a fix and committing broken code. Safeguards:

1. Never Auto-Merge

# The pipeline creates commits but never merges
# A human must approve the final merge
- name: Commit generated tests
  run: |
    git add '*.test.ts'
    git diff --cached --quiet || git commit -m "test: AI-generated"
    git push
    # NO auto-merge. Human reviews the AI's commits.

2. Verify Before Committing

# Always run tests on AI-generated code before committing
- name: Generate tests
  run: python .github/scripts/generate-tests.py

- name: Verify generated tests compile
  run: npx tsc --noEmit

- name: Verify generated tests pass
  run: npm run test -- --passWithNoTests

# Only then commit
- name: Commit if tests pass
  run: |
    git add '*.test.ts'
    git diff --cached --quiet || git commit -m "test: AI-generated (verified)"
    git push

3. Rate Limit Agent Comments

# Prevent comment spam on PRs with many pushes
import json
import os
import subprocess
import sys

# Comments posted with GITHUB_TOKEN are authored by github-actions[bot];
# the git user.name set in the workflow only applies to commits.
BOT_LOGIN = "github-actions[bot]"
REVIEW_HEADER = "AI Code Review"  # heading used by code-review.py

def count_existing_bot_comments() -> int:
    pr_number = os.environ["PR_NUMBER"]
    repo = os.environ["GITHUB_REPOSITORY"]
    token = os.environ["GITHUB_TOKEN"]
    
    result = subprocess.run([
        "curl", "-s",
        f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments?per_page=100",
        "-H", f"Authorization: token {token}",
    ], capture_output=True, text=True)
    
    comments = json.loads(result.stdout)
    return sum(1 for c in comments
               if (c.get("user") or {}).get("login") == BOT_LOGIN
               and REVIEW_HEADER in (c.get("body") or ""))

MAX_COMMENTS = 3
if count_existing_bot_comments() >= MAX_COMMENTS:
    print("Comment limit reached. Skipping to avoid spam.")
    sys.exit(0)
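
An alternative to capping the number of comments is to keep a single comment per PR and edit it on every push. The sketch below assumes the review comment carries a hidden HTML marker (invisible in rendered Markdown) and decides between creating a comment and updating the existing one through GitHub's issue-comment endpoints. `MARKER`, `find_marked_comment`, and `plan_comment_request` are hypothetical names. The functions only build the request, so the caller still has to send it.

```python
from typing import Optional

MARKER = "<!-- opus-agent:code-review -->"  # hypothetical hidden marker


def find_marked_comment(comments: list, marker: str = MARKER) -> Optional[int]:
    """Return the id of the last listed comment whose body contains the marker."""
    for comment in reversed(comments):
        if marker in (comment.get("body") or ""):
            return comment.get("id")
    return None


def plan_comment_request(repo: str, pr_number: int, comments: list, body: str):
    """Return (HTTP method, URL, JSON payload) for creating or updating the comment."""
    payload = {"body": f"{MARKER}\n{body}"}
    existing = find_marked_comment(comments)
    if existing is None:
        url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
        return "POST", url, payload
    # Editing uses the comment id, not the PR number.
    url = f"https://api.github.com/repos/{repo}/issues/comments/{existing}"
    return "PATCH", url, payload
```

Here `comments` is the parsed JSON list returned by the same comments endpoint the rate limiter queries.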

Production Checklist

Before deploying the pipeline to your team:

  • Store ANTHROPIC_API_KEY as a GitHub Actions secret
  • Set a monthly spending cap on your Anthropic account
  • Configure cost guard to skip large diffs
  • Verify the pipeline on a test repository first
  • Set branch protection rules requiring human approval
  • Confirm CODEOWNERS still routes bot-committed files (generated tests, docs) to a human reviewer
  • Monitor token usage weekly for the first month
  • Create a runbook for when the pipeline fails

This completes Module 6. You now have the tools and patterns to build autonomous coding workflows — from setup through refactoring, security auditing, and full CI/CD integration — all powered by Opus 4.6.