# Token Usage & Cost Analysis

**Version:** 1.0.0
**Date:** 2025-10-31
**Purpose:** Understand the true cost of concurrent agents vs. single-agent reviews

---

## Quick Cost Comparison

| Metric | Single Agent | Concurrent Agents | Multiplier |
|--------|--------------|-------------------|------------|
| **Tokens per review** | ~35,000 | ~68,000 | 1.9x |
| **Monthly reviews (5M tokens)** | 142 | 73 | 0.5x |
| **Cost multiplier** | 1x | 2x | - |
| **Time to execute** | 39-62 min | 31-42 min | 0.6-0.8x |
| **Perspectives** | 1 | 4 | 4x |

**Bottom Line**: You pay 2x tokens to get 4x perspectives and 20-30% time savings.

---

## Detailed Token Breakdown

### Single Agent Review (Baseline)

```
STAGE 1: GIT PREPARATION (Main Thread)
├─ Git status check: ~500 tokens
├─ Git diff analysis: ~2,500 tokens
├─ File listing: ~500 tokens
└─ Subtotal: ~3,500 tokens

STAGES 2-5: COMPREHENSIVE ANALYSIS (Single Agent)
├─ Code review analysis: ~8,000 tokens
├─ Architecture analysis: ~10,000 tokens
├─ Security analysis: ~8,000 tokens
├─ Multi-perspective analysis: ~6,000 tokens
└─ Subtotal: ~32,000 tokens

STAGE 6: SYNTHESIS (Main Thread)
├─ Results consolidation: ~3,000 tokens
├─ Action plan creation: ~2,000 tokens
└─ Subtotal: ~5,000 tokens

STAGES 7-9: INTERACTIVE RESOLUTION (Main Thread)
├─ User interaction: Variable (assume 2,000 tokens)
├─ Pre-push verification: ~1,500 tokens
├─ Commit message generation: ~500 tokens
└─ Subtotal: ~4,000 tokens

TOTAL SINGLE AGENT: ~44,500 tokens (~35,000-45,000 typical)
```

### Concurrent Agents Review

```
STAGE 1: GIT PREPARATION (Main Thread)
├─ Git status check: ~500 tokens
├─ Git diff analysis: ~2,500 tokens
├─ File listing: ~500 tokens
└─ Subtotal: ~3,500 tokens

STAGE 2: CODE REVIEW AGENT (Independent Context)
├─ Agent initialization: ~2,000 tokens
│  (re-establishing context, no shared history)
├─ Git diff input: ~2,000 tokens
│  (agent needs own copy of diff)
├─ Code quality analysis: ~10,000 tokens
│  (duplication, errors, secrets, style)
├─ Results generation: ~1,500 tokens
└─ Subtotal: ~15,500 tokens

STAGE 3: ARCHITECTURE AUDIT AGENT (Independent Context)
├─ Agent initialization: ~2,000 tokens
├─ File structure input: ~2,500 tokens
│  (agent needs file paths and structure)
├─ Architecture analysis: ~12,000 tokens
│  (6-dimensional analysis)
├─ Results generation: ~1,500 tokens
└─ Subtotal: ~18,000 tokens

STAGE 4: SECURITY & COMPLIANCE AGENT (Independent Context)
├─ Agent initialization: ~2,000 tokens
├─ Code input for security review: ~2,000 tokens
├─ Security analysis: ~11,000 tokens
│  (OWASP, dependencies, secrets)
├─ Results generation: ~1,000 tokens
└─ Subtotal: ~16,000 tokens

STAGE 5: MULTI-PERSPECTIVE AGENT (Independent Context)
├─ Agent initialization: ~2,000 tokens
├─ Feature description: ~1,500 tokens
│  (agent needs less context, just requirements)
├─ Multi-perspective analysis: ~9,000 tokens
│  (6 stakeholder perspectives)
├─ Results generation: ~1,000 tokens
└─ Subtotal: ~13,500 tokens

STAGE 6: SYNTHESIS (Main Thread)
├─ Results consolidation: ~4,000 tokens
│  (4 sets of results to aggregate)
├─ Action plan creation: ~2,500 tokens
└─ Subtotal: ~6,500 tokens

STAGES 7-9: INTERACTIVE RESOLUTION (Main Thread)
├─ User interaction: Variable (assume 2,000 tokens)
├─ Pre-push verification: ~1,500 tokens
├─ Commit message generation: ~500 tokens
└─ Subtotal: ~4,000 tokens

TOTAL CONCURRENT AGENTS: ~77,000 tokens (~68,000-78,000 typical)
```

### Why Concurrent Costs More

```
Cost Difference Breakdown:

Extra overhead from concurrent approach:
├─ Agent initialization (4x): 8,000 tokens
│  (each agent re-establishes context)
├─ Input duplication (4x): 8,000 tokens
│  (each agent gets its own copy of files)
├─ Result aggregation: 2,000 tokens
│  (main thread consolidates 4 result sets)
├─ Synthesis complexity: 1,500 tokens
│  (harder to merge 4 perspectives)
└─ API overhead: ~500 tokens
   (4 separate API requests)

TOTAL EXTRA OVERHEAD: ~20,000 tokens
(the remaining ~12,500 of the ~32,500-token gap over the
44,500-token baseline comes from each agent analyzing its
area more deeply than the single-agent stages)

BUT agents run in parallel, so you might expect:
- Sequential single agent: 44,500 tokens
- Concurrent 4 agents: 44,500 / 4 = 11,125 per agent
- Total: ~44,500 tokens

ACTUAL concurrent: 77,000 tokens

Why the gap?
- No shared context between agents
- Each agent re-does setup
- Each agent needs full input data
- Results aggregation is not "free"
```

---

## Token Cost by Analysis Type

### Code Review Agent Token Budget

```
Input Processing:
├─ Git diff loading: ~2,000 tokens
├─ File context: ~1,000 tokens
└─ Subtotal: ~3,000 tokens

Analysis:
├─ Readability review: ~2,000 tokens
├─ Duplication detection: ~2,000 tokens
├─ Error handling check: ~2,000 tokens
├─ Secret detection: ~1,500 tokens
├─ Test coverage review: ~1,500 tokens
├─ Performance analysis: ~1,000 tokens
└─ Subtotal: ~10,000 tokens

Output:
├─ Formatting results: ~1,000 tokens
├─ Severity prioritization: ~500 tokens
└─ Subtotal: ~1,500 tokens

Code Review Total: ~14,500 tokens
```

### Architecture Audit Agent Token Budget

```
Input Processing:
├─ File structure loading: ~2,500 tokens
├─ Module relationship mapping: ~2,000 tokens
└─ Subtotal: ~4,500 tokens

Analysis (6 dimensions):
├─ Architecture & Design: ~2,500 tokens
├─ Code Quality: ~2,000 tokens
├─ Security: ~2,000 tokens
├─ Performance: ~1,500 tokens
├─ Testing: ~1,500 tokens
├─ Maintainability: ~1,500 tokens
└─ Subtotal: ~11,000 tokens

Output:
├─ Dimension scoring: ~1,500 tokens
├─ Recommendations: ~1,000 tokens
└─ Subtotal: ~2,500 tokens

Architecture Total: ~18,000 tokens
```

### Security & Compliance Agent Token Budget

```
Input Processing:
├─ Code loading: ~2,000 tokens
├─ Dependency list: ~1,000 tokens
└─ Subtotal: ~3,000 tokens

Analysis:
├─ OWASP Top 10 check: ~3,000 tokens
├─ Dependency vulnerability scan: ~2,500 tokens
├─ Secrets/keys detection: ~2,000 tokens
├─ Encryption review: ~1,500 tokens
├─ Auth/AuthZ review: ~1,500 tokens
├─ Compliance requirements: ~1,000 tokens
└─ Subtotal: ~11,500 tokens

Output:
├─ Severity assessment: ~1,000 tokens
├─ Remediation guidance: ~1,000 tokens
└─ Subtotal: ~2,000 tokens

Security Total: ~16,500 tokens
```
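The secrets/keys detection line items budgeted above can be pictured with a minimal scanner sketch. This is an illustration only, assuming a few common pattern shapes (AWS access key IDs, PEM private-key headers, quoted `api_key`/`secret`/`token` assignments); it is not any real tool's rule set.

```python
# Illustrative sketch of a secrets/keys detection pass; the patterns
# below are minimal examples, not a production rule set.
import re

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # PEM header
    # quoted assignment to api_key / secret / token, 8+ chars
    re.compile(r"(?i)(?:api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]

def find_secrets(text: str) -> list[str]:
    """Return matched snippets so a reviewer can flag each finding."""
    return [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(text)]
```

In a real pipeline this kind of cheap pre-pass would run before the model sees the diff, so obvious leaks are flagged without spending analysis tokens.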
### Multi-Perspective Agent Token Budget

```
Input Processing:
├─ Feature description: ~1,500 tokens
├─ Change summary: ~1,000 tokens
└─ Subtotal: ~2,500 tokens

Analysis (6 perspectives):
├─ Product perspective: ~1,500 tokens
├─ Dev perspective: ~1,500 tokens
├─ QA perspective: ~1,500 tokens
├─ Security perspective: ~1,500 tokens
├─ DevOps perspective: ~1,000 tokens
├─ Design perspective: ~1,000 tokens
└─ Subtotal: ~8,000 tokens

Output:
├─ Stakeholder summary: ~1,500 tokens
├─ Risk assessment: ~1,000 tokens
└─ Subtotal: ~2,500 tokens

Multi-Perspective Total: ~13,000 tokens
```

---

## Monthly Cost Comparison

### Scenario: 5M Token Monthly Budget

```
SINGLE AGENT APPROACH
├─ Tokens per review: ~35,000
├─ Reviews per month: 5,000,000 / 35,000 = 142 reviews
├─ Cost efficiency: Excellent
└─ Best for: High-frequency reviews, rapid feedback

CONCURRENT AGENTS APPROACH
├─ Tokens per review: ~68,000
├─ Reviews per month: 5,000,000 / 68,000 = 73 reviews
├─ Cost efficiency: Half as many reviews
└─ Best for: Selective, high-quality reviews

COST COMPARISON
├─ Same budget: 5M tokens
├─ Single agent can do: 142 reviews
├─ Concurrent can do: 73 reviews
├─ Sacrifice: 69 fewer reviews per month
└─ Gain: 4 expert perspectives per review
```

### Pricing Impact (USD)

Assuming Claude 3.5 Sonnet pricing (~$3 per 1M tokens):

```
SINGLE AGENT
├─ 35,000 tokens per review: $0.105 per review
├─ 142 reviews per month: $14.91/month (from shared budget)
└─ Cost per enterprise: ~$180/year

CONCURRENT AGENTS
├─ 68,000 tokens per review: $0.204 per review
├─ 73 reviews per month: $14.89/month (from shared budget)
└─ Cost per enterprise: ~$179/year

WITHIN SAME 5M BUDGET:
├─ Concurrent approach: 2x cost per review
├─ But same monthly spend
└─ Trade-off: Quantity vs. Quality
```
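The budget arithmetic above can be sketched in a few lines. The token counts and the ~$3 per 1M-token rate are this document's estimates, so treat the outputs as illustrative rather than authoritative pricing.

```python
# Sketch of the monthly-budget math; token counts and the assumed
# ~$3/1M-token rate come from this document's estimates.
PRICE_PER_MILLION_TOKENS = 3.00  # USD, assumed rate
MONTHLY_BUDGET = 5_000_000       # tokens

def review_stats(tokens_per_review: int) -> tuple[float, int, float]:
    """Return (cost per review, reviews per month, monthly spend)."""
    cost = tokens_per_review * PRICE_PER_MILLION_TOKENS / 1_000_000
    reviews = MONTHLY_BUDGET // tokens_per_review
    return cost, reviews, reviews * cost

single = review_stats(35_000)      # ~$0.105/review, 142 reviews/month
concurrent = review_stats(68_000)  # ~$0.204/review, 73 reviews/month
```

Both approaches exhaust roughly the same monthly spend; the lever is how many reviews that spend buys.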
---

## Optimization Strategies

### Strategy 1: Use Single Agent for Everyday Work

```
Mixed Approach:
├─ 80% of code reviews: Single agent (~28,000 tokens avg)
└─ 20% of code reviews: Concurrent agents (for critical work)

Monthly breakdown (5M budget):
├─ ~111 single-agent reviews @ 28K tokens = ~3.1M tokens
├─ ~27 concurrent reviews @ 68K tokens = ~1.9M tokens
├─ Monthly capacity: ~138 reviews
└─ Better mix of quality and quantity
```

### Strategy 2: Off-Peak Concurrent

```
Timing-Based Approach:
├─ Daytime (peak): Use single agent
└─ Nighttime/weekend (off-peak): Use concurrent agents
   (API is less congested, better concurrency)

Benefits:
├─ Off-peak: Concurrent runs faster and more reliably
├─ Peak: Avoid rate limiting issues
├─ Cost: Still 2x tokens
└─ Experience: Better latency during off-peak
```

### Strategy 3: Cost-Conscious Concurrent

```
Limited Use of Concurrent:
├─ Release reviews: Always concurrent (quality matters)
├─ Security-critical changes: Always concurrent
├─ Regular features: Single agent
└─ Bug fixes: Single agent

Monthly breakdown (5M budget):
├─ 2 releases/month @ 68K: 136K tokens
├─ 6 security reviews @ 68K: 408K tokens
├─ 100 regular features @ 28K: 2,800K tokens
├─ 50 bug fixes @ 28K: 1,400K tokens
└─ Total: ~4.7M tokens (stays within budget)
```

---

## Reducing Token Costs

### For Concurrent Agents

#### 1. Use "Lightweight" Input Mode

```
Standard Input (Full Context):
├─ Complete git diff: 2,500 tokens
├─ All modified files: 2,000 tokens
├─ Full file structure: 2,500 tokens
└─ Total input: ~7,000 tokens

Lightweight Input (Summary):
├─ Summarized diff: 500 tokens
├─ File names only: 200 tokens
├─ Structure summary: 500 tokens
└─ Total input: ~1,200 tokens

Savings: ~5,800 tokens per agent × 4 = ~23,200 tokens saved
New total: ~45,300 tokens (just 1.3x single agent!)
```
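A lightweight-input pre-pass might look like the following sketch, which collapses a unified diff into one summary line per file. The `summarize_diff` helper and its output format are hypothetical, not part of any existing pipeline.

```python
# Hypothetical "lightweight input" pre-pass: instead of sending the
# full diff to every agent, send per-file added/deleted line counts.
def summarize_diff(diff_text: str) -> str:
    """Collapse a unified diff into one line per file: name, +adds, -dels."""
    stats: dict[str, list[int]] = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            current = line[6:]          # new file path
            stats[current] = [0, 0]     # [added, deleted]
        elif current and line.startswith("+") and not line.startswith("+++"):
            stats[current][0] += 1
        elif current and line.startswith("-") and not line.startswith("---"):
            stats[current][1] += 1
    return "\n".join(f"{f}: +{a} -{d}" for f, (a, d) in stats.items())
```

A summary like this costs a few hundred tokens regardless of diff size, which is where the ~5,800-token-per-agent saving above comes from.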
#### 2. Reduce Agent Scope

```
Full Scope (Current):
├─ Code Review: All aspects
├─ Architecture: 6 dimensions
├─ Security: Full OWASP
├─ Multi-Perspective: 6 angles
└─ Total: ~68,000 tokens

Reduced Scope:
├─ Code Review: Security + Structure only (saves 2,000)
├─ Architecture: Top 3 dimensions (saves 4,000)
├─ Security: OWASP critical only (saves 2,000)
├─ Multi-Perspective: 3 key angles (saves 3,000)
└─ Total: ~57,000 tokens

Savings: ~11,000 tokens (16% reduction)
```

#### 3. Skip Non-Critical Agents

```
Full Pipeline (4 agents):
└─ Total: ~68,000 tokens

Critical Only (2 agents):
├─ Code Review Agent: ~15,000 tokens
├─ Security Agent: ~16,000 tokens
└─ Total: ~31,000 tokens (close to single-agent cost)

Use when:
- Simple changes (no architecture impact)
- No security implications
- Team review not needed
```

---

## When Higher Token Cost is Worth It

### ROI Calculation

```
Extra cost per review: 33,000 tokens (~$0.10)

Value of finding:
├─ 1 critical security issue: ~100x tokens saved
│  (cost of breach: $1M+, detection: $0.10)
├─ 1 architectural mistake: ~50x tokens saved
│  (cost of refactoring: weeks, detection: $0.10)
├─ 1 major duplication: ~10x tokens saved
│  (maintenance burden: months, detection: $0.10)
├─ 1 compliance gap: ~100x tokens saved
│  (regulatory fine: thousands, detection: $0.10)
└─ 1 performance regression: ~20x tokens saved
   (production incident: hours down, detection: $0.10)
```

### Examples Where ROI is Positive

1. **Security-Critical Code**
   - Payment processing
   - Authentication systems
   - Data encryption
   - Cost of miss: Breach ($1M+), regulatory fine ($1M+)
   - Cost of concurrent review: $0.10
   - ROI: Infinite (one miss pays for millions of reviews)

2. **Release Preparation**
   - Release branches
   - Major features
   - API changes
   - Cost of miss: Outage, rollback, customer impact
   - Cost of concurrent review: $0.10
   - ROI: Extremely high
3. **Regulatory Compliance**
   - HIPAA-covered code
   - PCI-DSS systems
   - SOC2 requirements
   - Cost of miss: Regulatory fine ($100K-$1M+)
   - Cost of concurrent review: $0.10
   - ROI: Astronomical

4. **Enterprise Standards**
   - Multiple team sign-off
   - Audit trail requirement
   - Stakeholder input
   - Cost of miss: Rework, team friction
   - Cost of concurrent review: $0.10
   - ROI: High (prevents rework)

---

## Token Usage Monitoring

### What to Track

```
Per Review:
├─ Actual tokens used (not estimated)
├─ Agent breakdown (which agent used most)
├─ Input size (diff size, file count)
└─ Output length (findings generated)

Monthly:
├─ Total tokens used
├─ Reviews completed
├─ Average tokens per review
└─ Trend analysis

Annual:
├─ Total token spend
├─ Cost vs. budget
├─ Reviews completed
└─ ROI analysis
```

### Setting Alerts

```
Rate Limit Alerts:
├─ 70% of TPM used in a minute → Warning
├─ 90% of TPM used in a minute → Critical
└─ Hit TPM limit → Block and notify

Monthly Budget Alerts:
├─ 50% of budget used → Informational
├─ 75% of budget used → Warning
└─ 90% of budget used → Critical

Cost Thresholds:
├─ Single review > 100K tokens → Unexpected (investigate)
├─ Average > 80K tokens → Possible over-analysis (review)
└─ Concurrent running during peak hours → Not optimal (schedule off-peak)
```

---

## Cost Optimization Summary

| Strategy | Tokens Saved | When to Use |
|----------|--------------|-------------|
| **Mix single + concurrent** | Save 40% per month | Daily workflow |
| **Off-peak scheduling** | Save 15% (better concurrency) | When possible |
| **Lightweight input mode** | Save 35% per concurrent review | Non-critical reviews |
| **Reduce agent scope** | Save 15-20% | Simple changes |
| **Skip non-critical agents** | Save 50% | Low-risk PRs |
| **Single agent only** | 50% of concurrent cost | Cost-sensitive |

---

## Recommendation

```
Use Concurrent Agents When:
├─ Token budget > 5M per month
├─ Quality > Cost priority
├─ Security-critical code
├─ Release reviews
├─ Multiple perspectives needed
└─ Regulatory requirements

Use Single Agent When:
├─ Limited token budget
├─ High-frequency reviews needed
├─ Simple changes
├─ Speed important (concurrent's 20-30% gain not material)
├─ Cost sensitive
└─ No multi-perspective requirement

Use Mix Strategy When:
├─ Want both quality and quantity
├─ Can do selective high-value concurrent reviews
├─ Have moderate token budget
├─ Enterprise with varied code types
└─ Want best of both worlds
```

---

**For full analysis, see [REALITY.md](REALITY.md) and [ARCHITECTURE.md](ARCHITECTURE.md).**