# Usage Guide

Comprehensive usage guide for Claude Vision Auto.

## Table of Contents

1. [Basic Usage](#basic-usage)
2. [Configuration](#configuration)
3. [Common Scenarios](#common-scenarios)
4. [Advanced Usage](#advanced-usage)
5. [Tips and Best Practices](#tips-and-best-practices)

## Basic Usage

### Starting an Interactive Session

Simply replace `claude` with `claude-vision`:

```bash
claude-vision
```

Expected output:

```
[Claude Vision Auto] Testing Ollama connection...
[Claude Vision Auto] Connected to Ollama
[Claude Vision Auto] Using model: minicpm-v:latest
[Claude Vision Auto] Idle threshold: 3.0s

 ▐▛███▜▌   Claude Code v2.0.26
▝▜█████▛▘  Sonnet 4.5 · Claude Max
  ▘▘ ▝▝    /home/username/project

>
```

### With an Initial Prompt

```bash
claude-vision "create a test.md file in /tmp"
```

### How Auto-Approval Works

When Claude asks for approval:

```
╭───────────────────────────────────────────╮
│ Create file                               │
│ ╭───────────────────────────────────────╮ │
│ │ /tmp/test.md                          │ │
│ │                                       │ │
│ │ # Test File                           │ │
│ │                                       │ │
│ │ This is a test.                       │ │
│ ╰───────────────────────────────────────╯ │
│ Do you want to create test.md?            │
│ ❯ 1. Yes                                  │
│   2. Yes, allow all edits                 │
│   3. No                                   │
╰───────────────────────────────────────────╯
```

Claude Vision Auto will:

1. Detect the idle state (3 seconds)
2. Take a screenshot
3. Analyze it with the vision model
4. Automatically select option 1

Output:

```
[Vision] Analyzing prompt...
[Vision] Response: 1
[Vision] Response sent
```

## Configuration

### Environment Variables

Set these before running `claude-vision`:

```bash
# Example: custom configuration
export OLLAMA_URL="http://192.168.1.100:11434/api/generate"
export VISION_MODEL="llama3.2-vision:latest"
export IDLE_THRESHOLD="5.0"
export RESPONSE_DELAY="2.0"
export DEBUG="true"

claude-vision
```

### Persistent Configuration

Add to `~/.bashrc` or `~/.zshrc`:

```bash
# Claude Vision Auto configuration
export CLAUDE_VISION_OLLAMA_URL="http://localhost:11434/api/generate"
export CLAUDE_VISION_MODEL="minicpm-v:latest"
export CLAUDE_VISION_IDLE_THRESHOLD="3.0"
export CLAUDE_VISION_RESPONSE_DELAY="1.0"

# Alias for convenience
alias cv="claude-vision"
```

Reload:

```bash
source ~/.bashrc
```

### Configuration Options Reference

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `OLLAMA_URL` | URL | `http://localhost:11434/api/generate` | Ollama API endpoint |
| `VISION_MODEL` | String | `minicpm-v:latest` | Vision model name |
| `IDLE_THRESHOLD` | Float | `3.0` | Seconds to wait before taking a screenshot |
| `RESPONSE_DELAY` | Float | `1.0` | Seconds to wait before responding |
| `OUTPUT_BUFFER_SIZE` | Integer | `4096` | Buffer size in bytes |
| `SCREENSHOT_TIMEOUT` | Integer | `5` | Screenshot timeout (seconds) |
| `VISION_TIMEOUT` | Integer | `30` | Vision analysis timeout (seconds) |
| `DEBUG` | Boolean | `false` | Enable debug logging |

## Common Scenarios

### Scenario 1: File Creation

```bash
claude-vision "create a new Python script called hello.py"
```

Auto-approves the file creation.

### Scenario 2: File Editing

```bash
claude-vision "add error handling to main.py"
```

Auto-approves the file edits.

### Scenario 3: Multiple Operations

```bash
claude-vision "refactor the authentication module and add tests"
```

Auto-approves each operation sequentially.
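When mixing per-run overrides like the ones in these scenarios with exported defaults, it can help to print the values that will actually take effect. A minimal sketch using standard shell default-expansion (the variable names and defaults come from the configuration reference above; the script itself is illustrative, not part of the tool):

```shell
#!/bin/bash
# Resolve each setting from the environment, falling back to the
# documented default when the variable is unset or empty.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434/api/generate}"
VISION_MODEL="${VISION_MODEL:-minicpm-v:latest}"
IDLE_THRESHOLD="${IDLE_THRESHOLD:-3.0}"
DEBUG="${DEBUG:-false}"

echo "endpoint:  $OLLAMA_URL"
echo "model:     $VISION_MODEL"
echo "threshold: ${IDLE_THRESHOLD}s"
echo "debug:     $DEBUG"
```

Running this before `claude-vision` shows exactly which configuration a given invocation will see.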
### Scenario 4: Longer Wait Time

For slower systems or models:

```bash
IDLE_THRESHOLD="5.0" RESPONSE_DELAY="2.0" claude-vision
```

### Scenario 5: Different Vision Model

```bash
VISION_MODEL="llama3.2-vision:latest" claude-vision
```

### Scenario 6: Remote Ollama

```bash
OLLAMA_URL="http://192.168.1.100:11434/api/generate" claude-vision
```

### Scenario 7: Debug Mode

When troubleshooting:

```bash
DEBUG=true claude-vision "test command"
```

The output will include:

```
[DEBUG] Screenshot saved to /home/user/.cache/claude-vision-auto/screenshot_1234567890.png
[DEBUG] Sending to Ollama: http://localhost:11434/api/generate
[DEBUG] Model: minicpm-v:latest
[DEBUG] Vision model response: 1
```

## Advanced Usage

### Using Different Models

#### MiniCPM-V (Default)

Best for structured responses:

```bash
VISION_MODEL="minicpm-v:latest" claude-vision
```

#### Llama 3.2 Vision

A good alternative:

```bash
VISION_MODEL="llama3.2-vision:latest" claude-vision
```

#### LLaVA

A lightweight option:

```bash
VISION_MODEL="llava:latest" claude-vision
```

### Shell Aliases

Add to `~/.bashrc`:

```bash
# Quick aliases
alias cv="claude-vision"
alias cvd="DEBUG=true claude-vision"          # Debug mode
alias cvs="IDLE_THRESHOLD=5.0 claude-vision"  # Slower systems

# Model-specific
alias cv-mini="VISION_MODEL=minicpm-v:latest claude-vision"
alias cv-llama="VISION_MODEL=llama3.2-vision:latest claude-vision"
```

### Integration with Scripts

```bash
#!/bin/bash
# auto-refactor.sh

export DEBUG="false"
export IDLE_THRESHOLD="3.0"

claude-vision "refactor all JavaScript files to use modern ES6 syntax"
```

### Conditional Auto-Approval

Create a wrapper script:

```bash
#!/bin/bash
# conditional-claude.sh

if [ "$AUTO_APPROVE" = "true" ]; then
    claude-vision "$@"
else
    claude "$@"
fi
```

Usage:

```bash
AUTO_APPROVE=true ./conditional-claude.sh "create files"
```

### Multiple Terminal Support

Each terminal needs its own instance:

```bash
# Terminal 1
claude-vision  # Project A

# Terminal 2
claude-vision  # Project B
```

## Tips and Best Practices

### 1. Adjust the Idle Threshold

- **Fast system**: `IDLE_THRESHOLD=2.0`
- **Slow system**: `IDLE_THRESHOLD=5.0`
- **Remote Ollama**: `IDLE_THRESHOLD=4.0`

### 2. Model Selection

- **Accuracy**: MiniCPM-V > Llama 3.2 Vision > LLaVA
- **Speed**: LLaVA > MiniCPM-V > Llama 3.2 Vision
- **Size**: LLaVA (4.5GB) < MiniCPM-V (5.5GB) < Llama 3.2 (7.8GB)

### 3. Debug When Needed

Enable debug mode if responses are incorrect:

```bash
DEBUG=true claude-vision
```

Check the vision model's output to see what it is detecting.

### 4. Screenshot Quality

Ensure the terminal is visible and not obscured by other windows.

### 5. Performance Optimization

For faster responses:

```bash
# Pre-warm the model
docker exec ollama ollama run minicpm-v:latest "test" < /dev/null

# Then use normally
claude-vision
```

### 6. Clean Up Old Screenshots

Screenshots are auto-cleaned after 1 hour, but you can clean up manually:

```bash
rm -rf ~/.cache/claude-vision-auto/*.png
```

### 7. Check Model Status

Before starting long sessions:

```bash
# Verify Ollama is responsive
curl -s http://localhost:11434/api/tags | python3 -m json.tool

# Check that the model is loaded
docker exec ollama ollama ps
```

## Troubleshooting During Use

### Issue: Not Auto-Responding

**Symptoms**: Vision analyzes but doesn't respond.

**Solutions**:

1. Check whether "WAIT" is returned:
   ```bash
   DEBUG=true claude-vision
   # Look for: [DEBUG] Vision model response: WAIT
   ```
2. Increase the idle threshold:
   ```bash
   IDLE_THRESHOLD="5.0" claude-vision
   ```
3. Try a different model:
   ```bash
   VISION_MODEL="llama3.2-vision:latest" claude-vision
   ```

### Issue: Wrong Response

**Symptoms**: Selects the wrong option or types the wrong answer.

**Solutions**:

1. Enable debug mode to see what the vision model sees:
   ```bash
   DEBUG=true claude-vision
   ```
2. Check the screenshot manually:
   ```bash
   # Don't auto-clean
   ls ~/.cache/claude-vision-auto/
   ```
3. Adjust the response delay:
   ```bash
   RESPONSE_DELAY="2.0" claude-vision
   ```

### Issue: Slow Response

**Symptoms**: Takes too long to respond.

**Solutions**:

1. Use a faster model:
   ```bash
   VISION_MODEL="llava:latest" claude-vision
   ```
2. Reduce the vision timeout:
   ```bash
   VISION_TIMEOUT="15" claude-vision
   ```
3. Check Ollama performance:
   ```bash
   docker stats ollama
   ```

### Issue: Too Aggressive

**Symptoms**: Responds to non-approval prompts.

**Solutions**:

1. Increase the idle threshold:
   ```bash
   IDLE_THRESHOLD="5.0" claude-vision
   ```
2. Check the approval keywords in the configuration.
3. Use manual mode for sensitive operations:
   ```bash
   claude  # No auto-approval
   ```

## Getting Help

If issues persist:

1. Check the logs with `DEBUG=true`
2. Review the documentation
3. Report issues with the debug output
4. Include screenshot samples

## Next Steps

- Explore the [examples/](../examples/) directory
- Customize the configuration for your workflow
- Create shell aliases for common tasks
- Integrate with CI/CD pipelines (if applicable)
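As a starting point for the CI/CD idea above, auto-approval can be gated behind an explicit opt-in so pipelines never approve actions by accident. A minimal sketch: the `claude-vision` CLI and `IDLE_THRESHOLD` variable are as documented in this guide, while the `run_task` helper and the `CI_AUTO_APPROVE` variable are hypothetical names chosen for illustration:

```shell
#!/bin/bash
# Hypothetical CI gate (sketch, not part of the tool): only invoke
# claude-vision when the pipeline explicitly opts in.
run_task() {
    local task="$1"
    if [ "${CI_AUTO_APPROVE:-false}" = "true" ]; then
        # Longer idle threshold, since CI runners are often slower.
        IDLE_THRESHOLD="5.0" claude-vision "$task"
    else
        # Without the opt-in, just report what a human should run.
        echo "auto-approval disabled; run manually: claude \"$task\""
    fi
}

run_task "update dependency versions"
```

Defaulting to the disabled branch keeps the pipeline safe: a missing or misspelled variable falls through to the manual path rather than silently approving changes.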