# Installation Guide

Detailed installation instructions for Claude Vision Auto.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [System Dependencies](#system-dependencies)
3. [Ollama Setup](#ollama-setup)
4. [Package Installation](#package-installation)
5. [Verification](#verification)
6. [Troubleshooting](#troubleshooting)

## Prerequisites

### 1. Claude Code CLI

Install Anthropic's official CLI:

```bash
npm install -g @anthropic-ai/claude-code
```

Verify the installation:

```bash
claude --version
```

### 2. Python 3.8+

Check the Python version:

```bash
python3 --version
```

If Python is not installed:

```bash
sudo apt-get update
sudo apt-get install python3 python3-pip
```

### 3. Docker (for Ollama)

Install Docker:

```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
```

Log out and back in for the group change to take effect.

## System Dependencies

### Screenshot Tool

Install `scrot` (recommended):

```bash
sudo apt-get update
sudo apt-get install -y scrot
```

Alternative screenshot tools:

```bash
# GNOME Screenshot
sudo apt-get install -y gnome-screenshot

# ImageMagick
sudo apt-get install -y imagemagick

# Maim
sudo apt-get install -y maim xdotool
```

### Additional Dependencies

```bash
sudo apt-get install -y \
    python3-pip \
    git \
    curl
```

## Ollama Setup

### 1. Pull Ollama Docker Image

```bash
docker pull ollama/ollama:latest
```

### 2. Start Ollama Container

```bash
docker run -d \
    -p 11434:11434 \
    --name ollama \
    --restart unless-stopped \
    ollama/ollama:latest
```

For GPU support (NVIDIA):

```bash
docker run -d \
    -p 11434:11434 \
    --name ollama \
    --gpus all \
    --restart unless-stopped \
    ollama/ollama:latest
```

### 3. Pull Vision Model

```bash
# MiniCPM-V (recommended - 5.5GB)
docker exec ollama ollama pull minicpm-v:latest

# Alternative: Llama 3.2 Vision (7.8GB)
docker exec ollama ollama pull llama3.2-vision:latest

# Alternative: LLaVA (4.5GB)
docker exec ollama ollama pull llava:latest
```

### 4. Verify Ollama

```bash
# Check container status
docker ps | grep ollama

# Test API
curl http://localhost:11434/api/tags

# List installed models
curl -s http://localhost:11434/api/tags | python3 -m json.tool
```

## Package Installation

### Method 1: Using Makefile (Recommended)

```bash
cd claude-vision-auto

# Install system dependencies
make deps

# Install package
make install
```

### Method 2: Manual Installation

```bash
cd claude-vision-auto

# Install system dependencies
sudo apt-get update
sudo apt-get install -y scrot python3-pip

# Install Python package
pip3 install -e .
```

### Method 3: From Git

```bash
# Clone repository
git clone https://git.openharbor.io/svrnty/claude-vision-auto.git
cd claude-vision-auto

# Install
pip3 install -e .
```

## Verification

### 1. Check Command Installation

```bash
which claude-vision
```

Expected output: `/home/username/.local/bin/claude-vision`

### 2. Test Ollama Connection

```bash
curl http://localhost:11434/api/tags
```

This should return JSON listing the installed models.

### 3. Test Screenshot

```bash
scrot /tmp/test_screenshot.png
ls -lh /tmp/test_screenshot.png
```

This should create a screenshot file.

### 4. Run Test

```bash
# Start claude-vision
claude-vision

# You should see:
# [Claude Vision Auto] Testing Ollama connection...
# [Claude Vision Auto] Connected to Ollama
# [Claude Vision Auto] Using model: minicpm-v:latest
```

## Troubleshooting

### "claude-vision: command not found"

Add the install directory to `PATH` in `~/.bashrc` or `~/.zshrc`:

```bash
export PATH="$HOME/.local/bin:$PATH"
```

Then reload the shell configuration:

```bash
source ~/.bashrc
# or
source ~/.zshrc
```

### "Cannot connect to Ollama"

Check whether the Ollama container is running:

```bash
docker ps | grep ollama

# If not running, start it:
docker start ollama
```

Check whether port 11434 is open:

```bash
netstat -tulpn | grep 11434
# or
ss -tulpn | grep 11434
```

### "Model not found"

Pull the model:

```bash
docker exec ollama ollama pull minicpm-v:latest
```

List available models:

```bash
docker exec ollama ollama list
```

### "Screenshot failed"

Install scrot:

```bash
sudo apt-get install scrot
```

Test a screenshot:

```bash
scrot -u /tmp/test.png
```

If the error persists, select one of the alternative tools through the `SCREENSHOT_TOOL` environment variable:

```bash
export SCREENSHOT_TOOL="gnome-screenshot"
claude-vision
```

### Permission Issues

If `pip install` fails with a permissions error:

```bash
# Install for user only
pip3 install --user -e .

# Or use virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -e .
```

### Docker Permission Denied

Add your user to the `docker` group:

```bash
sudo usermod -aG docker $USER
```

Log out and back in, then:

```bash
docker ps  # Should work without sudo
```

## Uninstallation

### Remove Package

```bash
make uninstall
# or
pip3 uninstall claude-vision-auto
```

### Remove Ollama

```bash
docker stop ollama
docker rm ollama
docker rmi ollama/ollama
```

### Remove System Dependencies

```bash
sudo apt-get remove scrot
```

## Next Steps

After successful installation:

1. Read [USAGE.md](USAGE.md) for usage examples
2. Configure environment variables if needed
3. Test with a simple Claude Code command

## Getting Help

If you encounter issues not covered here:

1. Check the main [README.md](../README.md)
2. Enable debug mode: `DEBUG=true claude-vision`
3. Check logs: `~/.cache/claude-vision-auto/`
4. Report issues: https://git.openharbor.io/svrnty/claude-vision-auto/issues
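
Before reporting an issue, it can help to run the checks from the Verification section in one pass and include the result in the report. The following is a minimal sketch, not part of claude-vision-auto: the function names `missing_commands`, `ollama_reachable`, and `preflight` are hypothetical, and the command list and the default URL `http://localhost:11434` are assumptions taken from the steps in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helpers; not shipped with claude-vision-auto.

# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Return 0 if the Ollama API answers at the given base URL
# (assumed default: the port published by the docker run command in this guide).
ollama_reachable() {
    curl -fsS --max-time 5 "${1:-http://localhost:11434}/api/tags" >/dev/null 2>&1
}

# Run both checks and print a short summary.
preflight() {
    local missing
    missing="$(missing_commands claude python3 docker curl scrot claude-vision)"
    if [ -n "$missing" ]; then
        printf 'Missing commands:\n%s\n' "$missing"
        return 1
    fi
    if ! ollama_reachable; then
        echo "Ollama is not reachable on port 11434"
        return 1
    fi
    echo "All checks passed"
}
```

To use it, save the functions to a file, load them with `source`, and run `preflight`. If you configured a different screenshot tool, replace `scrot` in the command list with that tool.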