Vision-based auto-approval system for Claude Code CLI using the MiniCPM-V vision model.
Features:
- Automatic detection and response to approval prompts
- Screenshot capture and vision analysis via Ollama
- Support for multiple screenshot tools (scrot, gnome-screenshot, etc.)
- Configurable timing and behavior
- Debug mode for troubleshooting
- Comprehensive documentation
Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Jean-Philippe Brule <jp@svrnty.io>
Installation Guide
Detailed installation instructions for Claude Vision Auto.
Prerequisites
1. Claude Code CLI
Install Anthropic's official CLI:
npm install -g @anthropic-ai/claude-code
Verify installation:
claude --version
2. Python 3.8+
Check Python version:
python3 --version
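The version check can also be scripted. The following is a minimal sketch, not part of the project: `version_at_least` is a hypothetical helper name, and it assumes version strings of the form `major.minor[.patch]`, which is what `python3 --version` prints.

```shell
# Hypothetical helper (not part of claude-vision-auto).
# Succeeds if version $1 (e.g. "3.10.12") is at least $2 (e.g. "3.8").
# Assumes both arguments contain at least "major.minor".
version_at_least() {
    have_major=${1%%.*}          # text before the first dot
    have_rest=${1#*.}            # text after the first dot
    have_minor=${have_rest%%.*}  # minor component only
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
        return
    fi
    [ "$have_minor" -ge "$want_minor" ]
}

# "python3 --version" prints e.g. "Python 3.10.12"; keep the second field.
if command -v python3 >/dev/null 2>&1; then
    current=$(python3 --version 2>&1 | awk '{print $2}')
    if version_at_least "$current" "3.8"; then
        echo "Python $current is new enough"
    else
        echo "Python $current is too old; 3.8 or newer is required"
    fi
else
    echo "python3 not found"
fi
```

Only the major and minor components are compared, since the requirement is stated as 3.8+.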
If Python is not installed (Debian/Ubuntu):
sudo apt-get update
sudo apt-get install python3 python3-pip
3. Docker (for Ollama)
Install Docker:
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
Log out and back in for group changes to take effect.
System Dependencies
Screenshot Tool
Install scrot (recommended):
sudo apt-get update
sudo apt-get install -y scrot
Alternative screenshot tools:
# GNOME Screenshot
sudo apt-get install -y gnome-screenshot
# ImageMagick
sudo apt-get install -y imagemagick
# Maim
sudo apt-get install -y maim xdotool
Additional Dependencies
sudo apt-get install -y \
python3-pip \
git \
curl
Ollama Setup
1. Pull Ollama Docker Image
docker pull ollama/ollama:latest
2. Start Ollama Container
docker run -d \
-p 11434:11434 \
--name ollama \
--restart unless-stopped \
ollama/ollama:latest
For GPU support (NVIDIA; requires the NVIDIA Container Toolkit on the host):
docker run -d \
-p 11434:11434 \
--name ollama \
--gpus all \
--restart unless-stopped \
ollama/ollama:latest
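Running `docker run` a second time with the same `--name` fails because the container already exists. The sketch below makes the start step repeatable; `ensure_ollama_container` is a hypothetical helper name, and it uses the CPU-only command from above (add `--gpus all` for the GPU variant).

```shell
# Hypothetical helper (not part of claude-vision-auto).
# Starts the Ollama container, creating it only if it does not exist yet.
ensure_ollama_container() {
    name=${1:-ollama}
    # "docker container inspect" exits non-zero if no container has this name;
    # otherwise the template prints "true" or "false" for the running state.
    if state=$(docker container inspect -f '{{.State.Running}}' "$name" 2>/dev/null); then
        if [ "$state" = "true" ]; then
            echo "Container $name is already running"
        else
            docker start "$name"
        fi
    else
        docker run -d -p 11434:11434 --name "$name" \
            --restart unless-stopped ollama/ollama:latest
    fi
}

# Usage: ensure_ollama_container
```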
3. Pull Vision Model
# MiniCPM-V (recommended - 5.5GB)
docker exec ollama ollama pull minicpm-v:latest
# Alternative: Llama 3.2 Vision (7.8GB)
docker exec ollama ollama pull llama3.2-vision:latest
# Alternative: LLaVA (4.5GB)
docker exec ollama ollama pull llava:latest
4. Verify Ollama
# Check container status
docker ps | grep ollama
# Test API
curl http://localhost:11434/api/tags
# List installed models
curl -s http://localhost:11434/api/tags | python3 -m json.tool
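The API may take a moment to answer right after the container starts. As a rough sketch, the check above can be wrapped in a retry loop; `wait_for_ollama` and its default attempt count and delay are hypothetical choices, not settings defined by this project.

```shell
# Hypothetical helper (not part of claude-vision-auto).
# Polls an HTTP endpoint until it answers or the attempts run out.
# Usage: wait_for_ollama [url] [attempts] [delay_seconds]
wait_for_ollama() {
    url=${1:-http://localhost:11434/api/tags}
    attempts=${2:-30}
    delay=${3:-1}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null --max-time 2 "$url"; then
            echo "Ollama API is reachable at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Ollama API did not answer at $url after $attempts attempts" >&2
    return 1
}

# Usage: wait_for_ollama && docker exec ollama ollama pull minicpm-v:latest
```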
Package Installation
Method 1: Using Makefile (Recommended)
cd claude-vision-auto
# Install system dependencies
make deps
# Install package
make install
Method 2: Manual Installation
cd claude-vision-auto
# Install system dependencies
sudo apt-get update
sudo apt-get install -y scrot python3-pip
# Install Python package
pip3 install -e .
Method 3: From Git
# Clone repository
git clone https://git.openharbor.io/svrnty/claude-vision-auto.git
cd claude-vision-auto
# Install
pip3 install -e .
Verification
1. Check Command Installation
which claude-vision
Expected output (the path varies by user): /home/username/.local/bin/claude-vision
2. Test Ollama Connection
curl http://localhost:11434/api/tags
This should return JSON containing the list of installed models.
3. Test Screenshot
scrot /tmp/test_screenshot.png
ls -lh /tmp/test_screenshot.png
This should create the screenshot file, and ls should list it.
4. Run Test
# Start claude-vision
claude-vision
# You should see:
# [Claude Vision Auto] Testing Ollama connection...
# [Claude Vision Auto] Connected to Ollama
# [Claude Vision Auto] Using model: minicpm-v:latest
Troubleshooting
"claude-vision: command not found"
Add to PATH in ~/.bashrc or ~/.zshrc:
export PATH="$HOME/.local/bin:$PATH"
Then reload:
source ~/.bashrc # or source ~/.zshrc
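Before editing a shell profile, it can help to confirm that the directory really is missing from PATH. A small sketch, where `path_contains` is a hypothetical helper name:

```shell
# Hypothetical helper (not part of claude-vision-auto).
# Succeeds if directory $1 is an exact entry of the colon-separated PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/.local/bin"; then
    echo "$HOME/.local/bin is already on PATH"
else
    echo "$HOME/.local/bin is missing from PATH"
fi
```

Wrapping PATH in colons lets the pattern match whole entries only, so `/opt/demo` does not match `/opt/demo/bin`.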
"Cannot connect to Ollama"
Check if Ollama container is running:
docker ps | grep ollama
# If not running, start it:
docker start ollama
Check if port 11434 is open:
netstat -tulpn | grep 11434
# or
ss -tulpn | grep 11434
"Model not found"
Pull the model:
docker exec ollama ollama pull minicpm-v:latest
List available models:
docker exec ollama ollama list
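To pull the model only when it is missing, the list output can be checked in a script. This sketch assumes `ollama list` prints a header row followed by one row per model with the model name in the first column; `model_in_list` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of claude-vision-auto).
# Reads "ollama list"-style output on stdin and succeeds if any row after
# the header has a first column equal to $1.
model_in_list() {
    awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Usage:
#   docker exec ollama ollama list | model_in_list minicpm-v:latest \
#       || docker exec ollama ollama pull minicpm-v:latest
```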
"Screenshot failed"
Install scrot:
sudo apt-get install scrot
Test screenshot:
scrot -u /tmp/test.png
If the error persists, select an alternative tool through the SCREENSHOT_TOOL environment variable:
export SCREENSHOT_TOOL="gnome-screenshot"
claude-vision
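To see which screenshot tools are installed before choosing one, a quick probe such as the following may help. It is only a sketch: `first_available` is a hypothetical helper name, the candidate order is an assumption, and this guide does not confirm which values other than `gnome-screenshot` are accepted by SCREENSHOT_TOOL.

```shell
# Hypothetical helper (not part of claude-vision-auto).
# Prints the first command from the argument list that exists on PATH.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate order is an assumption; "import" is ImageMagick's capture command.
if tool=$(first_available scrot gnome-screenshot maim import); then
    echo "Found screenshot tool: $tool"
else
    echo "No screenshot tool found; install scrot or one of the alternatives" >&2
fi
```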
Permission Issues
If pip install fails with a permission error:
# Install for user only
pip3 install --user -e .
# Or use virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -e .
Docker Permission Denied
Add user to docker group:
sudo usermod -aG docker $USER
Log out and back in, then:
docker ps # Should work without sudo
Uninstallation
Remove Package
make uninstall
# or
pip3 uninstall claude-vision-auto
Remove Ollama
docker stop ollama
docker rm ollama
docker rmi ollama/ollama
Remove System Dependencies
sudo apt-get remove scrot
Next Steps
After successful installation:
- Read USAGE.md for usage examples
- Configure environment variables if needed
- Test with a simple Claude Code command
Getting Help
If you encounter issues not covered here:
- Check the main README.md
- Enable debug mode: DEBUG=true claude-vision
- Check logs: ~/.cache/claude-vision-auto/
- Report issues: https://git.openharbor.io/svrnty/claude-vision-auto/issues