Vision-based auto-approval system for Claude Code CLI using the MiniCPM-V vision model.

Features:

- Automatic detection and response to approval prompts
- Screenshot capture and vision analysis via Ollama
- Support for multiple screenshot tools (scrot, gnome-screenshot, etc.)
- Configurable timing and behavior
- Debug mode for troubleshooting
- Comprehensive documentation

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Jean-Philippe Brule <jp@svrnty.io>

# Installation Guide

Detailed installation instructions for Claude Vision Auto.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [System Dependencies](#system-dependencies)
3. [Ollama Setup](#ollama-setup)
4. [Package Installation](#package-installation)
5. [Verification](#verification)
6. [Troubleshooting](#troubleshooting)
7. [Uninstallation](#uninstallation)
8. [Next Steps](#next-steps)
9. [Getting Help](#getting-help)

## Prerequisites

### 1. Claude Code CLI

Install Anthropic's official CLI:

```bash
npm install -g @anthropic-ai/claude-code
```

Verify the installation:

```bash
claude --version
```

### 2. Python 3.8+

Check the installed Python version:

```bash
python3 --version
```

If Python is not installed:

```bash
sudo apt-get update
sudo apt-get install python3 python3-pip
```

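
For scripted setups, the same version check can be done from Python itself. The following is a minimal sketch; the helper name `meets_minimum` is made up for illustration and is not part of Claude Vision Auto.

```python
import sys

MINIMUM = (3, 8)  # from the "Python 3.8+" prerequisite above


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    ok = meets_minimum(sys.version_info)
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
          f"{'OK' if ok else 'too old, need 3.8+'}")
```

Run it with `python3` so that it reports on the interpreter the package will actually use.
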
### 3. Docker (for Ollama)

Install Docker:

```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
```

Log out and back in for the group change to take effect.

## System Dependencies

### Screenshot Tool

Install `scrot` (recommended):

```bash
sudo apt-get update
sudo apt-get install -y scrot
```

Alternative screenshot tools:

```bash
# GNOME Screenshot
sudo apt-get install -y gnome-screenshot

# ImageMagick
sudo apt-get install -y imagemagick

# Maim
sudo apt-get install -y maim xdotool
```

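
To see which of these tools is already on the machine, a small Python check along these lines may help. It is only a sketch: the candidate order is an assumption, and it does not claim to mirror how Claude Vision Auto itself selects a tool. Note that ImageMagick's capture command is named `import`.

```python
import shutil

# Executable names for the tools listed above. The order is an assumption,
# not necessarily the order Claude Vision Auto uses.
CANDIDATES = ["scrot", "gnome-screenshot", "maim", "import"]


def first_available(candidates=CANDIDATES, which=shutil.which):
    """Return the first candidate found on PATH, or None."""
    for name in candidates:
        if which(name):
            return name
    return None


if __name__ == "__main__":
    tool = first_available()
    print(tool or "no screenshot tool found; install scrot")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses `shutil.which`.
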
### Additional Dependencies

```bash
sudo apt-get install -y \
    python3-pip \
    git \
    curl
```

## Ollama Setup

### 1. Pull Ollama Docker Image

```bash
docker pull ollama/ollama:latest
```

### 2. Start Ollama Container

```bash
docker run -d \
    -p 11434:11434 \
    --name ollama \
    --restart unless-stopped \
    ollama/ollama:latest
```

For GPU support (NVIDIA; requires the NVIDIA Container Toolkit on the host):

```bash
docker run -d \
    -p 11434:11434 \
    --name ollama \
    --gpus all \
    --restart unless-stopped \
    ollama/ollama:latest
```

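
The container may take a moment before the API answers. A setup script could poll the endpoint used later in this guide (`/api/tags`) before moving on. The sketch below assumes the default port mapping shown above; `api_is_up` and `wait_until` are illustrative names, not part of Ollama or Claude Vision Auto.

```python
import time
import urllib.error
import urllib.request

TAGS_URL = "http://localhost:11434/api/tags"  # endpoint used in this guide


def api_is_up(url=TAGS_URL, timeout=2.0):
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until(probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until(api_is_up)` right after `docker run` returns `True` as soon as the API responds, or `False` after roughly 30 seconds with the defaults.
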
### 3. Pull Vision Model

```bash
# MiniCPM-V (recommended - 5.5GB)
docker exec ollama ollama pull minicpm-v:latest

# Alternative: Llama 3.2 Vision (7.8GB)
docker exec ollama ollama pull llama3.2-vision:latest

# Alternative: LLaVA (4.5GB)
docker exec ollama ollama pull llava:latest
```

### 4. Verify Ollama

```bash
# Check container status
docker ps | grep ollama

# Test API (returns raw JSON)
curl http://localhost:11434/api/tags

# List installed models, pretty-printed
curl -s http://localhost:11434/api/tags | python3 -m json.tool
```

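
The `/api/tags` response is a JSON object with a `models` array whose entries carry a `name` field. To confirm from a script that a specific model was pulled, something like the following sketch could be used; the function names are illustrative.

```python
import json
import urllib.request

TAGS_URL = "http://localhost:11434/api/tags"


def model_names(payload):
    """Extract model names from a decoded /api/tags response."""
    return [m["name"] for m in (payload.get("models") or [])]


def fetch_tags(url=TAGS_URL, timeout=5.0):
    """Fetch and decode the JSON returned by the tags endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the container running, `"minicpm-v:latest" in model_names(fetch_tags())` should evaluate to `True` once the recommended model has been pulled.
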
## Package Installation

### Method 1: Using Makefile (Recommended)

This method assumes you already have a local copy of the repository (see Method 3 for cloning).

```bash
cd claude-vision-auto

# Install system dependencies
make deps

# Install package
make install
```

### Method 2: Manual Installation

```bash
cd claude-vision-auto

# Install system dependencies
sudo apt-get update
sudo apt-get install -y scrot python3-pip

# Install Python package
pip3 install -e .
```

### Method 3: From Git

```bash
# Clone repository
git clone https://git.openharbor.io/svrnty/claude-vision-auto.git
cd claude-vision-auto

# Install
pip3 install -e .
```

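
Whichever method is used, you can ask Python whether pip registered the package. This sketch assumes the distribution is named `claude-vision-auto`, the name passed to `pip3 uninstall` in the Uninstallation section; `installed_version` is an illustrative helper.

```python
from importlib import metadata  # standard library since Python 3.8

DIST_NAME = "claude-vision-auto"  # name used by `pip3 uninstall` in this guide


def installed_version(dist_name=DIST_NAME):
    """Return the installed version string, or None if the package is not found."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"{DIST_NAME} {version}" if version else f"{DIST_NAME} is not installed")
```

Run it with the same interpreter that `pip3` installs into; a different Python (or an inactive virtual environment) will report the package as missing.
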
## Verification

### 1. Check Command Installation

```bash
which claude-vision
```

Expected output (the exact path depends on your user name and install method): `/home/username/.local/bin/claude-vision`

### 2. Test Ollama Connection

```bash
curl http://localhost:11434/api/tags
```

This should return JSON listing the installed models.

### 3. Test Screenshot

```bash
scrot /tmp/test_screenshot.png
ls -lh /tmp/test_screenshot.png
```

The `ls` output should show a non-empty screenshot file.

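
A file can exist and still not be a valid image, for example when the capture was interrupted. A quick sanity check is to compare the first eight bytes against the PNG signature. The sketch below does only that; it does not decode the image, and `looks_like_png` is an illustrative name.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def looks_like_png(path):
    """Return True if the file exists and starts with the PNG signature."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


if __name__ == "__main__":
    print(looks_like_png("/tmp/test_screenshot.png"))
```
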
### 4. Run Test

```bash
# Start claude-vision
claude-vision

# You should see:
# [Claude Vision Auto] Testing Ollama connection...
# [Claude Vision Auto] Connected to Ollama
# [Claude Vision Auto] Using model: minicpm-v:latest
```

## Troubleshooting

### "claude-vision: command not found"

Add `~/.local/bin` to your `PATH` in `~/.bashrc` or `~/.zshrc`:

```bash
export PATH="$HOME/.local/bin:$PATH"
```

Then reload the shell configuration:

```bash
source ~/.bashrc  # or source ~/.zshrc
```

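
To confirm that the change took effect, you can check whether `~/.local/bin` appears among the `PATH` entries. A minimal sketch, with `dir_on_path` as an illustrative helper name:

```python
import os


def dir_on_path(directory, path_env):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [e for e in path_env.split(os.pathsep) if e]
    return any(os.path.normpath(os.path.expanduser(e)) == target for e in entries)


if __name__ == "__main__":
    on_path = dir_on_path("~/.local/bin", os.environ.get("PATH", ""))
    print("~/.local/bin is on PATH" if on_path else "~/.local/bin is missing from PATH")
```

Run it from a new terminal, or after `source`, so that it sees the updated environment.
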
### "Cannot connect to Ollama"

Check whether the Ollama container is running:

```bash
docker ps | grep ollama

# If not running, start it:
docker start ollama
```

Check whether port 11434 is listening:

```bash
netstat -tulpn | grep 11434
# or
ss -tulpn | grep 11434
```

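
If neither `netstat` nor `ss` is available, a plain TCP connection attempt gives the same answer from the client's side. The sketch below assumes the default host and port from this guide; `port_open` is an illustrative name.

```python
import socket


def port_open(host="localhost", port=11434, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("port 11434 reachable" if port_open() else "port 11434 not reachable")
```

A `False` result while `docker ps` shows the container running usually points at the `-p 11434:11434` mapping or a firewall rather than at Ollama itself.
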
### "Model not found"

Pull the model:

```bash
docker exec ollama ollama pull minicpm-v:latest
```

List the models that are already installed:

```bash
docker exec ollama ollama list
```

### "Screenshot failed"

Install `scrot`:

```bash
sudo apt-get install scrot
```

Test a screenshot of the focused window:

```bash
scrot -u /tmp/test.png
```

If the error persists, switch to an alternative tool via the `SCREENSHOT_TOOL` environment variable:

```bash
export SCREENSHOT_TOOL="gnome-screenshot"
claude-vision
```

### Permission Issues

If `pip install` fails with a permissions error:

```bash
# Install for user only
pip3 install --user -e .

# Or use virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -e .
```

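
When using the virtual-environment route, it is easy to forget to activate it in a new shell. The interpreter can report whether it is running inside a `venv` by comparing `sys.prefix` with `sys.base_prefix`. A minimal sketch:

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("inside a virtual environment" if in_virtualenv() else "using the system Python")
```

If this prints "using the system Python" after `source venv/bin/activate`, check that `python3` resolves to the interpreter under `venv/bin`.
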
### Docker Permission Denied

Add your user to the `docker` group:

```bash
sudo usermod -aG docker $USER
```

Log out and back in, then:

```bash
docker ps  # Should work without sudo
```

## Uninstallation

### Remove Package

```bash
make uninstall
# or
pip3 uninstall claude-vision-auto
```

### Remove Ollama

```bash
docker stop ollama
docker rm ollama
docker rmi ollama/ollama
```

### Remove System Dependencies

```bash
sudo apt-get remove scrot
```

## Next Steps

After successful installation:

1. Read [USAGE.md](USAGE.md) for usage examples
2. Configure environment variables if needed
3. Test with a simple Claude Code command

## Getting Help

If you encounter issues not covered here:

1. Check the main [README.md](../README.md)
2. Enable debug mode: `DEBUG=true claude-vision`
3. Check logs: `~/.cache/claude-vision-auto/`
4. Report issues: https://git.openharbor.io/svrnty/claude-vision-auto/issues

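
When gathering information for an issue report, it can help to find the most recently written files in the log directory. The sketch below only lists files by modification time; the names and format of the files inside `~/.cache/claude-vision-auto/` are not documented here, so it makes no assumption about them. `newest_files` is an illustrative helper.

```python
from pathlib import Path

LOG_DIR = Path.home() / ".cache" / "claude-vision-auto"  # directory named above


def newest_files(directory=LOG_DIR, limit=5):
    """Return up to `limit` regular files in `directory`, newest first."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in newest_files():
        print(path)
```
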