# Installation Guide

Detailed installation instructions for Claude Vision Auto.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [System Dependencies](#system-dependencies)
3. [Ollama Setup](#ollama-setup)
4. [Package Installation](#package-installation)
5. [Verification](#verification)
6. [Troubleshooting](#troubleshooting)
7. [Uninstallation](#uninstallation)
8. [Next Steps](#next-steps)
9. [Getting Help](#getting-help)

## Prerequisites

### 1. Claude Code CLI

Install Anthropic's official CLI:

```bash
npm install -g @anthropic-ai/claude-code
```

Verify installation:

```bash
claude --version
```

### 2. Python 3.8+

Check Python version:

```bash
python3 --version
```
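The version check can also be scripted so that an interpreter that is too old is reported explicitly. This is a minimal sketch, assuming `python3 --version` prints `Python X.Y.Z`; `version_at_least` is a hypothetical helper name, not part of Claude Vision Auto:

```bash
# Hypothetical helper: succeeds when version $1 (X.Y or X.Y.Z) is at least $2 (X.Y).
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Compare the installed interpreter against the 3.8 minimum.
if command -v python3 >/dev/null 2>&1; then
    installed=$(python3 --version 2>&1 | awk '{print $2}')
    if version_at_least "$installed" 3.8; then
        echo "Python $installed is new enough"
    else
        echo "Python $installed is older than 3.8" >&2
    fi
else
    echo "python3 not found" >&2
fi
```

The comparison is done on the numeric major and minor parts, so `3.10` is correctly treated as newer than `3.8`.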

If Python is not installed (Debian/Ubuntu):

```bash
sudo apt-get update
sudo apt-get install python3 python3-pip
```

### 3. Docker (for Ollama)

Install Docker:

```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
```

Log out and back in for group changes to take effect.
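Whether the new group membership is active in the current session can be checked with `id -nG`. The sketch below assumes the group is named `docker`, as in the command above; `has_group` is a hypothetical helper introduced only for illustration:

```bash
# Hypothetical helper: succeeds when group $2 appears in the space-separated list $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# `id -nG` prints the group names of the current session.
if has_group "$(id -nG)" docker; then
    echo "docker group is active in this session"
else
    echo "docker group is not active yet; log out and back in" >&2
fi
```

Padding both strings with spaces makes the match whole-word, so a group such as `dockerx` does not count as `docker`.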

## System Dependencies

### Screenshot Tool

Install `scrot` (recommended):

```bash
sudo apt-get update
sudo apt-get install -y scrot
```

Alternative screenshot tools:

```bash
# GNOME Screenshot
sudo apt-get install -y gnome-screenshot

# ImageMagick
sudo apt-get install -y imagemagick

# Maim
sudo apt-get install -y maim xdotool
```
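To see which of these tools is already installed, a short probe over the candidate commands may help. This is a sketch, not the package's own detection logic; `first_available` is a hypothetical helper, and the candidate list simply mirrors the tools named above (ImageMagick's capture command is `import`):

```bash
# Hypothetical helper: print the first candidate command found on PATH.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Probe the tools in order of preference.
if tool=$(first_available scrot gnome-screenshot import maim); then
    echo "Screenshot tool available: $tool"
else
    echo "No screenshot tool found; install scrot" >&2
fi
```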

### Additional Dependencies

```bash
sudo apt-get install -y \
    python3-pip \
    git \
    curl
```

## Ollama Setup

### 1. Pull Ollama Docker Image

```bash
docker pull ollama/ollama:latest
```

### 2. Start Ollama Container

```bash
docker run -d \
    -p 11434:11434 \
    --name ollama \
    --restart unless-stopped \
    ollama/ollama:latest
```

For GPU support (NVIDIA), use this command instead. It requires the NVIDIA Container Toolkit on the host:

```bash
docker run -d \
    -p 11434:11434 \
    --name ollama \
    --gpus all \
    --restart unless-stopped \
    ollama/ollama:latest
```
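The container may need a few seconds before the API on port 11434 responds, so scripted setups often poll before pulling a model. The following is a generic sketch; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary choices:

```bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: poll the tags endpoint up to 15 times, 2 seconds apart.
# retry 15 2 curl -sf -o /dev/null http://localhost:11434/api/tags
```

The helper returns the success of the first passing attempt and a non-zero status once all attempts are exhausted, so it can gate later steps with `&&`.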

### 3. Pull Vision Model

```bash
# MiniCPM-V (recommended - 5.5GB)
docker exec ollama ollama pull minicpm-v:latest

# Alternative: Llama 3.2 Vision (7.8GB)
docker exec ollama ollama pull llama3.2-vision:latest

# Alternative: LLaVA (4.5GB)
docker exec ollama ollama pull llava:latest
```

### 4. Verify Ollama

```bash
# Check container status
docker ps | grep ollama

# Test API
curl http://localhost:11434/api/tags

# List installed models
curl -s http://localhost:11434/api/tags | python3 -m json.tool
```

## Package Installation

### Method 1: Using Makefile (Recommended)

```bash
cd claude-vision-auto

# Install system dependencies
make deps

# Install package
make install
```

### Method 2: Manual Installation

```bash
cd claude-vision-auto

# Install system dependencies
sudo apt-get update
sudo apt-get install -y scrot python3-pip

# Install Python package
pip3 install -e .
```

### Method 3: From Git

```bash
# Clone repository
git clone https://git.openharbor.io/svrnty/claude-vision-auto.git
cd claude-vision-auto

# Install
pip3 install -e .
```

## Verification

### 1. Check Command Installation

```bash
which claude-vision
```

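Expected output for a user-level install (the path differs inside a virtual environment or for a system-wide install): `/home/username/.local/bin/claude-vision`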

### 2. Test Ollama Connection

```bash
curl http://localhost:11434/api/tags
```

This should return JSON listing the installed models.

### 3. Test Screenshot

```bash
scrot /tmp/test_screenshot.png
ls -lh /tmp/test_screenshot.png
```

This should create a screenshot file and print its size.
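A file can exist and still be empty or corrupt, for example when no display is available. One way to check the result is to look for the 8-byte PNG signature at the start of the file. This is a sketch; `is_png` is a hypothetical helper, and the path is the one used in the test above:

```bash
# Hypothetical helper: succeeds when file $1 starts with the 8-byte PNG signature.
is_png() {
    signature=$(od -An -tx1 -N8 "$1" 2>/dev/null | tr -d '[:space:]')
    [ "$signature" = "89504e470d0a1a0a" ]
}

# Inspect the screenshot created by the test above, if it exists.
if [ -f /tmp/test_screenshot.png ]; then
    if is_png /tmp/test_screenshot.png; then
        echo "Screenshot is a valid PNG"
    else
        echo "File exists but is not a PNG" >&2
    fi
fi
```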

### 4. Run Test

```bash
# Start claude-vision
claude-vision

# You should see:
# [Claude Vision Auto] Testing Ollama connection...
# [Claude Vision Auto] Connected to Ollama
# [Claude Vision Auto] Using model: minicpm-v:latest
```
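The command checks in this guide can be folded into a single preflight pass that reports everything missing at once. This is a sketch under the assumption that the commands are named as in this guide (`claude`, `python3`, `docker`, `scrot`, `claude-vision`); `missing_commands` is a hypothetical helper:

```bash
# Hypothetical helper: print each named command that is not on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

missing=$(missing_commands claude python3 docker scrot claude-vision)
if [ -z "$missing" ]; then
    echo "All required commands are installed"
else
    echo "Missing commands:" >&2
    printf '%s\n' "$missing" | sed 's/^/  /' >&2
fi
```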

## Troubleshooting

### "claude-vision: command not found"

Add `~/.local/bin` to your `PATH` in `~/.bashrc` or `~/.zshrc`:

```bash
export PATH="$HOME/.local/bin:$PATH"
```

Then reload:

```bash
source ~/.bashrc  # or source ~/.zshrc
```
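To confirm that the directory is actually on `PATH` after reloading, the variable can be tested entry by entry. This is a minimal sketch; `path_contains` is a hypothetical helper, and `~/.local/bin` is assumed to be the install location, as in the export line above:

```bash
# Hypothetical helper: succeeds when directory $1 is an entry of the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/.local/bin" "$PATH"; then
    echo "$HOME/.local/bin is on PATH"
else
    echo "$HOME/.local/bin is missing from PATH; add the export line to your shell profile" >&2
fi
```

Wrapping both values in colons ensures that only a complete entry matches, not a prefix such as `~/.local/bin2`.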

### "Cannot connect to Ollama"

Check whether the Ollama container is running:

```bash
docker ps | grep ollama

# If not running, start it:
docker start ollama
```

Check whether anything is listening on port 11434:

```bash
netstat -tulpn | grep 11434
# or
ss -tulpn | grep 11434
```

### "Model not found"

Pull the model:

```bash
docker exec ollama ollama pull minicpm-v:latest
```

List available models:

```bash
docker exec ollama ollama list
```
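The pull can be made conditional on the model being absent. The sketch below assumes that `ollama list` prints a header row followed by one row per model with the model name in the first column; `model_listed` is a hypothetical helper:

```bash
# Hypothetical helper: read `ollama list` output on stdin and succeed when
# model $1 appears in the first (NAME) column. The header row is skipped.
model_listed() {
    awk -v model="$1" 'NR > 1 && $1 == model { found = 1 } END { exit !found }'
}

# Example, assuming the container is named "ollama" as in this guide:
# docker exec ollama ollama list | model_listed minicpm-v:latest \
#     || docker exec ollama ollama pull minicpm-v:latest
```

The match is exact, so the tag (for example `:latest`) must be included in the name passed to the helper.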

### "Screenshot failed"

Install `scrot`:

```bash
sudo apt-get install -y scrot
```

Test a screenshot of the focused window:

```bash
scrot -u /tmp/test.png
```

If the error persists, select an alternative tool with the `SCREENSHOT_TOOL` environment variable:

```bash
export SCREENSHOT_TOOL="gnome-screenshot"
claude-vision
```

### Permission Issues

If `pip install` fails with a permission error:

```bash
# Install for user only
pip3 install --user -e .

# Or use virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -e .
```

### Docker Permission Denied

Add your user to the `docker` group:

```bash
sudo usermod -aG docker $USER
```

Log out and back in, then:

```bash
docker ps  # Should work without sudo
```

## Uninstallation

### Remove Package

```bash
make uninstall
# or
pip3 uninstall claude-vision-auto
```

### Remove Ollama

```bash
docker stop ollama
docker rm ollama
docker rmi ollama/ollama
```

### Remove System Dependencies

```bash
sudo apt-get remove scrot
```

## Next Steps

After successful installation:

1. Read [USAGE.md](USAGE.md) for usage examples
2. Configure environment variables if needed
3. Test with a simple Claude Code command

## Getting Help

If you encounter issues not covered here:

1. Check the main [README.md](../README.md)
2. Enable debug mode: `DEBUG=true claude-vision`
3. Check logs: `~/.cache/claude-vision-auto/`
4. Report issues: https://git.openharbor.io/svrnty/claude-vision-auto/issues