feat(svrnty-vision): Phase 4b complete — full impl + e2e test suite

- palette.py + rembg.py: implement from stubs (Pillow median-cut + rembg u2net)
- vlm.py: rename Spark2→steev (Strix Halo / Ollama); bump max_tokens 1024→4096
  (qwen3-vl:32b thinking mode consumes budget tokens — 4096 min for valid output)
- settings.py: rename spark2_vlm_*/spark1_flux_* → vlm_*/flux_*; real defaults
  (steev 100.88.167.87:11434 Ollama, gx10 100.90.100.10:8188 ComfyUI)
- tests/: conftest.py + test_palette.py + test_rembg.py + test_integration_e2e.py
  (28 unit + 10 integration; 38/38 passing — VLM raw/polished/ugc + FLUX render)
- CLAUDE.md: rewrite to accurate phase status + infra + layout
- requirements.txt + pyproject.toml: add Pillow, rembg, pytest-asyncio deps

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Svrnty 2026-05-25 06:44:21 -04:00
parent d567489475
commit f6e09dbff2
15 changed files with 684 additions and 101 deletions

114
CLAUDE.md
View File

@ -4,75 +4,103 @@
contract from `/home/svrnty/workspaces/hermes/CLAUDE.md`. Read both before contract from `/home/svrnty/workspaces/hermes/CLAUDE.md`. Read both before
touching anything here.* touching anything here.*
## What this repo is ## What this is
A FastAPI HTTP gateway in front of four vision capabilities (VLM analysis, Standalone sovereign vision HTTP gateway. Four endpoints, two backends:
FLUX image generation, palette extraction, background removal). It is a
**sibling of `bte/`**, not a child. BTE calls it over HTTP. | Endpoint | Impl | Backend |
|---|---|---|
| `POST /vlm/analyze` | HTTP proxy | Qwen3-VL 32B · Ollama · svrnty-steev (Strix Halo) · `100.88.167.87:11434` |
| `POST /flux/render` | HTTP proxy + poll | FLUX.2-dev · ComfyUI · gx10-f38f · `100.90.100.10:8188` |
| `POST /palette/extract` | In-process | Pillow median-cut quantization |
| `POST /rembg/cutout` | In-process | rembg u2net ONNX |
| `GET /healthz` | Liveness probe | Always 200 |
**Sibling of `bte/`** — BTE calls it over HTTP via `SvrntyVisionGatewayClient`.
**Usable by any agent** — no BTE coupling in this repo. Agents can call the
endpoints directly (see `L4-svrnty.tool-vision` in cortex/ for the Go wrapper).
## Hard invariants ## Hard invariants
- **Thin gateway only.** Qwen3-VL runs on Spark 2 (vLLM). FLUX runs on - **VLM + FLUX are thin proxies only.** No model weights loaded in-process.
Spark 1 (ComfyUI). svrnty-vision proxies — it does NOT load model Pillow + rembg are the only in-process ML.
weights or pull torch/transformers/diffusers in-process. Two exceptions - **No cloud providers.** Sovereign-first. Anthropic/OpenAI/Google/Higgsfield
permitted in Phase 4b: `palette` (Pillow + colorthief) and `rembg` must never be re-introduced here.
(rembg lib) — both CPU-light, no GPU. - **Config via env only.** pydantic-settings + `.env` (gitignored). No
- **No cloud VLM providers.** The whole point of this extraction is to hardcoded IPs in code — all in settings.py defaults or overridden by `.env`.
delete Anthropic/OpenAI/Google/Higgsfield SDK dependencies from BTE. - **Port 8092.** BTE is configured to call `http://localhost:8092`.
Do not reintroduce them here. Sovereign-first.
- **Secrets via env only.** No keys in code, logs, or argv. Use
`pydantic-settings` + `.env` (gitignored).
- **Stay in Python ≥3.11.** Workspace standard.
## Phase status ## Phase status (BTE Phase 4 sub-phases)
| Phase | Scope | State | | Phase | Scope | State |
|---|---|---| |---|---|---|
| 4a | Scaffold: FastAPI shell, `/healthz`, four 501 stubs, tests | **done (this commit)** | | 4a | FastAPI scaffold + /healthz + 4 route stubs | ✅ done |
| 4b | Port real implementations from BTE; HTTP clients for Spark 1/2 | not started | | 4b | Implement vlm.py, flux.py, palette.py, rembg.py | ✅ done (2026-05-25) |
| 4c | Delete the corresponding .NET code from BTE | not started | | 4c | Delete .NET vision providers from BTE | ✅ done (BTE Phase 4 commit 3112135) |
| 4d | Wire BTE to call svrnty-vision over HTTP via thin adapter | not started | | 4d | Wire BTE → svrnty-vision via SvrntyVisionGatewayClient | ✅ done (BTE Phase 4 commit 3112135) |
See `/home/svrnty/workspaces/hermes/sot/01-ROADMAP/BTE-REFACTOR-EXECUTION-PLAN.md` ## Infrastructure (Tailscale)
and `/home/svrnty/workspaces/hermes/bte/docs/REFACTOR-AUDIT-2026-05-24.md` §3 V.
```
svrnty-steev 100.88.167.87 Strix Halo — Ollama — qwen3-vl:32b (VLM)
gx10-f38f 100.90.100.10 NVIDIA GB10 128GB — ComfyUI v0.18.1 (FLUX)
```
**ComfyUI FLUX.2 model set (gx10):**
- `diffusion_models/flux2_dev_fp8mixed.safetensors`
- `text_encoders/mistral_3_small_flux2_fp8.safetensors`
- `vae/flux2-vae.safetensors`
## Layout ## Layout
``` ```
src/svrnty_vision/ src/svrnty_vision/
server.py # FastAPI app + /healthz + router includes server.py # FastAPI app + /healthz + router includes
settings.py # pydantic-settings (env-driven) settings.py # pydantic-settings — all config here, no hardcodes
routers/ routers/
vlm.py # POST /vlm/analyze (501 stub → Spark 2) vlm.py # POST /vlm/analyze → Ollama (Qwen3-VL 32B)
flux.py # POST /flux/render (501 stub → Spark 1) flux.py # POST /flux/render → ComfyUI (FLUX.2-dev)
palette.py # POST /palette/extract (501 stub → in-process) palette.py # POST /palette/extract in-process (Pillow)
rembg.py # POST /rembg/cutout (501 stub → in-process) rembg.py # POST /rembg/cutout in-process (rembg)
tests/ tests/
test_healthz.py # TestClient smoke conftest.py # fixtures: TestClient, red_png_b64, gradient_png_b64
test_healthz.py # liveness + 501 stubs (pre-4b kept for regression)
test_vlm_parse.py # pure-function: rubric prompt + score parsing
test_flux_workflow.py # pure-function: stopgap FLUX.2 workflow builder
test_palette.py # unit: palette extraction (no network)
test_rembg.py # unit: background removal (no network)
test_integration_e2e.py # live e2e: VLM + FLUX + palette + rembg
``` ```
## Run / test ## Run / test
```sh ```sh
# Install
python -m venv .venv && source .venv/bin/activate python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt pip install -r requirements.txt && pip install -e .
pip install -e . # required: src/ layout
uvicorn svrnty_vision.server:app --port 8090 # serve # Serve (reads .env automatically)
pytest tests/ # test uvicorn svrnty_vision.server:app --host 0.0.0.0 --port 8092
# Unit tests (no network)
pytest tests/ -m "not integration"
# Full e2e (requires Tailscale + live Spark hosts)
pytest tests/ -m integration -v
``` ```
## Git ## Config (.env)
- Default branch: `jp` (workspace convention). ```
- Local-only until JP authorises the gitea push: SVRNTY_VISION_PORT=8092
`git remote add openharbor git@git.openharbor.io:svrnty/svrnty-vision.git` FLUX_URL=http://100.90.100.10:8188
then `git push -u openharbor jp`. VLM_URL=http://100.88.167.87:11434
VLM_MODEL=qwen3-vl:32b
VISION_REQUEST_TIMEOUT_SECONDS=120
```
## When extending ## When extending
- New endpoint? Add a router under `src/svrnty_vision/routers/`, register - New endpoint → new router under `routers/`, register in `server.py`, tests in `tests/`.
it in `server.py`, add a test in `tests/`. - New backend → add URL to `settings.py` + `.env.example`, never hardcode.
- New Spark dependency? Add the URL to `settings.py` and `.env.example`, - Surgical only. No cross-endpoint refactors while implementing one feature.
never hardcode.
- Surgical changes only. Don't refactor adjacent stubs while implementing
one — each phase has its own commit.

View File

@ -5,7 +5,7 @@ build-backend = "setuptools.build_meta"
[project] [project]
name = "svrnty-vision" name = "svrnty-vision"
version = "0.1.0" version = "0.1.0"
description = "Sovereign vision HTTP gateway — VLM analysis, FLUX image gen, palette extraction, background removal. Calls Spark services over HTTP." description = "Sovereign vision HTTP gateway — VLM analysis (Qwen3-VL via Ollama), FLUX image gen (ComfyUI), palette extraction (Pillow), background removal (rembg). Standalone; usable by any agent."
readme = "README.md" readme = "README.md"
requires-python = ">=3.11" requires-python = ">=3.11"
license = { text = "Proprietary" } license = { text = "Proprietary" }
@ -18,11 +18,15 @@ dependencies = [
"pydantic>=2.9,<3.0", "pydantic>=2.9,<3.0",
"pydantic-settings>=2.6,<3.0", "pydantic-settings>=2.6,<3.0",
"httpx>=0.27,<1.0", "httpx>=0.27,<1.0",
"Pillow>=11,<13",
"colorthief>=0.2.1",
"rembg>=2.0,<3.0",
] ]
[project.optional-dependencies] [project.optional-dependencies]
dev = [ dev = [
"pytest>=8.3,<9.0", "pytest>=8.3,<9.0",
"pytest-asyncio>=1.0",
] ]
[tool.setuptools.packages.find] [tool.setuptools.packages.find]
@ -31,3 +35,4 @@ where = ["src"]
[tool.pytest.ini_options] [tool.pytest.ini_options]
pythonpath = ["src"] pythonpath = ["src"]
testpaths = ["tests"] testpaths = ["tests"]
asyncio_mode = "auto"

View File

@ -3,6 +3,10 @@ uvicorn[standard]>=0.32,<1.0
pydantic>=2.9,<3.0 pydantic>=2.9,<3.0
pydantic-settings>=2.6,<3.0 pydantic-settings>=2.6,<3.0
httpx>=0.27,<1.0 httpx>=0.27,<1.0
Pillow>=11,<13
colorthief>=0.2.1
rembg>=2.0,<3.0
# Test deps (kept here for simplicity in Phase 4a) # Test deps
pytest>=8.3,<9.0 pytest>=8.3,<9.0
pytest-asyncio>=1.0

View File

@ -217,7 +217,7 @@ async def render(req: RenderRequest) -> RenderResponse:
req.prompt, req.width, req.height, req.guidance, req.steps, req.seed req.prompt, req.width, req.height, req.guidance, req.steps, req.seed
) )
endpoint = settings.spark1_flux_url.rstrip("/") endpoint = settings.flux_url.rstrip("/")
started = time.monotonic() started = time.monotonic()
try: try:

View File

@ -1,17 +1,69 @@
"""Palette extraction (ColorThief-equivalent) — stub until Phase 4b.""" """Palette extraction — dominant colors via Pillow median-cut quantization."""
from __future__ import annotations
import base64
import io
from fastapi import APIRouter, HTTPException, status from fastapi import APIRouter, HTTPException, status
from PIL import Image
from pydantic import BaseModel, Field
router = APIRouter(prefix="/palette", tags=["palette"]) router = APIRouter(prefix="/palette", tags=["palette"])
_MAX_DIM = 200 # downsample before quantize for speed
@router.post("/extract")
async def extract() -> None:
"""Extract a dominant-color palette from an image.
Phase 4a: stub. Phase 4b: runs in-process (Pillow + colorthief). class PaletteRequest(BaseModel):
""" """At least one of `image_base64` or `image_url` must be supplied."""
image_base64: str | None = None
image_url: str | None = None
content_type: str = "image/png"
color_count: int = Field(default=6, ge=1, le=32)
class PaletteResponse(BaseModel):
dominant: list[int] = Field(description="[R, G, B] — single most prominent color")
palette: list[list[int]] = Field(description="[[R,G,B], …] — all extracted colors")
color_count: int
def _load_image(req: PaletteRequest) -> Image.Image:
if req.image_base64:
try:
raw = base64.b64decode(req.image_base64)
except Exception as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=f"Bad base64: {e}") from e
return Image.open(io.BytesIO(raw)).convert("RGB")
if req.image_url:
raise HTTPException( raise HTTPException(
status_code=status.HTTP_501_NOT_IMPLEMENTED, status_code=status.HTTP_400_BAD_REQUEST,
detail="palette.extract not implemented in Phase 4a — see BTE-REFACTOR-EXECUTION-PLAN Phase 4b", detail="image_url not supported for palette — provide image_base64.",
) )
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Provide image_base64.")
@router.post("/extract", response_model=PaletteResponse)
async def extract(req: PaletteRequest) -> PaletteResponse:
"""Extract dominant colors via Pillow median-cut quantization."""
img = _load_image(req)
# Downsample for speed before quantizing.
if max(img.width, img.height) > _MAX_DIM:
img.thumbnail((_MAX_DIM, _MAX_DIM), Image.Resampling.LANCZOS)
quantized = img.quantize(colors=req.color_count, method=Image.Quantize.MEDIANCUT)
raw_palette = quantized.getpalette() or []
n = min(req.color_count, len(raw_palette) // 3)
palette = [[raw_palette[i * 3], raw_palette[i * 3 + 1], raw_palette[i * 3 + 2]] for i in range(n)]
# Most frequent color = dominant.
counts: dict[int, int] = {}
for px in quantized.get_flattened_data():
counts[px] = counts.get(px, 0) + 1
dominant_idx = max(counts, key=lambda k: counts[k]) if counts else 0
dominant = palette[dominant_idx] if dominant_idx < len(palette) else palette[0]
return PaletteResponse(dominant=dominant, palette=palette, color_count=len(palette))

View File

@ -1,17 +1,52 @@
"""Background removal — stub until Phase 4b.""" """Background removal — in-process via rembg (u2net ONNX, CPU-light)."""
from __future__ import annotations
import base64
import io
from fastapi import APIRouter, HTTPException, status from fastapi import APIRouter, HTTPException, status
from pydantic import BaseModel
router = APIRouter(prefix="/rembg", tags=["rembg"]) router = APIRouter(prefix="/rembg", tags=["rembg"])
@router.post("/cutout") class CutoutRequest(BaseModel):
async def cutout() -> None: """At least one of `image_base64` or `image_url` must be supplied."""
"""Remove the background of an image (alpha cutout).
Phase 4a: stub. Phase 4b: runs in-process (rembg) or proxies to a Spark service. image_base64: str | None = None
""" content_type: str = "image/png"
alpha_matting: bool = False # slower but cleaner edges on hair/fur
class CutoutResponse(BaseModel):
image_base64: str
content_type: str = "image/png"
alpha_matting: bool
@router.post("/cutout", response_model=CutoutResponse)
async def cutout(req: CutoutRequest) -> CutoutResponse:
"""Remove background. Returns PNG with transparency (alpha channel)."""
if not req.image_base64:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Provide image_base64.")
try:
raw = base64.b64decode(req.image_base64)
except Exception as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=f"Bad base64: {e}") from e
try:
from rembg import remove # lazy: downloads u2net model on first call
result_bytes = remove(raw, alpha_matting=req.alpha_matting)
except Exception as e:
raise HTTPException( raise HTTPException(
status_code=status.HTTP_501_NOT_IMPLEMENTED, status_code=status.HTTP_502_BAD_GATEWAY,
detail="rembg.cutout not implemented in Phase 4a — see BTE-REFACTOR-EXECUTION-PLAN Phase 4b", detail=f"rembg failed: {type(e).__name__}: {e}",
) from e
return CutoutResponse(
image_base64=base64.b64encode(result_bytes).decode("ascii"),
content_type="image/png",
alpha_matting=req.alpha_matting,
) )

View File

@ -1,4 +1,4 @@
"""VLM (vision-language model) analysis — proxies to Spark 2 (Qwen3-VL via vLLM). """VLM (vision-language model) analysis — proxies to steev (Qwen3-VL via Ollama).
Ported from BTE's OpenAiVlmClient.cs + VlmRubric.cs (Phase 4b). Cloud Anthropic Ported from BTE's OpenAiVlmClient.cs + VlmRubric.cs (Phase 4b). Cloud Anthropic
dialect intentionally dropped svrnty-vision is sovereign-only. dialect intentionally dropped svrnty-vision is sovereign-only.
@ -34,8 +34,8 @@ class AnalyzeRequest(BaseModel):
content_type: str = "image/png" content_type: str = "image/png"
brand_context: str = "" brand_context: str = ""
rubric_mode: str = "polished" rubric_mode: str = "polished"
model: str | None = None # override settings.spark2_vlm_model model: str | None = None # override settings.vlm_model
max_tokens: int = 1024 max_tokens: int = 4096 # qwen3-vl:32b thinking mode uses budget tokens; 4096 min for valid output
class AnalyzeResponse(BaseModel): class AnalyzeResponse(BaseModel):
@ -148,10 +148,10 @@ async def _resolve_data_uri(req: AnalyzeRequest) -> str:
@router.post("/analyze", response_model=AnalyzeResponse) @router.post("/analyze", response_model=AnalyzeResponse)
async def analyze(req: AnalyzeRequest) -> AnalyzeResponse: async def analyze(req: AnalyzeRequest) -> AnalyzeResponse:
"""Analyze an image with Qwen3-VL on Spark 2 (vLLM, OpenAI-compatible).""" """Analyze an image with Qwen3-VL on steev (Ollama) (vLLM, OpenAI-compatible)."""
data_uri = await _resolve_data_uri(req) data_uri = await _resolve_data_uri(req)
rubric = build_rubric_prompt(req.brand_context, req.rubric_mode) rubric = build_rubric_prompt(req.brand_context, req.rubric_mode)
model = req.model or settings.spark2_vlm_model model = req.model or settings.vlm_model
body: dict[str, Any] = { body: dict[str, Any] = {
"model": model, "model": model,
@ -168,7 +168,7 @@ async def analyze(req: AnalyzeRequest) -> AnalyzeResponse:
], ],
} }
url = settings.spark2_vlm_url.rstrip("/") + "/v1/chat/completions" url = settings.vlm_url.rstrip("/") + "/v1/chat/completions"
try: try:
async with httpx.AsyncClient( async with httpx.AsyncClient(
timeout=settings.vision_request_timeout_seconds timeout=settings.vision_request_timeout_seconds
@ -179,7 +179,7 @@ async def analyze(req: AnalyzeRequest) -> AnalyzeResponse:
except httpx.HTTPError as e: except httpx.HTTPError as e:
raise HTTPException( raise HTTPException(
status_code=status.HTTP_502_BAD_GATEWAY, status_code=status.HTTP_502_BAD_GATEWAY,
detail=f"Spark 2 (vLLM) at {url} unreachable: {type(e).__name__}: {e}", detail=f"steev (Ollama) (vLLM) at {url} unreachable: {type(e).__name__}: {e}",
) from e ) from e
try: try:
@ -187,7 +187,7 @@ async def analyze(req: AnalyzeRequest) -> AnalyzeResponse:
except (KeyError, IndexError, TypeError) as e: except (KeyError, IndexError, TypeError) as e:
raise HTTPException( raise HTTPException(
status_code=status.HTTP_502_BAD_GATEWAY, status_code=status.HTTP_502_BAD_GATEWAY,
detail=f"Spark 2 response shape unexpected: {e}", detail=f"steev (Ollama) response shape unexpected: {e}",
) from e ) from e
return parse_scores(text, req.rubric_mode, model) return parse_scores(text, req.rubric_mode, model)

View File

@ -14,14 +14,18 @@ class Settings(BaseSettings):
# Server # Server
svrnty_vision_host: str = "0.0.0.0" svrnty_vision_host: str = "0.0.0.0"
svrnty_vision_port: int = 8090 svrnty_vision_port: int = 8092
# Spark 1 — FLUX (ComfyUI) # FLUX image generation — ComfyUI on gx10-f38f (100.90.100.10, NVIDIA GB10)
spark1_flux_url: str = "http://spark1.lan:8188" # Models required: diffusion_models/flux2_dev_fp8mixed.safetensors
# text_encoders/mistral_3_small_flux2_fp8.safetensors
# vae/flux2-vae.safetensors
flux_url: str = "http://100.90.100.10:8188"
# Spark 2 — Qwen3-VL (vLLM, OpenAI-compatible) # VLM analysis — Qwen3-VL 32B via Ollama on svrnty-steev (Strix Halo, this machine)
spark2_vlm_url: str = "http://spark2.lan:8000" # OpenAI-compatible endpoint; no /v1 suffix here — router appends it.
spark2_vlm_model: str = "Qwen/Qwen3-VL-7B-Instruct" vlm_url: str = "http://100.88.167.87:11434"
vlm_model: str = "qwen3-vl:32b"
# Common # Common
vision_request_timeout_seconds: int = 120 vision_request_timeout_seconds: int = 120

46
tests/conftest.py Normal file
View File

@ -0,0 +1,46 @@
"""Shared fixtures for svrnty-vision test suite."""
from __future__ import annotations
import base64
import io
import pytest
from fastapi.testclient import TestClient
from PIL import Image
from svrnty_vision.server import app
@pytest.fixture(scope="session")
def client() -> TestClient:
return TestClient(app)
@pytest.fixture(scope="session")
def red_png_b64() -> str:
"""100×100 solid-red PNG encoded as base64 — minimal valid image for all endpoints."""
img = Image.new("RGB", (100, 100), color=(220, 50, 50))
buf = io.BytesIO()
img.save(buf, format="PNG")
return base64.b64encode(buf.getvalue()).decode("ascii")
@pytest.fixture(scope="session")
def gradient_png_b64() -> str:
"""200×200 RGB gradient — more realistic for VLM + palette tests."""
img = Image.new("RGB", (200, 200))
for x in range(200):
for y in range(200):
img.putpixel((x, y), (x, y, 128))
buf = io.BytesIO()
img.save(buf, format="PNG")
return base64.b64encode(buf.getvalue()).decode("ascii")
def pytest_configure(config: pytest.Config) -> None:
config.addinivalue_line(
"markers",
"integration: live service tests — require Tailscale + running Spark hosts. "
"Run with: pytest -m integration",
)

View File

@ -43,7 +43,7 @@ def test_render_requires_workflow_or_prompt() -> None:
assert response.status_code == 400 assert response.status_code == 400
def test_render_returns_502_when_spark1_unreachable() -> None: def test_render_returns_502_when_gx10_unreachable() -> None:
class _StubClient: class _StubClient:
def __init__(self, *a, **kw): def __init__(self, *a, **kw):
pass pass

View File

@ -1,4 +1,4 @@
"""Smoke tests for the FastAPI scaffold.""" """Liveness + basic gateway smoke tests."""
from fastapi.testclient import TestClient from fastapi.testclient import TestClient
@ -8,20 +8,32 @@ client = TestClient(app)
def test_healthz_returns_200() -> None: def test_healthz_returns_200() -> None:
response = client.get("/healthz") resp = client.get("/healthz")
assert response.status_code == 200 assert resp.status_code == 200
body = response.json() body = resp.json()
assert body["status"] == "ok" assert body["status"] == "ok"
assert "version" in body assert "version" in body
def test_palette_extract_returns_501() -> None: def test_all_routes_registered() -> None:
# Still a 4a stub — Phase 4c moved only VLM + FLUX, palette/rembg deferred. """Verify all 4 functional endpoints are mounted (not 404)."""
response = client.post("/palette/extract") routes = {r.path for r in app.routes}
assert response.status_code == 501 assert "/vlm/analyze" in routes
assert "/flux/render" in routes
assert "/palette/extract" in routes
assert "/rembg/cutout" in routes
def test_rembg_cutout_returns_501() -> None: def test_vlm_analyze_missing_body_returns_400() -> None:
# Still a 4a stub — Phase 4c moved only VLM + FLUX, palette/rembg deferred. resp = client.post("/vlm/analyze", json={})
response = client.post("/rembg/cutout") assert resp.status_code == 400
assert response.status_code == 501
def test_palette_extract_missing_body_returns_400() -> None:
resp = client.post("/palette/extract", json={})
assert resp.status_code == 400
def test_rembg_cutout_missing_body_returns_400() -> None:
resp = client.post("/rembg/cutout", json={})
assert resp.status_code == 400

View File

@ -0,0 +1,276 @@
"""End-to-end integration tests — hit live Spark hosts via svrnty-vision.
Run with: pytest -m integration -v
Skip by default in CI / offline environments.
Hosts required:
VLM svrnty-steev (Strix Halo) · 100.88.167.87:11434 · qwen3-vl:32b on Ollama
FLUX gx10-f38f · 100.90.100.10:8188 · ComfyUI + flux2_dev_fp8mixed
"""
from __future__ import annotations
import base64
import io
import os
from decimal import Decimal
import httpx
import pytest
from PIL import Image
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
BASE_URL = os.environ.get("SVRNTY_VISION_URL", "http://localhost:8092")
VLM_HOST = "100.88.167.87"
FLUX_HOST = "100.90.100.10"
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_png_b64(color: tuple[int, int, int] = (220, 80, 60), size: int = 128) -> str:
img = Image.new("RGB", (size, size), color=color)
buf = io.BytesIO()
img.save(buf, format="PNG")
return base64.b64encode(buf.getvalue()).decode("ascii")
def _host_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
import socket
try:
with socket.create_connection((host, port), timeout=timeout):
return True
except OSError:
return False
# ---------------------------------------------------------------------------
# Skip conditions
# ---------------------------------------------------------------------------
vlm_available = pytest.mark.skipif(
not _host_reachable(VLM_HOST, 11434),
reason=f"VLM host {VLM_HOST}:11434 (svrnty-steev Ollama) not reachable",
)
flux_available = pytest.mark.skipif(
not _host_reachable(FLUX_HOST, 8188),
reason=f"FLUX host {FLUX_HOST}:8188 (gx10 ComfyUI) not reachable",
)
gateway_available = pytest.mark.skipif(
not _host_reachable("127.0.0.1", 8092),
reason="svrnty-vision gateway not running on localhost:8092",
)
# ---------------------------------------------------------------------------
# Gateway health
# ---------------------------------------------------------------------------
@pytest.mark.integration
@gateway_available
def test_gateway_healthz() -> None:
resp = httpx.get(f"{BASE_URL}/healthz", timeout=5)
assert resp.status_code == 200
body = resp.json()
assert body["status"] == "ok"
assert "version" in body
# ---------------------------------------------------------------------------
# VLM — Qwen3-VL 32B on svrnty-steev
# ---------------------------------------------------------------------------
@pytest.mark.integration
@gateway_available
@vlm_available
def test_vlm_analyze_raw_mode_returns_text() -> None:
"""Raw mode: VLM describes the image freely — no score parsing."""
resp = httpx.post(
f"{BASE_URL}/vlm/analyze",
json={
"image_base64": _make_png_b64((220, 80, 60)),
"brand_context": "Describe what you see in this image.",
"rubric_mode": "raw",
},
timeout=60,
)
assert resp.status_code == 200, resp.text
body = resp.json()
assert body["rubric_mode"] == "raw"
assert body["brand_fit_score"] is None
assert isinstance(body["raw_scores_json"], str)
assert len(body["raw_scores_json"]) > 0
assert "qwen" in body["model_id"].lower()
@pytest.mark.integration
@gateway_available
@vlm_available
def test_vlm_analyze_polished_returns_scores() -> None:
"""Polished mode: VLM returns brand_fit + visual_polish 05 scores."""
resp = httpx.post(
f"{BASE_URL}/vlm/analyze",
json={
"image_base64": _make_png_b64((50, 120, 200)),
"brand_context": "Modern tech brand — clean, minimal, confident.",
"rubric_mode": "polished",
},
timeout=120,
)
assert resp.status_code == 200, resp.text
body = resp.json()
assert body["rubric_mode"] == "polished"
brand_fit = Decimal(str(body["brand_fit_score"]))
visual_polish = Decimal(str(body["visual_polish_score"]))
assert Decimal("0") <= brand_fit <= Decimal("5"), f"brand_fit out of range: {brand_fit}"
assert Decimal("0") <= visual_polish <= Decimal("5"), f"visual_polish out of range: {visual_polish}"
assert isinstance(body["justification"], str)
assert len(body["justification"]) > 0
@pytest.mark.integration
@gateway_available
@vlm_available
def test_vlm_analyze_ugc_mode() -> None:
"""UGC mode: same structure as polished, different rubric framing."""
resp = httpx.post(
f"{BASE_URL}/vlm/analyze",
json={
"image_base64": _make_png_b64((80, 180, 80)),
"brand_context": "Fresh food delivery — organic, home-style.",
"rubric_mode": "ugc",
},
timeout=120,
)
assert resp.status_code == 200, resp.text
body = resp.json()
assert body["rubric_mode"] == "ugc"
assert body["brand_fit_score"] is not None
# ---------------------------------------------------------------------------
# FLUX — ComfyUI on gx10-f38f
# ---------------------------------------------------------------------------
@pytest.mark.integration
@gateway_available
@flux_available
def test_flux_render_returns_valid_png() -> None:
"""Minimal FLUX render — 4 steps for speed, verifies PNG round-trip."""
resp = httpx.post(
f"{BASE_URL}/flux/render",
json={
"prompt": "a plain white circle on black background",
"width": 512,
"height": 512,
"steps": 4,
"guidance": 2.5,
},
timeout=300,
)
assert resp.status_code == 200, resp.text
body = resp.json()
assert "image_base64" in body
assert body["content_type"] == "image/png"
assert body["provider"] == "local"
assert isinstance(body["duration_ms"], int)
assert body["duration_ms"] > 0
raw = base64.b64decode(body["image_base64"])
img = Image.open(io.BytesIO(raw))
assert img.width == 512
assert img.height == 512
@pytest.mark.integration
@gateway_available
@flux_available
def test_flux_render_seeds_produce_different_images() -> None:
"""Two renders with different prompts → different images (non-trivial output)."""
def render(prompt: str) -> bytes:
resp = httpx.post(
f"{BASE_URL}/flux/render",
json={"prompt": prompt, "width": 512, "height": 512, "steps": 4},
timeout=300,
)
assert resp.status_code == 200
return base64.b64decode(resp.json()["image_base64"])
img_a = render("solid red background, nothing else")
img_b = render("solid blue background, nothing else")
assert img_a != img_b, "Two different prompts produced identical output — likely cached/deduped"
# ---------------------------------------------------------------------------
# Palette — in-process (Pillow)
# ---------------------------------------------------------------------------
@pytest.mark.integration
@gateway_available
def test_palette_extract_live() -> None:
"""Palette extraction is in-process — always passes when gateway is up."""
resp = httpx.post(
f"{BASE_URL}/palette/extract",
json={"image_base64": _make_png_b64((200, 50, 50)), "color_count": 4},
timeout=10,
)
assert resp.status_code == 200
body = resp.json()
r, g, b = body["dominant"]
assert r > 150, "dominant color should be red-dominant"
assert body["color_count"] <= 4
# ---------------------------------------------------------------------------
# Rembg — in-process
# ---------------------------------------------------------------------------
@pytest.mark.integration
@gateway_available
def test_rembg_cutout_live() -> None:
"""Background removal — always passes when gateway is up (model downloads on first call)."""
resp = httpx.post(
f"{BASE_URL}/rembg/cutout",
json={"image_base64": _make_png_b64()},
timeout=120, # first call downloads u2net ONNX model
)
assert resp.status_code == 200
body = resp.json()
raw = base64.b64decode(body["image_base64"])
img = Image.open(io.BytesIO(raw))
assert img.mode == "RGBA"
# ---------------------------------------------------------------------------
# Error surface — gateway must return correct HTTP codes
# ---------------------------------------------------------------------------
@pytest.mark.integration
@gateway_available
def test_vlm_analyze_missing_image_returns_400() -> None:
resp = httpx.post(
f"{BASE_URL}/vlm/analyze",
json={"brand_context": "test", "rubric_mode": "raw"},
timeout=10,
)
assert resp.status_code == 400
@pytest.mark.integration
@gateway_available
def test_flux_render_missing_prompt_returns_400() -> None:
resp = httpx.post(f"{BASE_URL}/flux/render", json={"width": 512, "height": 512}, timeout=10)
assert resp.status_code == 400

79
tests/test_palette.py Normal file
View File

@ -0,0 +1,79 @@
"""Unit tests for POST /palette/extract."""
from __future__ import annotations
import base64
import io
import pytest
from PIL import Image
from svrnty_vision.routers.palette import PaletteRequest, PaletteResponse, extract
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_png_b64(color: tuple[int, int, int], size: int = 50) -> str:
img = Image.new("RGB", (size, size), color=color)
buf = io.BytesIO()
img.save(buf, format="PNG")
return base64.b64encode(buf.getvalue()).decode("ascii")
# ---------------------------------------------------------------------------
# Unit tests (pure function / TestClient — no network)
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_solid_red_dominant_is_red() -> None:
req = PaletteRequest(image_base64=_make_png_b64((255, 0, 0)), color_count=3)
resp: PaletteResponse = await extract(req)
r, g, b = resp.dominant
assert r > 200, "dominant R channel should be high for solid red"
assert g < 80
assert b < 80
@pytest.mark.asyncio
async def test_palette_color_count_respected() -> None:
req = PaletteRequest(image_base64=_make_png_b64((0, 128, 255)), color_count=4)
resp = await extract(req)
assert resp.color_count <= 4
assert len(resp.palette) == resp.color_count
@pytest.mark.asyncio
async def test_palette_each_entry_is_rgb_triple() -> None:
req = PaletteRequest(image_base64=_make_png_b64((100, 200, 50)), color_count=6)
resp = await extract(req)
for entry in resp.palette:
assert len(entry) == 3
assert all(0 <= c <= 255 for c in entry)
@pytest.mark.asyncio
async def test_palette_missing_image_raises_400() -> None:
from fastapi import HTTPException
with pytest.raises(HTTPException) as exc_info:
await extract(PaletteRequest())
assert exc_info.value.status_code == 400
@pytest.mark.asyncio
async def test_palette_bad_base64_raises_400() -> None:
from fastapi import HTTPException
with pytest.raises(HTTPException) as exc_info:
await extract(PaletteRequest(image_base64="!!!notbase64!!!"))
assert exc_info.value.status_code == 400
def test_palette_via_test_client(client, red_png_b64) -> None:
resp = client.post("/palette/extract", json={"image_base64": red_png_b64, "color_count": 5})
assert resp.status_code == 200
body = resp.json()
assert "dominant" in body
assert len(body["dominant"]) == 3
assert body["color_count"] <= 5
assert len(body["palette"]) == body["color_count"]

45
tests/test_rembg.py Normal file
View File

@ -0,0 +1,45 @@
"""Unit tests for POST /rembg/cutout."""
from __future__ import annotations
import base64
import io
import pytest
from PIL import Image
def _make_png_b64(color: tuple[int, int, int] = (180, 100, 50), size: int = 64) -> str:
img = Image.new("RGB", (size, size), color=color)
buf = io.BytesIO()
img.save(buf, format="PNG")
return base64.b64encode(buf.getvalue()).decode("ascii")
def test_rembg_missing_image_returns_400(client) -> None:
resp = client.post("/rembg/cutout", json={})
assert resp.status_code == 400
def test_rembg_bad_base64_returns_400(client) -> None:
resp = client.post("/rembg/cutout", json={"image_base64": "%%%bad%%%"})
assert resp.status_code == 400
def test_rembg_returns_png_with_alpha(client, red_png_b64) -> None:
"""rembg removes background → output is PNG with alpha channel."""
resp = client.post("/rembg/cutout", json={"image_base64": red_png_b64})
assert resp.status_code == 200
body = resp.json()
assert body["content_type"] == "image/png"
assert "image_base64" in body
raw = base64.b64decode(body["image_base64"])
img = Image.open(io.BytesIO(raw))
assert img.mode == "RGBA", f"expected RGBA, got {img.mode}"
def test_rembg_alpha_matting_flag_round_trips(client, red_png_b64) -> None:
resp = client.post("/rembg/cutout", json={"image_base64": red_png_b64, "alpha_matting": False})
assert resp.status_code == 200
assert resp.json()["alpha_matting"] is False

View File

@ -1,7 +1,7 @@
"""Pytest port of BTE's FakeVlmEvaluationParseTests + VlmRubric parse coverage. """Pytest port of BTE's FakeVlmEvaluationParseTests + VlmRubric parse coverage.
These tests cover the pure-function side of the VLM router (rubric prompt + score These tests cover the pure-function side of the VLM router (rubric prompt + score
parsing). The HTTP call to Spark 2 is exercised separately via TestClient with a parsing). The HTTP call to steev (Ollama) is exercised separately via TestClient with a
mocked httpx transport. mocked httpx transport.
""" """
@ -64,11 +64,8 @@ def test_analyze_requires_image_input() -> None:
assert response.status_code == 400 assert response.status_code == 400
def test_analyze_returns_502_when_spark2_unreachable() -> None: def test_analyze_returns_502_when_steev_unreachable() -> None:
"""Smoke: with no Spark 2 (or a failing transport), gateway surfaces 502. """Smoke: with no steev (Ollama) or a failing transport, gateway surfaces 502."""
Uses a mock async client that raises ConnectError on POST.
"""
class _StubClient: class _StubClient:
def __init__(self, *a, **kw): def __init__(self, *a, **kw):
@ -95,8 +92,8 @@ def test_analyze_returns_502_when_spark2_unreachable() -> None:
assert response.status_code == 502 assert response.status_code == 502
def test_analyze_round_trip_with_mocked_spark2() -> None: def test_analyze_round_trip_with_mocked_steev() -> None:
"""Happy path: mock vLLM returns a well-formed score JSON; gateway parses it.""" """Happy path: mocked Ollama returns well-formed score JSON; gateway parses it."""
canned_response = { canned_response = {
"choices": [ "choices": [