# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

This repository sets up a FastFlowLM proxy server that runs LLMs on an AMD NPU (Neural Processing Unit). The proxy starts the model on the first request and stops it after an idle timeout to free RAM.

## Architecture

- **`flm-proxy.js`**: Node.js HTTP proxy (port 8000) that sits in front of FastFlowLM (port 8001). It lazily spawns `flm.exe`, polls until the model is ready, proxies all requests, and kills the process after 5 minutes of inactivity. It also exposes `/status` and `/stop` control endpoints.
- **`FastFlowLM/flm.exe`**: Pre-built binary that serves an OpenAI-compatible API (`/v1/models`, `/v1/chat/completions`, etc.) using NPU-accelerated models. It is not source code; do not modify it.
- **`flm-service-install.js` / `flm-service-uninstall.js`**: Install/uninstall the proxy as a Windows service via `node-windows`.
- **`daemon/`**: Windows service wrapper files generated by `node-windows` (exe, logs, config).
- **`flm-start.bat` / `flm-stop.bat`**: Simple batch scripts that run FLM directly, bypassing the proxy.
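
The lazy-start and idle-shutdown behaviour described for `flm-proxy.js` can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual proxy code: `createLifecycle`, `touch`, and `shutdown` are hypothetical names, and the `start`/`stop` callbacks stand in for spawning and killing `flm.exe`. Readiness polling is left out.

```javascript
// Hypothetical sketch of lazy start plus idle shutdown; not the real flm-proxy.js.
// `start` and `stop` stand in for spawning and killing flm.exe.
function createLifecycle({
  start,
  stop,
  idleMs = 5 * 60 * 1000, // 5 minutes, matching the idle timeout described above
  setTimer = setTimeout, // injectable so the logic can be exercised without waiting
  clearTimer = clearTimeout,
}) {
  let running = false;
  let timer = null;

  // Stop the backend (if running) and cancel any pending idle timer.
  function shutdown() {
    if (timer !== null) {
      clearTimer(timer);
      timer = null;
    }
    if (running) {
      stop();
      running = false;
    }
  }

  // Call on every incoming request: start on first use, then reset the idle clock.
  function touch() {
    if (!running) {
      start();
      running = true;
    }
    if (timer !== null) clearTimer(timer);
    timer = setTimer(() => {
      timer = null;
      shutdown();
    }, idleMs);
  }

  return { touch, shutdown, isRunning: () => running };
}
```

In a layout like this, the request handler would call `touch()` before forwarding to FLM, and a `/stop` handler would call `shutdown()`.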

## Commands

```bash
# Run the proxy (foreground)
node flm-proxy.js

# Install as Windows service
node flm-service-install.js

# Uninstall Windows service
node flm-service-uninstall.js

# Install dependencies
npm install

# Check service logs
cat ~/daemon/flmvisionproxy.out.log
cat ~/daemon/flmvisionproxy.err.log
```

## Key Configuration (in flm-proxy.js)

- `MODEL`: currently `qwen2.5vl-it:3b` (Qwen2.5 Vision-Language 3B)
- `PROXY_PORT`: 8000 (external-facing)
- `FLM_PORT`: 8001 (internal FLM server)
- `IDLE_TIMEOUT_MS`: 5 minutes (300,000 ms)
- `HOST`: `0.0.0.0` (listens on all interfaces)
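
To show how a client might use these values, here is a small sketch that builds an OpenAI-style chat completion request aimed at the proxy port. `buildChatRequest` is a hypothetical helper, not part of the repository, and the constants are copied from the list above rather than imported from `flm-proxy.js`. Whether the `model` field has to match the proxy's `MODEL` is an assumption to verify.

```javascript
// Illustrative values copied from the list above; the real constants live in flm-proxy.js.
const PROXY_PORT = 8000;
const MODEL = "qwen2.5vl-it:3b";

// Build an OpenAI-style chat completion request aimed at the proxy.
// `buildChatRequest` is a hypothetical helper, not part of the repository.
function buildChatRequest(
  prompt,
  { host = "localhost", port = PROXY_PORT, model = MODEL } = {}
) {
  return {
    url: `http://${host}:${port}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

On Node.js 18 or later the result can be passed straight to `fetch(url, init)`. The first call after an idle period may take noticeably longer, since the proxy has to start the model before it can answer.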

## Available Models

See `FastFlowLM/model_list.json` for the full catalog. Model identifiers use the format `family:size` (e.g., `qwen3:4b`, `llama3.2:3b`). Vision models are marked with `"vlm": true`, and thinking models with `"think": true`.
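
The identifier format and flags above could be handled with helpers along these lines. Both function names are hypothetical, and `idsWithFlag` assumes a flat array of `{ id, vlm, think }` objects; the real layout of `model_list.json` is not described here and may be nested differently, so treat this only as a sketch.

```javascript
// Split a "family:size" identifier into its two parts.
function parseModelId(id) {
  const idx = id.lastIndexOf(":");
  if (idx <= 0 || idx === id.length - 1) {
    throw new Error(`Expected "family:size", got "${id}"`);
  }
  return { family: id.slice(0, idx), size: id.slice(idx + 1) };
}

// Pick the identifiers whose entry has a given boolean flag set to true.
// ASSUMPTION: `entries` is a flat array of { id, vlm?, think? } objects.
// The actual structure of model_list.json may differ.
function idsWithFlag(entries, flag) {
  return entries.filter((entry) => entry[flag] === true).map((entry) => entry.id);
}
```

For example, `parseModelId("qwen3:4b")` yields `{ family: "qwen3", size: "4b" }`.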

## Services

All services are TypeScript/Express apps that share the same build pattern:

```bash
cd <ServiceDir>
npm install      # install deps
npm run build    # tsc → dist/
npm start        # node dist/server.js
npm run dev      # tsx watch (hot-reload)

# Windows service management
node service-install.js
node service-uninstall.js
```

### ImageModerationService (port 8100)

Checks uploaded images for NSFW/explicit content using the local vision LLM. When an image is flagged as unsafe, the service fires callbacks to the upload service (to replace the image) and to Parochia (to flag the user).

- **Endpoints:** `POST /moderate` (multipart: `file`, `context`, `imagePath`, `userId`, `siteId`), `GET /health`
- **Vision model:** `gemma3:4b` via FLM proxy at `localhost:8000`
- **Callbacks:** Configurable in `.env`: the upload service replace URL and the Parochia moderation callback
- **Source:** `src/moderate.ts` (moderation logic), `src/server.ts` (Express app)
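
As an illustration of the multipart fields listed for `POST /moderate`, a client could assemble the request body roughly as follows. `buildModerationForm` is a hypothetical helper, not part of the service, and it assumes Node.js 18 or later for the global `FormData` and `Blob`.

```javascript
// Build the multipart body for POST /moderate using the field names listed above.
// Requires Node.js 18+ for the global FormData and Blob.
// `buildModerationForm` is a hypothetical client helper, not part of the service.
function buildModerationForm({ bytes, filename, context, imagePath, userId, siteId }) {
  const form = new FormData();
  form.append("file", new Blob([bytes]), filename); // image bytes as the `file` part
  form.append("context", context);
  form.append("imagePath", imagePath);
  form.append("userId", String(userId)); // multipart text fields are strings
  form.append("siteId", String(siteId));
  return form;
}
```

The form can then be sent with `fetch("http://localhost:8100/moderate", { method: "POST", body: form })`. The response shape is not documented in this file, so check `src/moderate.ts` before relying on particular fields.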

### VisionScannerService (port 8002)

Scans shelf/pantry photos to extract product information and prices using the vision LLM. It uses ChromaDB for embedding storage and Ollama for embedding generation, and supports image tiling for high-resolution photos.

- **Endpoints:** `POST /scan/shelf` (multipart: `image`, `store_name`), `POST /scan/pantry` (multipart: `image`), `GET /health`
- **Vision model:** `qwen2.5vl-it:3b` via FLM proxy at `localhost:8000`
- **External deps:** Ollama (`192.168.0.15:11434`, `nomic-embed-text`), ChromaDB (`192.168.0.15:8000`), optional Gemini API
- **Source:** `src/vision.ts` (LLM calls), `src/tiling.ts` (image tiling), `src/shelf.ts` / `src/pantry.ts` (scan logic), `src/embeddings.ts` + `src/chroma.ts` (vector storage), `src/matching.ts` (product matching), `src/parsing.ts` (response parsing), `src/gemini.ts` (Gemini fallback), `src/config.ts`
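
To give a feel for what image tiling involves, the sketch below computes overlapping tile rectangles for a large photo. It is not the logic in `src/tiling.ts`: the function name, the 1024-pixel tile size, and the 128-pixel overlap are all assumptions chosen for illustration.

```javascript
// Hypothetical tiling sketch; the tile size and overlap are illustrative assumptions
// and do not reflect the actual values or algorithm in src/tiling.ts.
function computeTiles(width, height, tileSize = 1024, overlap = 128) {
  if (overlap >= tileSize) {
    throw new RangeError("overlap must be smaller than tileSize");
  }
  const step = tileSize - overlap;

  // Start offsets along one axis; the last tile sits flush with the far edge.
  const starts = (length) => {
    if (length <= tileSize) return [0];
    const out = [];
    for (let pos = 0; pos + tileSize < length; pos += step) out.push(pos);
    out.push(length - tileSize);
    return out;
  };

  const tiles = [];
  for (const y of starts(height)) {
    for (const x of starts(width)) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, width - x),
        height: Math.min(tileSize, height - y),
      });
    }
  }
  return tiles;
}
```

An image that already fits inside one tile comes back as a single rectangle covering the whole image; larger images are split so that neighbouring tiles overlap, which helps when a product label straddles a tile boundary.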

## Environment

- Windows 11, AMD NPU hardware
- Node.js with the `node-windows` dependency
- FLM binary path: `C:\Users\sshuser\FastFlowLM\flm.exe`
- All paths are hardcoded to `C:\Users\sshuser\`