diff --git a/README.md b/README.md
new file mode 100644
index 0000000..4e1c692
--- /dev/null
+++ b/README.md
@@ -0,0 +1,82 @@
+# Vision Scanner Service
+
+TypeScript/Express service that scans shelf and pantry photos and extracts product information and prices using a local vision LLM running on an AMD NPU. It stores vector embeddings in ChromaDB, generates them with Ollama, and tiles high-resolution photos before sending them to the vision model.
+
+## How It Works
+
+1. A photo of a store shelf or pantry is uploaded
+2. The image is tiled into smaller sections for better accuracy on high-res photos
+3. Each tile is sent to the vision LLM (qwen2.5vl-it:3b via FLM proxy) for product extraction
+4. Extracted products are matched against existing entries using vector embeddings (ChromaDB + Ollama)
+5. Optionally, product details are enriched via the Gemini API as a fallback
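+
+Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the code in `src/tiling.ts`: the helper names `computeTiles` and `mapWithLimit`, the 1024 px tile size, and the 128 px overlap are all assumptions made for the example. Only the idea of tiling and the `MAX_CONCURRENT_TILES` limit come from this README.
+
+```typescript
+// Hypothetical types and helpers for illustration; the real service may differ.
+interface Tile {
+  x: number;
+  y: number;
+  width: number;
+  height: number;
+}
+
+// Split a width x height image into tiles of at most tileSize pixels per side.
+// Neighbouring tiles overlap by `overlap` pixels so that products sitting on a
+// tile border appear whole in at least one tile.
+function computeTiles(width: number, height: number, tileSize = 1024, overlap = 128): Tile[] {
+  if (overlap >= tileSize) throw new Error("overlap must be smaller than tileSize");
+  const step = tileSize - overlap;
+  const tiles: Tile[] = [];
+  for (let y = 0; y < height; y += step) {
+    for (let x = 0; x < width; x += step) {
+      tiles.push({
+        x,
+        y,
+        width: Math.min(tileSize, width - x),
+        height: Math.min(tileSize, height - y),
+      });
+      if (x + tileSize >= width) break; // this tile already reaches the right edge
+    }
+    if (y + tileSize >= height) break; // this row already reaches the bottom edge
+  }
+  return tiles;
+}
+
+// Run fn over all items with at most `limit` calls in flight at once,
+// preserving the input order in the result array.
+async function mapWithLimit<T, R>(
+  items: T[],
+  limit: number,
+  fn: (item: T) => Promise<R>,
+): Promise<R[]> {
+  const results: R[] = new Array(items.length);
+  let next = 0;
+  async function worker(): Promise<void> {
+    while (next < items.length) {
+      const i = next++;
+      results[i] = await fn(items[i]);
+    }
+  }
+  const workerCount = Math.max(1, Math.min(limit, items.length));
+  await Promise.all(Array.from({ length: workerCount }, () => worker()));
+  return results;
+}
+```
+
+Under these assumptions, a caller would compute the tiles once and then pass them to `mapWithLimit(tiles, 4, sendTileToVisionModel)`, where `sendTileToVisionModel` stands in for whatever function posts a single tile to the vision LLM.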
+
+## Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/scan/shelf` | Scan a store shelf photo (multipart: `image`, `store_name`) |
+| `POST` | `/scan/pantry` | Scan a pantry photo (multipart: `image`) |
+| `POST` | `/enrich/product` | Extract detailed product info from a single product image |
+| `GET` | `/health` | Health check (reports status of vision model, Ollama, ChromaDB) |
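+
+A client call to `/scan/shelf` might look like the sketch below. The multipart field names `image` and `store_name` come from the table above; the helper names, the assumption that the service answers with JSON, and the use of the `fetch`, `FormData`, and `Blob` globals (available in Node.js 18 and later) are illustrative choices, since the response format is not documented here.
+
+```typescript
+// Hypothetical client helpers; only the path and field names are taken from the README.
+function buildShelfScanForm(image: Blob, fileName: string, storeName: string): FormData {
+  const form = new FormData();
+  form.append("image", image, fileName); // the photo itself
+  form.append("store_name", storeName); // which store the shelf belongs to
+  return form;
+}
+
+// POST the form to the service. The response is assumed to be JSON and is
+// returned as `unknown` because its shape is not specified in this README.
+async function scanShelf(baseUrl: string, form: FormData): Promise<unknown> {
+  const res = await fetch(`${baseUrl}/scan/shelf`, { method: "POST", body: form });
+  if (!res.ok) throw new Error(`Scan failed with HTTP ${res.status}`);
+  return res.json();
+}
+```
+
+With the default port, usage would be along the lines of `scanShelf("http://localhost:8002", buildShelfScanForm(photo, "shelf.jpg", "Example Mart"))`, where `photo` is a `Blob` holding the image bytes.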
+
+## Configuration
+
+All configuration is read from environment variables, which can be set in a `.env` file:
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `PORT` | `8002` | Service port |
+| `VISION_AI_URL` | `http://localhost:8000/v1/chat/completions` | Vision LLM endpoint |
+| `VISION_AI_MODEL` | `qwen2.5vl-it:3b` | Vision model to use |
+| `VISION_AI_TIMEOUT` | `120000` | Timeout for vision LLM calls (ms) |
+| `OLLAMA_HOST` | `http://192.168.0.15:11434` | Ollama server for embeddings |
+| `OLLAMA_EMBED_MODEL` | `nomic-embed-text` | Embedding model |
+| `CHROMA_HOST` | `http://192.168.0.15:8000` | ChromaDB server |
+| `GEMINI_API_KEY` | — | Optional Gemini API key for fallback |
+| `GEMINI_MODEL` | `gemini-2.5-flash` | Gemini model for fallback |
+| `MAX_CONCURRENT_TILES` | `4` | Max parallel tile processing |
+| `UPLOAD_DIR` | `uploads` | Temporary upload directory |
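+
+One way to turn this table into code is a small loader that applies the defaults listed above. This is a sketch under assumptions, not the contents of `src/config.ts`: the `ScannerConfig` interface, its field names, and the `loadConfig` function are invented for the example, while the variable names and default values are copied from the table.
+
+```typescript
+// Hypothetical config shape; the variable names and defaults mirror the table above.
+interface ScannerConfig {
+  port: number;
+  visionAiUrl: string;
+  visionAiModel: string;
+  visionAiTimeoutMs: number;
+  ollamaHost: string;
+  ollamaEmbedModel: string;
+  chromaHost: string;
+  geminiApiKey: string | undefined; // optional, no default
+  geminiModel: string;
+  maxConcurrentTiles: number;
+  uploadDir: string;
+}
+
+type Env = Record<string, string | undefined>;
+
+function loadConfig(env: Env): ScannerConfig {
+  // Read a string variable, falling back to the documented default.
+  const str = (name: string, fallback: string): string => {
+    const raw = env[name];
+    return raw === undefined || raw === "" ? fallback : raw;
+  };
+  // Read a numeric variable and reject values that are not numbers.
+  const num = (name: string, fallback: number): number => {
+    const raw = env[name];
+    if (raw === undefined || raw === "") return fallback;
+    const parsed = Number(raw);
+    if (!Number.isFinite(parsed)) throw new Error(`${name} must be a number, got "${raw}"`);
+    return parsed;
+  };
+  return {
+    port: num("PORT", 8002),
+    visionAiUrl: str("VISION_AI_URL", "http://localhost:8000/v1/chat/completions"),
+    visionAiModel: str("VISION_AI_MODEL", "qwen2.5vl-it:3b"),
+    visionAiTimeoutMs: num("VISION_AI_TIMEOUT", 120000),
+    ollamaHost: str("OLLAMA_HOST", "http://192.168.0.15:11434"),
+    ollamaEmbedModel: str("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
+    chromaHost: str("CHROMA_HOST", "http://192.168.0.15:8000"),
+    geminiApiKey: env["GEMINI_API_KEY"] || undefined,
+    geminiModel: str("GEMINI_MODEL", "gemini-2.5-flash"),
+    maxConcurrentTiles: num("MAX_CONCURRENT_TILES", 4),
+    uploadDir: str("UPLOAD_DIR", "uploads"),
+  };
+}
+```
+
+In a Node.js process this would be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with a plain object.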
+
+## Usage
+
+```bash
+npm install        # Install dependencies
+npm run build      # Compile TypeScript → dist/
+npm start          # Run the service
+npm run dev        # Development mode with hot-reload
+
+# Windows service
+node service-install.js      # Install as a Windows service
+node service-uninstall.js    # Remove the Windows service
+```
+
+## Project Structure
+
+```
+src/
+  server.ts      — Express app, routes
+  config.ts      — Configuration from environment
+  vision.ts      — Vision LLM API calls
+  tiling.ts      — Image tiling for high-res photos
+  shelf.ts       — Shelf scanning logic
+  pantry.ts      — Pantry scanning logic
+  enrich.ts      — Product info enrichment
+  parsing.ts     — LLM response parsing
+  embeddings.ts  — Ollama embedding generation
+  chroma.ts      — ChromaDB vector storage
+  matching.ts    — Product matching via embeddings
+  gemini.ts      — Gemini API fallback
+```
+
+## External Dependencies
+
+- **FLM Proxy** (localhost:8000) — Vision LLM inference on AMD NPU
+- **Ollama** (192.168.0.15:11434) — Embedding generation with `nomic-embed-text`
+- **ChromaDB** (192.168.0.15:8000) — Vector database for product embeddings
+- **Gemini API** (optional) — Fallback for product enrichment
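+
+To make the embedding dependency concrete, the sketch below requests an embedding from Ollama's `/api/embeddings` endpoint and compares two vectors with cosine similarity. The function names are assumptions for illustration. In the service itself the nearest-neighbour lookup is presumably delegated to ChromaDB, so the in-memory comparison here only shows what "matching via embeddings" means and is not the project's implementation.
+
+```typescript
+// Cosine similarity of two equal-length vectors: 1 means same direction,
+// 0 means orthogonal. Returns 0 if either vector has zero length.
+function cosineSimilarity(a: number[], b: number[]): number {
+  if (a.length !== b.length) throw new Error("vectors must have the same length");
+  let dot = 0;
+  let normA = 0;
+  let normB = 0;
+  for (let i = 0; i < a.length; i++) {
+    dot += a[i] * b[i];
+    normA += a[i] * a[i];
+    normB += b[i] * b[i];
+  }
+  if (normA === 0 || normB === 0) return 0;
+  return dot / Math.sqrt(normA * normB);
+}
+
+// Hypothetical wrapper around Ollama's embeddings endpoint. The host and
+// model defaults are the ones listed in this README.
+async function embed(
+  text: string,
+  host = "http://192.168.0.15:11434",
+  model = "nomic-embed-text",
+): Promise<number[]> {
+  const res = await fetch(`${host}/api/embeddings`, {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({ model, prompt: text }),
+  });
+  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
+  const data = (await res.json()) as { embedding: number[] };
+  return data.embedding;
+}
+```
+
+Two product names could then be compared with `cosineSimilarity(await embed(nameA), await embed(nameB))`. Any cutoff for treating them as the same product would need to be tuned and is not specified in this README.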
+
+## Environment
+
+- **OS and hardware:** Windows 11 on AMD NPU hardware
+- **Runtime:** Node.js + TypeScript
+- **Vision LLM:** qwen2.5vl-it:3b served by FLM proxy on localhost:8000