diff --git a/README.md b/README.md index 4f71429..ebd2cfd 100644 --- a/README.md +++ b/README.md @@ -1,62 +1,60 @@ -# Chromadb Admin +# chromadb-admin -Admin UI for the Chroma embedding database, built with Next.js +Local deployment of [chromadb-admin](https://github.com/flanker/chromadb-admin) — a Next.js admin UI for [ChromaDB](https://docs.trychroma.com), the vector database used by **ScraperControl** for semantic search and deduplication. -![screely-1696786774071](https://github.com/flanker/chromadb-admin/assets/109811/6d4369d4-d10c-49f7-8342-89849f271dbe) +## Services -## Links: +| Service | Port | Description | +|---------|------|-------------| +| `chromadb` | `8000` | ChromaDB vector database | +| `chromadb-admin` | `3002` | Next.js admin UI (not currently running) | -* GitHub Repo: [https://github.com/flanker/chromadb-admin](https://github.com/flanker/chromadb-admin) -* Chroma Official Website [https://docs.trychroma.com](https://docs.trychroma.com) +## ChromaDB Configuration -## Authentication Support +- **Config file:** `/opt/chromadb/config.yaml` +- **Data directory:** `/opt/chromadb/data` (bind-mounted) +- **Persistence:** Enabled (`IS_PERSISTENT=TRUE`) +- **Telemetry:** Disabled (`ANONYMIZED_TELEMETRY=FALSE`) -image +## ScraperControl Collections -## Run Locally +ScraperControl uses 5 ChromaDB collections: -First, start the development server: +| Collection | Purpose | Document format | +|------------|---------|-----------------| +| `church_identity` | Church deduplication | `{name} {address} {city} {country}` | +| `search_results` | FreeSearch result matching | `{title} {snippet} {url}` | +| `page_classification` | Content classification | Page text (first 2000 chars) | +| `schedule_sections` | Mass schedule detection | Text blocks with mass times | +| `page_snapshots` | Change detection | Full page text | + +Embeddings are generated by **Ollama** (`nomic-embed-text`, 274 MB) running at `http://192.168.0.241:11434`. + +## Running ```bash -npm run dev -# or -yarn dev -# or -pnpm dev -# or -bun dev +# Start ChromaDB only +docker compose up -d chromadb + +# Start everything including admin UI +docker compose up -d ``` -THen, open [http://localhost:3001](http://localhost:3001) in your browser to see the app. +Access the admin UI at `http://192.168.0.241:3002` (when running). -## Run with Docker +Connect using: `http://192.168.0.241:8000` -Run +## Deployment -```bash -docker run -p 3001:3001 fengzhichao/chromadb-admin -``` +Runs on `192.168.0.241` (albert-MacBookPro Linux server). -and visit https://localhost:3001⁠ in the browser. +**Docker Compose:** `/opt/docker/chromadb/docker-compose.yml` -*NOTE*: Use `http://host.docker.internal:8000` for the connection string if you want to connect to a ChromaDB instance running locally. +| Endpoint | URL | +|----------|-----| +| ChromaDB API | `http://192.168.0.241:8000` | +| Admin UI | `http://192.168.0.241:3002` (when running) | -## Build and Run with Docker locally +## Upstream -Build the Docker image: - -```bash -docker build -t chromadb-admin . -``` - -Run the Docker container: - -```bash -docker run -p 3001:3001 chromadb-admin -``` - -## Note - -This is NOT an official Chroma project. - -This project is licensed under the terms of the MIT license. +This is a local deployment of the open-source [flanker/chromadb-admin](https://github.com/flanker/chromadb-admin) project (MIT license), with a custom `docker-compose.yml` for this server.