# Engram Configuration Reference

Complete configuration reference for the Engram MCP memory server.

---

## Table of Contents

1. [Data Directory](#data-directory)
2. [Environment Variables](#environment-variables)
3. [Path Configuration](#path-configuration)
4. [Search Configuration](#search-configuration)
5. [MCP Server Limits](#mcp-server-limits)
6. [Memory Tools Configuration](#memory-tools-configuration)
7. [Vector Search Configuration](#vector-search-configuration)
8. [Knowledge Graph Configuration](#knowledge-graph-configuration)
9. [Optional Dependency Groups](#optional-dependency-groups)
10. [Docker Configuration](#docker-configuration)
11. [System Requirements](#system-requirements)
12. [Example .env File](#example-env-file)
13. [Entry Points](#entry-points)

---

## Data Directory

Engram separates code from data:

- **Code:** `~/engram/` — the git repository (src, scripts, tests, docs)
- **Data:** `~/.local/share/engram/` — runtime data (entities, databases, logs, telemetry)

You can safely delete and re-clone `~/engram/` without losing any data.

---

## Environment Variables

### Hub Server (`server.py`)

| Variable | Default | Description | Required |
|---|---|---|---|
| `ENGRAM_API_TOKEN` | _(none)_ | Bearer token for API authentication. When unset, the API runs without authentication (local dev mode). | No (recommended for production) |
| `ENGRAM_HOST` | `127.0.0.1` | Bind address for the Hub server. Set to `0.0.0.0` to accept remote connections. | No |
| `ENGRAM_PORT` | `8000` | TCP port for the Hub server. | No |
| `ENGRAM_CORS_ORIGINS` | `http://localhost:3000,http://127.0.0.1:3000` | Comma-separated list of allowed CORS origins. | No |
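
The defaults in this table can be illustrated with a minimal sketch that reads them from an environment mapping. `HubSettings` and `load_hub_settings` are hypothetical names used only for illustration; they are not taken from `server.py`, which may parse these variables differently.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class HubSettings:
    """Hypothetical container for the Hub variables documented above."""
    api_token: Optional[str]
    host: str
    port: int
    cors_origins: List[str]


def load_hub_settings(env: Mapping[str, str] = os.environ) -> HubSettings:
    """Read Hub settings from `env`, applying the documented defaults."""
    origins = env.get(
        "ENGRAM_CORS_ORIGINS",
        "http://localhost:3000,http://127.0.0.1:3000",
    )
    return HubSettings(
        # An unset or empty token means unauthenticated local dev mode.
        api_token=env.get("ENGRAM_API_TOKEN") or None,
        host=env.get("ENGRAM_HOST", "127.0.0.1"),
        port=int(env.get("ENGRAM_PORT", "8000")),
        # Split the comma-separated list and drop empty entries.
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in isolation, for example `load_hub_settings({"ENGRAM_PORT": "9000"})`.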

### WebSocket Gateway (`gateway_websocket.py`)

| Variable | Default | Description | Required |
|---|---|---|---|
| `GATEWAY_TOKEN` | _(none)_ | Authentication token for WebSocket handshake. Clients must send this token during connection. The gateway refuses to start if this is unset. | **Yes** |
| `GATEWAY_PORT` | `8003` | TCP port for the WebSocket gateway. Must be a valid integer. | No |
| `HUB_URL` | `http://localhost:8000` | URL of the Engram Hub server that the gateway connects to. | No |
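
The two constraints in this table (a mandatory `GATEWAY_TOKEN` and an integer `GATEWAY_PORT`) could be enforced at startup roughly as follows. This is a sketch under assumptions: `load_gateway_settings` is a hypothetical helper, and the real gateway may report these errors differently.

```python
from typing import Mapping, Tuple


def load_gateway_settings(env: Mapping[str, str]) -> Tuple[str, int, str]:
    """Return (token, port, hub_url), or raise if the configuration is invalid."""
    token = env.get("GATEWAY_TOKEN", "")
    if not token:
        # Mirrors the documented behavior: without a token, do not start.
        raise RuntimeError("GATEWAY_TOKEN must be set")

    raw_port = env.get("GATEWAY_PORT", "8003")
    try:
        port = int(raw_port)
    except ValueError:
        raise RuntimeError(f"GATEWAY_PORT is not an integer: {raw_port!r}") from None

    hub_url = env.get("HUB_URL", "http://localhost:8000")
    return token, port, hub_url
```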

### Worker API (`worker_api.py`)

| Variable | Default | Description | Required |
|---|---|---|---|
| `ENGRAM_API_TOKEN` | _(none)_ | Shared with Hub server. Controls Bearer token auth for the worker heartbeat API. | No (recommended for production) |

### Docker Compose (`docker-compose.yml`)

| Variable | Default | Description | Required |
|---|---|---|---|
| `MATRIX_REGISTRATION_SECRET` | _(none)_ | Shared secret for Matrix/Synapse user registration. Passed to both the `hub` and `synapse` containers. | Yes (if running Synapse) |
| `DATABASE_URL` | `sqlite:///./engram.db` | SQLAlchemy database URL for the Hub container. Hardcoded in compose but overridable. | No |

---

## Path Configuration

All paths are defined in `src/engram/config.py` and default to subdirectories of `~/.local/share/engram/`.

| Constant | Default Value | Purpose |
|---|---|---|
| `ENGRAM_DATA` | `~/.local/share/engram/` | Root directory for all Engram runtime data. |
| `ENTITIES_DIR` | `~/.local/share/engram/entities/` | Markdown entity files (sessions, reflections, chat recalls). Searched by ripgrep. |
| `TELEMETRY_DIR` | `~/.local/share/engram/telemetry/` | Telemetry and metrics data. |
| `AGENT_MEMORY_DIR` | `~/.local/share/engram/agent_memory/` | Per-agent segregated memory files, organized by role subdirectory (Coding, Planning, Research, Review). |
| `LOG_DIR` | `~/.local/share/engram/logs/` | Application log output. |
| `INDEX_PATH` | `~/.local/share/engram/vault_index.sqlite` | SQLite FTS5 index for metadata-filtered searches (tags, date ranges). |

### Worker Database

The worker heartbeat database is defined in `src/engram/db_models.py`:

| Path | Default Value | Purpose |
|---|---|---|
| Worker DB | `~/.local/share/engram/workers.sqlite` | SQLite database for edge worker heartbeat tracking. |

### Overriding Paths

Paths are currently set as module-level constants derived from `Path.home()`. To override them:

1. **Edit `config.py` directly** -- change `ENGRAM_DATA` and all derived paths follow.
2. **Symlink** -- symlink `~/.local/share/engram/` to your preferred location.
3. **Worker DB** -- pass a custom `db_path` argument to `init_db()` in `db_models.py`.

There are no environment variable overrides for paths at this time.
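
To see why changing `ENGRAM_DATA` moves every other path, here is a sketch of how the constants could be derived. It is a reconstruction from the table above, not a copy of `config.py`, so the exact expressions there may differ.

```python
from pathlib import Path

# Root directory; every other path below is derived from it.
ENGRAM_DATA = Path.home() / ".local" / "share" / "engram"

# Derived paths (reconstructed from the Path Configuration table).
ENTITIES_DIR = ENGRAM_DATA / "entities"
TELEMETRY_DIR = ENGRAM_DATA / "telemetry"
AGENT_MEMORY_DIR = ENGRAM_DATA / "agent_memory"
LOG_DIR = ENGRAM_DATA / "logs"
INDEX_PATH = ENGRAM_DATA / "vault_index.sqlite"
```

Because each constant is built with the `/` operator on `ENGRAM_DATA`, editing that single assignment relocates the whole tree.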

---

## Search Configuration

### Keyword-to-Topic Mapping (`KEYWORD_TOPICS`)

The Librarian uses regex patterns to auto-tag entities. Defined in `config.py`:

| Regex Pattern | Assigned Topics |
|---|---|
| `memory\|engram\|recall\|vault` | Memory |
| `yoga\|pacific\|timezone` | Yoga, Pacific |
| `cron\|daemon\|export\|sync` | Cron |
| `5950x\|2080\|ram\|rtx` | Hardware |
| `edge\|jetpack\|nvidia` | edge |
| `sandbox\|tool\|execute` | Sandbox |

### Agent Detection Patterns (`AGENT_PATTERNS`)

Used for memory segregation: these patterns route content to the matching `agent_memory/` subdirectory.

| Regex Pattern | Agent Role |
|---|---|
| `Planning\|planner` | Planning |
| `Code\|coding\|frontend\|backend` | Coding |
| `Review\|auditor\|reviewer` | Review |
| `Research\|hyper\|vault` | Research |
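
A minimal sketch of how such a pattern table can drive routing is shown below. Two details are assumptions rather than documented behavior: patterns are tried in table order with the first match winning, and matching is case-sensitive. `detect_agent_role` is a hypothetical name.

```python
import re
from typing import Optional

# Patterns copied from the table above, in table order.
AGENT_PATTERNS = [
    (re.compile(r"Planning|planner"), "Planning"),
    (re.compile(r"Code|coding|frontend|backend"), "Coding"),
    (re.compile(r"Review|auditor|reviewer"), "Review"),
    (re.compile(r"Research|hyper|vault"), "Research"),
]


def detect_agent_role(text: str) -> Optional[str]:
    """Return the first role whose pattern occurs in `text`, else None."""
    for pattern, role in AGENT_PATTERNS:
        if pattern.search(text):
            return role
    return None
```

The returned role would then select the subdirectory, for example `AGENT_MEMORY_DIR / role`.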

### Alias Map (`ALIAS_MAP`)

An anti-hallucination guardrail: a tag is applied only if the source text contains at least one of that tag's alias keywords.

| Tag | Valid Aliases |
|---|---|
| edge | edge, nano, nvidia, orin, board, jetpack |
| Sandbox | sandbox, mcp, tool, execute, limit |
| Memory | memory, engram, recall, vault |
| Cron | cron, daemon, export, sync |
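
The guardrail can be sketched as a filter over candidate tags. The matching rule (case-insensitive substring test) and the handling of tags with no `ALIAS_MAP` entry (kept unchanged) are assumptions made for this example; `apply_alias_guardrail` is a hypothetical name.

```python
from typing import Iterable, List

# Aliases copied from the table above.
ALIAS_MAP = {
    "edge": ["edge", "nano", "nvidia", "orin", "board", "jetpack"],
    "Sandbox": ["sandbox", "mcp", "tool", "execute", "limit"],
    "Memory": ["memory", "engram", "recall", "vault"],
    "Cron": ["cron", "daemon", "export", "sync"],
}


def apply_alias_guardrail(candidate_tags: Iterable[str], text: str) -> List[str]:
    """Keep a tag only if `text` mentions at least one of its aliases.

    Assumptions: matching is a case-insensitive substring test, and tags
    without an ALIAS_MAP entry pass through unchanged.
    """
    lowered = text.lower()
    kept = []
    for tag in candidate_tags:
        aliases = ALIAS_MAP.get(tag)
        if aliases is None or any(alias in lowered for alias in aliases):
            kept.append(tag)
    return kept
```

With this rule, a candidate `Cron` tag is dropped for text that never mentions cron, daemon, export, or sync, even if something upstream proposed it.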

### Session ID Pattern

Session IDs are validated against:

```
^[A-Za-z0-9_.-]+$
```

Only alphanumeric characters, underscores, hyphens, and dots are permitted. The Hub server uses a slightly narrower pattern (`^[A-Za-z0-9_-]+$`) that excludes dots.
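
The difference between the two patterns is easiest to see side by side. In this sketch, `is_valid_session_id` and its `for_hub` flag are hypothetical; only the two regular expressions come from the text above. `fullmatch` is used because in Python `$` alone also matches just before a trailing newline.

```python
import re

SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_.-]+$")     # general pattern
HUB_SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")  # Hub server: no dots


def is_valid_session_id(session_id: str, for_hub: bool = False) -> bool:
    """Check `session_id` against the general or the Hub pattern."""
    pattern = HUB_SESSION_ID_RE if for_hub else SESSION_ID_RE
    # fullmatch rejects "abc\n", which match() with "$" would accept.
    return pattern.fullmatch(session_id) is not None
```

An ID such as `2024-01-15_session.v2` passes the general check but fails the Hub check because of the dot.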

### CLI Regex

Pattern for detecting CLI command blocks in entity text:

```
`(rg|ls|sqlite3|crontab|git|pip|npm).*?(?=\n\$|\n>|\n---)
```

---

## MCP Server Limits

| Constant | Value | Location | Description |
|---|---|---|---|
| `MAX_QUERY_LENGTH` | 500 | `config.py` | Maximum characters in a search query (MCP tool). |
| `MAX_QUERY_LENGTH` | 1000 | `server.py` | Maximum characters in a search query (HTTP API). Note: the server overrides the config value. |
| `MAX_RESPONSE_CHARS` | 4000 | `config.py` | Maximum characters returned in a single search response. |
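
As an illustration of how these limits might be applied, the sketch below rejects over-long queries and clips long responses. Both behaviors are assumptions: the table gives only the numeric limits, not whether Engram rejects, truncates, or paginates. `check_query` and `clip_response` are hypothetical helpers.

```python
MAX_QUERY_LENGTH = 500     # config.py value (MCP tool)
MAX_RESPONSE_CHARS = 4000  # config.py value


def check_query(query: str, limit: int = MAX_QUERY_LENGTH) -> str:
    """Return `query` unchanged, or raise if it exceeds `limit` characters."""
    if len(query) > limit:
        raise ValueError(f"query is {len(query)} chars; limit is {limit}")
    return query


def clip_response(text: str, limit: int = MAX_RESPONSE_CHARS) -> str:
    """Return at most `limit` characters of `text`."""
    return text if len(text) <= limit else text[:limit]
```

The HTTP API's larger limit corresponds to calling `check_query(query, limit=1000)`.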

---

## Memory Tools Configuration

Phase 1 introduced time-decay importance scoring for memory entries. These settings control how quickly memories fade and when they are pruned.

| Variable | Default | Description |
|---|---|---|
| `IMPORTANCE_HALF_LIFE_DAYS` | `30.0` | Half-life in days for exponential time-decay of memory importance scores. A memory's effective importance halves every N days. Lower values cause faster decay; higher values preserve old memories longer. |
| `IMPORTANCE_PRUNE_THRESHOLD` | `0.05` | Minimum effective importance score (after decay) below which memories are eligible for pruning. Memories that decay below this threshold may be removed during cleanup. Range: 0.0 to 1.0. |
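
The half-life rule translates directly into a formula: the effective importance is the base importance multiplied by `0.5 ** (age_days / half_life_days)`. The sketch below applies it and also solves for the age at which a memory crosses the prune threshold. The function names are hypothetical, and how Engram stores the base importance or measures age is not specified here.

```python
import math


def effective_importance(
    importance: float, age_days: float, half_life_days: float = 30.0
) -> float:
    """Exponential decay: the score halves every `half_life_days`."""
    return importance * 0.5 ** (age_days / half_life_days)


def days_until_prunable(
    importance: float, half_life_days: float = 30.0, threshold: float = 0.05
) -> float:
    """Age in days at which the decayed score reaches `threshold`."""
    if importance <= threshold:
        return 0.0  # already at or below the threshold
    # Solve importance * 0.5 ** (t / half_life) == threshold for t.
    return half_life_days * math.log2(importance / threshold)
```

With the defaults, a memory that starts at importance 1.0 drops to 0.5 after 30 days and reaches the 0.05 threshold after roughly 130 days (`30 * log2(20)`).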

---

## Vector Search Configuration

Phase 2 added optional vector (semantic) search alongside the existing keyword/ripgrep search. When enabled, memories are embedded into a vector store for similarity-based retrieval.

| Variable | Default | Description |
|---|---|---|
| `VECTOR_ENABLED` | `False` | Enable vector-based semantic search. When `False`, only keyword search is used. |
| `VECTOR_MODEL` | `BAAI/bge-small-en-v1.5` | Sentence-transformer model used for embedding text. Must be compatible with the `sentence-transformers` library. |
| `VECTOR_DIMS` | `384` | Dimensionality of the embedding vectors. Must match the output dimensions of `VECTOR_MODEL`. |

### Installation

Vector search requires additional dependencies:

```bash
pip install "engram[vector]"
```

This installs `sentence-transformers` and its transitive dependencies (torch, transformers, etc.). The first run will download the embedding model (~130 MB for `bge-small-en-v1.5`).

---

## Knowledge Graph Configuration

Phase 3 added an optional knowledge graph that extracts entities and relationships from memories using an LLM, then stores them in a KuzuDB graph database.

| Variable | Default | Description |
|---|---|---|
| `GRAPH_ENABLED` | `False` | Enable knowledge graph extraction and querying. When `False`, graph features are not loaded. |
| `GRAPH_LLM_PROVIDER` | `anthropic` | LLM provider for entity/relationship extraction. Supported values: `anthropic`, `openai`. |
| `GRAPH_LLM_MODEL` | `claude-sonnet-4-20250514` | Model used for graph entity extraction. Must be available under the chosen provider. |
| `ANTHROPIC_API_KEY` | _(none)_ | API key for Anthropic. **Required** when `GRAPH_LLM_PROVIDER=anthropic`. |
| `OPENAI_API_KEY` | _(none)_ | API key for OpenAI. **Required** when `GRAPH_LLM_PROVIDER=openai`. |
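
The provider-to-key dependency in this table can be checked before the graph features are loaded. This is a sketch only: `graph_config_problems` is a hypothetical helper, and the set of strings treated as "enabled" is an assumption, since the accepted boolean spellings are not documented here.

```python
from typing import List, Mapping

# Provider -> required API key variable, taken from the table above.
PROVIDER_KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def graph_config_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    # Assumption: "true", "1", and "yes" (any case) enable the feature.
    flag = env.get("GRAPH_ENABLED", "False").strip().lower()
    if flag not in {"true", "1", "yes"}:
        return []  # graph features are off, so nothing else is required

    provider = env.get("GRAPH_LLM_PROVIDER", "anthropic")
    key_var = PROVIDER_KEY_VARS.get(provider)
    if key_var is None:
        return [f"unsupported GRAPH_LLM_PROVIDER: {provider!r}"]
    if not env.get(key_var):
        return [f"{key_var} is required when GRAPH_LLM_PROVIDER={provider}"]
    return []
```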

### Graph Database Location

The graph database is stored at:

```
ENGRAM_DATA / "knowledge_graph.kuzu"
```

By default this resolves to `~/.local/share/engram/knowledge_graph.kuzu`. The directory is created automatically on first use.

### Installation

Knowledge graph support requires additional dependencies:

```bash
pip install "engram[graph]"
```

This installs `kuzu` (embedded graph database) and the LLM provider SDK(s) needed for entity extraction.

---

## Optional Dependency Groups

Engram uses Python optional dependency groups to keep the core install lightweight. Install only what you need:

| Install Command | What You Get |
|---|---|
| `pip install -e .` | Core only: MCP server, keyword search, librarian, time-decay scoring. |
| `pip install -e ".[vector]"` | Core + vector/semantic search (sentence-transformers). |
| `pip install -e ".[graph]"` | Core + knowledge graph (KuzuDB, LLM extraction). |
| `pip install -e ".[vector,graph]"` | Core + both vector search and knowledge graph. |
| `pip install -e ".[dev]"` | Core + development/test tools (pytest, pytest-asyncio). |
| `pip install -e ".[clipboard]"` | Core + clipboard integration (pyperclip). |

Multiple groups can be combined freely, e.g.:

```bash
pip install -e ".[vector,graph,dev]"
```

---

## Docker Configuration

### Services

The `docker-compose.yml` defines two services:

#### Hub (`engram-hub`)

| Setting | Value |
|---|---|
| Container name | `engram-hub` |
| Exposed port | `8000:8000` |
| Health check | `GET http://localhost:8000/health` (10s interval, 5s timeout, 3 retries) |
| Restart policy | `unless-stopped` |

Volumes:

| Host Path | Container Path | Purpose |
|---|---|---|
| `./src/engram` | `/app/src/engram` | Source code (live reload) |
| `./engram.db` | `/app/engram.db` | SQLite database |
| `./config` | `/app/config` | Configuration files |
| `./entities` | `/app/entities` | Entity markdown files |
| `./telemetry` | `/app/telemetry` | Telemetry data |

#### Synapse (`engram-synapse`)

| Setting | Value |
|---|---|
| Image | `matrixdotorg/synapse:latest` |
| Container name | `engram-synapse` |
| Exposed ports | `8008:8008` (client API), `8448:8448` (federation) |
| Server name | `engram.local` |
| Health check | `GET http://localhost:8008/_synapse/admin/v1/server_version` (15s interval) |
| Restart policy | `unless-stopped` |
| GPU | Requests 1 NVIDIA GPU (optional, uses `deploy.resources.reservations`) |

Volumes:

| Host Path | Container Path | Purpose |
|---|---|---|
| `./matrix` | `/data` | Synapse persistent data |
| `./docker/synapse-entrypoint.sh` | `/entrypoint.sh` (read-only) | Custom entrypoint script |

### Network

All services share the `engram-mesh` bridge network.

### Port Summary

| Port | Service | Protocol |
|---|---|---|
| 8000 | Engram Hub (FastAPI) | HTTP |
| 8003 | WebSocket Gateway | WebSocket |
| 8008 | Matrix/Synapse (client API) | HTTP |
| 8448 | Matrix/Synapse (federation) | HTTPS |

---

## System Requirements

### Required

| Dependency | Minimum Version | Purpose |
|---|---|---|
| Python | 3.10+ | Runtime. Tested on 3.10, 3.11, 3.12. |
| ripgrep (`rg`) | Any recent | Full-text search across entity files. The server returns HTTP 500 if `rg` is not on `PATH`. |
| SQLite | 3.35+ with FTS5 | Metadata index (`vault_index.sqlite`) and worker database. FTS5 is required for tag/date filtering. |
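
These requirements can be verified from Python before starting the server. The sketch below uses only the standard library; the function names are hypothetical and this check is not part of Engram itself.

```python
import shutil
import sqlite3
import sys


def has_supported_python() -> bool:
    """True if the interpreter is Python 3.10 or newer."""
    return sys.version_info >= (3, 10)


def has_ripgrep() -> bool:
    """True if an `rg` executable is found on PATH."""
    return shutil.which("rg") is not None


def sqlite_version_ok(minimum: tuple = (3, 35, 0)) -> bool:
    """True if the SQLite library linked into Python meets `minimum`."""
    return sqlite3.sqlite_version_info >= minimum


def has_fts5() -> bool:
    """True if this SQLite build can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()
```

Note that `sqlite3.sqlite_version_info` reports the SQLite library linked into the Python interpreter, which can differ from the `sqlite3` command-line tool installed on the system.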

### Python Dependencies

Core (from `pyproject.toml`):

```
mcp
pydantic>=2.0
pyyaml>=6.0
```

Hub server (additional):

```
fastapi
uvicorn
sqlalchemy
```

WebSocket gateway (additional):

```
websockets
```

### Optional

| Dependency | Purpose |
|---|---|
| Docker + Docker Compose | Containerized deployment of Hub and Synapse. |
| NVIDIA GPU + drivers | GPU passthrough for Synapse container. Not required for Hub. |
| pyperclip | Clipboard integration (`pip install "engram[clipboard]"`). |
| slowapi | Rate limiting for Hub API endpoints (install separately). |

### Development

```bash
pip install "engram[dev]"
```

Installs: `pytest>=7.0`, `pytest-asyncio`, `types-PyYAML`.

---

## Example .env File

Save as `.env` in the project root. Docker Compose and shell scripts will source it automatically.

```shell
# =============================================================================
# Engram Environment Configuration
# =============================================================================

# --- Hub Server ---------------------------------------------------------------

# Bearer token for API authentication (leave empty for unauthenticated local dev)
ENGRAM_API_TOKEN=

# Bind address and port
ENGRAM_HOST=127.0.0.1
ENGRAM_PORT=8000

# CORS allowed origins (comma-separated)
ENGRAM_CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000

# --- WebSocket Gateway --------------------------------------------------------

# REQUIRED: Authentication token for WebSocket clients
GATEWAY_TOKEN=change-me-to-a-secure-random-string

# Gateway port
GATEWAY_PORT=8003

# Hub URL the gateway connects to
HUB_URL=http://localhost:8000

# --- Memory Tools (Phase 1) ---------------------------------------------------

# Time-decay half-life in days (default: 30.0)
IMPORTANCE_HALF_LIFE_DAYS=30.0

# Minimum importance score before pruning (default: 0.05)
IMPORTANCE_PRUNE_THRESHOLD=0.05

# --- Vector Search (Phase 2) --------------------------------------------------

# Enable semantic vector search (default: False)
VECTOR_ENABLED=False

# Embedding model (default: BAAI/bge-small-en-v1.5)
VECTOR_MODEL=BAAI/bge-small-en-v1.5

# Embedding dimensions — must match the model (default: 384)
VECTOR_DIMS=384

# --- Knowledge Graph (Phase 3) ------------------------------------------------

# Enable knowledge graph extraction (default: False)
GRAPH_ENABLED=False

# LLM provider for entity extraction: anthropic or openai
GRAPH_LLM_PROVIDER=anthropic

# Model for entity extraction
GRAPH_LLM_MODEL=claude-sonnet-4-20250514

# API key for the chosen provider (uncomment one)
# ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...

# --- Matrix / Synapse ---------------------------------------------------------

# Shared secret for Synapse user registration
MATRIX_REGISTRATION_SECRET=change-me-to-a-secure-random-string

# --- Docker -------------------------------------------------------------------

# Database URL (used inside Hub container)
DATABASE_URL=sqlite:///./engram.db
```

---

## Entry Points

Defined in `pyproject.toml`:

| Command | Target | Description |
|---|---|---|
| `engram-serve` | `engram.server:main` | Start the MCP server. |
| `engram-index` | `engram.librarian:main` | Run the Librarian indexer. |

### Running the Hub Server Directly

```bash
# Default: localhost:8000
python -m engram.server

# Custom host/port
ENGRAM_HOST=0.0.0.0 ENGRAM_PORT=9000 python -m engram.server

# With authentication
ENGRAM_API_TOKEN=my-secret python -m engram.server
```
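
When `ENGRAM_API_TOKEN` is set, clients send it as a Bearer token. The sketch below builds such a request with the standard library. It targets the `/health` path used by the Docker health check; whether that particular endpoint enforces authentication is an assumption, and `build_health_request` is a hypothetical helper.

```python
import urllib.request
from typing import Optional


def build_health_request(
    base_url: str = "http://localhost:8000", token: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request for /health, adding a Bearer token if one is given."""
    request = urllib.request.Request(f"{base_url}/health")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

To send it against a running server, pass the result to `urllib.request.urlopen(request, timeout=5)`.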

### Running the WebSocket Gateway

```bash
GATEWAY_TOKEN=my-secret python -m engram.gateway_websocket
```

### Running with Docker Compose

```bash
cp .env.example .env   # then edit the values in .env before starting
docker compose up -d
```