# Engram API Reference

Version 2.0.0

Engram exposes four interfaces: an **MCP tool set** (stdio transport; 10 tools in total: `get_session_context` plus nine tools across 3 phases) for AI agents, a **Hub API** (FastAPI/HTTP) for direct vault queries, a **Worker API** for edge fleet management, and a **WebSocket Gateway** for real-time dashboard communication.

---

## Table of Contents

- [Authentication](#authentication)
- [MCP Tool Interface](#mcp-tool-interface)
- [Hub API](#hub-api)
- [Worker API](#worker-api)
- [WebSocket Gateway](#websocket-gateway)
- [Error Codes](#error-codes)
- [Data Limits](#data-limits)
- [MCP Tools (Self-Editing Memory, Semantic Search, Knowledge Graph)](#mcp-tools-self-editing-memory-semantic-search-knowledge-graph)
  - [Phase 1 — Self-Editing Memory](#phase-1--self-editing-memory) (save, update, delete, search, consolidate, stats)
  - [Phase 2 — Semantic Search](#phase-2--semantic-search) (search_memory with semantic=True, engram-vectorize CLI)
  - [Phase 3 — Knowledge Graph](#phase-3--knowledge-graph) (graph_query, graph_ingest, graph_relationships)
- [Importance Scoring](#importance-scoring)

---

## Authentication

### Environment Variables

| Variable           | Used By                      | Required | Description                                      |
|--------------------|------------------------------|----------|--------------------------------------------------|
| `ENGRAM_API_TOKEN`  | Hub API, Worker API          | No       | Bearer token for HTTP endpoints                  |
| `GATEWAY_TOKEN`     | WebSocket Gateway            | Yes      | Token for WebSocket handshake authentication     |

### Bearer Token (Hub API and Worker API)

When `ENGRAM_API_TOKEN` is set, all protected endpoints require a Bearer token in the `Authorization` header:

```
Authorization: Bearer <your-token>
```

When `ENGRAM_API_TOKEN` is **not** set, all endpoints are accessible without authentication (local development mode). A warning is logged at startup.
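The token rule above can be mirrored on the client side. The following is a minimal sketch, not part of Engram itself; the helper name `auth_headers` is hypothetical, and it assumes the client reads the same `ENGRAM_API_TOKEN` variable as the server.

```python
import os


def auth_headers(token=None):
    """Return HTTP headers for Engram's protected endpoints.

    Hypothetical helper: if no token is passed, fall back to the
    ENGRAM_API_TOKEN environment variable. If neither is available,
    send no Authorization header (local development mode).
    """
    if token is None:
        token = os.environ.get("ENGRAM_API_TOKEN")
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Passing the returned dictionary to any HTTP client keeps the same code working against both authenticated and unauthenticated deployments.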

### Protected vs. Unprotected Endpoints

| Endpoint                          | Auth Required |
|-----------------------------------|---------------|
| Hub: `GET /`                      | No            |
| Hub: `GET /health`                | No            |
| Hub: `POST /search`              | Yes           |
| Hub: `GET /status`               | Yes           |
| Hub: `GET /docs`                 | No            |
| Worker: `GET /health`            | No            |
| Worker: `POST /api/v1/workers/heartbeat` | Yes    |
| Worker: `GET /api/v1/workers`    | Yes           |
| Worker: `GET /api/v1/workers/{hostname}` | Yes    |
| Worker: `POST /api/v1/workers/cleanup`   | Yes    |
| WebSocket Gateway                | Yes (via handshake token) |

---

## MCP Tool Interface

The primary interface for AI agents. Communicates over **stdio** using the Model Context Protocol.

### Tool: `get_session_context`

Search the Engram vault (entity archive of session files). Returns matching text from `*.md` entity files using ripgrep.

**Input Schema (from `SessionSearchInput` Pydantic model):**

| Field        | Type            | Required | Default | Constraints                              | Description                                          |
|--------------|-----------------|----------|---------|------------------------------------------|------------------------------------------------------|
| `query`      | `string`        | No*      | `""`    | Max 500 characters; regex syntax allowed | Search term. Empty string valid only with other filters. |
| `session_id` | `string \| null` | No       | `null`  | Pattern: `^[A-Za-z0-9_.-]+$`            | Filename prefix to scope the search (glob filter).   |
| `tags`       | `list[string] \| null` | No | `null`  | --                                       | Filter by topic tags (requires metadata index).      |
| `date_from`  | `string \| null` | No       | `null`  | Format: `YYYY-MM-DD`                    | ISO-8601 date lower bound.                           |
| `date_to`    | `string \| null` | No       | `null`  | Format: `YYYY-MM-DD`                    | ISO-8601 date upper bound.                           |

*At least one of `query`, `session_id`, `tags`, `date_from`, or `date_to` must be provided. Omitting all fields returns a validation error.

**Output:** A `CallToolResult` containing a `TextContent` object with:
- On success: ripgrep output (file paths and matching lines), truncated to 4000 characters.
- On no match: `"No match for '<query>'. <stderr>"`
- On validation error: `"Validation error: <details>"`
- On missing ripgrep: `"ripgrep (rg) is not installed or not on PATH."`
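
A caller may want to check input before invoking the tool. The sketch below restates the documented constraints in plain Python; it is an illustration, not the actual `SessionSearchInput` model, and the function name `check_search_input` and its messages are assumptions.

```python
import re
from datetime import date

MAX_QUERY_CHARS = 500
SESSION_ID_RE = re.compile(r"[A-Za-z0-9_.-]+")
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_search_input(query="", session_id=None, tags=None,
                       date_from=None, date_to=None):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    # At least one of the five fields must carry a value.
    if not (query or session_id or tags or date_from or date_to):
        problems.append("at least one field must be provided")
    if len(query) > MAX_QUERY_CHARS:
        problems.append("query exceeds 500 characters")
    if session_id is not None and not SESSION_ID_RE.fullmatch(session_id):
        problems.append("session_id has invalid characters")
    for name, value in (("date_from", date_from), ("date_to", date_to)):
        if value is None:
            continue
        try:
            if not DATE_RE.fullmatch(value):
                raise ValueError(value)
            date.fromisoformat(value)  # rejects impossible dates such as 2026-02-30
        except ValueError:
            problems.append(f"{name} is not YYYY-MM-DD")
    return problems
```

For example, `check_search_input(query="memory")` returns an empty list, while calling it with no arguments reports that at least one field is required.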

### Client Configuration

**Claude Desktop** -- add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "engram": {
      "command": "engram-serve"
    }
  }
}
```

**Claude Code** -- add to `.mcp.json` in your project root:

```json
{
  "mcpServers": {
    "engram": {
      "command": "engram-serve",
      "type": "stdio"
    }
  }
}
```

**AI agent** -- add to your AI agent config:

```yaml
extensions:
  engram:
    command: engram-serve
    type: stdio
```

After configuration, your agent can call `get_session_context(query="memory")` to search the vault.
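
On the wire, such a call is a JSON-RPC 2.0 request using the Model Context Protocol's `tools/call` method. The sketch below only builds that message; the helper name `tool_call_message` is hypothetical, and in practice an MCP client library sends it after completing the protocol's `initialize` handshake.

```python
import json


def tool_call_message(request_id, tool_name, arguments):
    """Serialize an MCP `tools/call` request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Equivalent of get_session_context(query="memory")
line = tool_call_message(1, "get_session_context", {"query": "memory"})
```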

---

## Hub API

FastAPI application running on `http://127.0.0.1:8000` by default. Interactive docs (Swagger UI) are available at `/docs`, and the OpenAPI schema at `/openapi.json`.

**Configuration environment variables:**

| Variable              | Default               | Description                     |
|-----------------------|-----------------------|---------------------------------|
| `ENGRAM_HOST`         | `127.0.0.1`           | Bind address                    |
| `ENGRAM_PORT`         | `8000`                | Listen port                     |
| `ENGRAM_API_TOKEN`    | *(unset)*             | Bearer token for auth           |
| `ENGRAM_CORS_ORIGINS` | `http://localhost:3000,http://127.0.0.1:3000` | Comma-separated allowed origins |

### GET /

Dashboard endpoint. Returns service information and available endpoints.

**Auth:** None

**Response:**

```json
{
  "service": "Engram Hub",
  "status": "online",
  "version": "1.0.0",
  "description": "Sovereign knowledge vault with Tailscale mesh coordination",
  "endpoints": {
    "health": "/health",
    "status": "/status",
    "search": "/search (POST)",
    "docs": "/docs",
    "openapi": "/openapi.json"
  }
}
```

**Example:**

```bash
curl http://localhost:8000/
```

---

### GET /health

Health check. Returns operational status, timestamp, and version.

**Auth:** None

**Response (`HealthResponse`):**

| Field       | Type     | Description               |
|-------------|----------|---------------------------|
| `status`    | `string` | Always `"online"`         |
| `timestamp` | `string` | UTC ISO-8601 timestamp    |
| `version`   | `string` | Server version (`"1.0.0"`) |

**Example:**

```bash
curl http://localhost:8000/health
```

```json
{
  "status": "online",
  "timestamp": "2026-04-03T12:00:00.000000",
  "version": "1.0.0"
}
```

---

### POST /search

Search the Engram vault for sessions matching a query and/or metadata filters.

**Auth:** Bearer token (when `ENGRAM_API_TOKEN` is set)

**Request body (`SessionSearchInput`):**

| Field        | Type            | Required | Constraints                              | Description                                          |
|--------------|-----------------|----------|------------------------------------------|------------------------------------------------------|
| `query`      | `string`        | No*      | Max 500 chars; must be valid regex       | Text search term.                                    |
| `session_id` | `string \| null` | No       | Pattern: `^[A-Za-z0-9_.-]+$`            | Glob filter on filenames.                            |
| `tags`       | `list[string] \| null` | No | --                                       | Metadata tag filter (requires SQLite index).         |
| `date_from`  | `string \| null` | No       | `YYYY-MM-DD`                             | Date lower bound (metadata index).                   |
| `date_to`    | `string \| null` | No       | `YYYY-MM-DD`                             | Date upper bound (metadata index).                   |

*At least one field must be provided.

**Response (`SearchResponse`):**

| Field        | Type     | Description                                |
|--------------|----------|--------------------------------------------|
| `query`      | `string` | The query that was executed                |
| `results`    | `string` | Matching text (truncated to 4000 chars)    |
| `elapsed_ms` | `float`  | Query execution time in milliseconds       |

**Behavior:**

- Text query with no metadata filters: runs `rg --no-heading --with-filename <query> <entities_dir>`.
- Text query with `session_id`: adds `--glob *<session_id>*` to restrict filenames.
- Text query with `tags`/`date_from`/`date_to`: resolves candidate files from the SQLite metadata index, then runs ripgrep only on those files.
- Metadata-only query (tags/dates without text query): returns first 500 chars of each matching file.
- Invalid regex in `query` returns 400.
- Over-long queries return 400. Two limits apply: the `SessionSearchInput` model caps `query` at 500 characters, and the HTTP handler enforces its own 1000-character limit.
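
The first two behaviors can be sketched as a function that assembles the ripgrep argument list. This is an illustration based only on the flags named above, not the Hub's actual code; `build_rg_command` is a hypothetical name, and the position of `--glob` relative to the pattern is an assumption.

```python
def build_rg_command(query, entities_dir, session_id=None):
    """Assemble a ripgrep argv list for a text query (illustrative only)."""
    cmd = ["rg", "--no-heading", "--with-filename"]
    if session_id:
        # Restrict the search to filenames containing the session ID.
        cmd += ["--glob", f"*{session_id}*"]
    cmd += [query, entities_dir]
    return cmd
```

Building an argument list rather than a shell string means the query is passed to `rg` as a single argument when used with `subprocess.run(cmd)`, so no shell quoting is involved.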

**Examples:**

```bash
# Simple text search
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN" \
  -d '{"query": "memory"}'

# Search scoped to a session
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN" \
  -d '{"query": "edge", "session_id": "Session_42"}'

# Metadata-only: filter by tags
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN" \
  -d '{"tags": ["edge"], "date_from": "2026-03-01", "date_to": "2026-03-31"}'

# Regex search
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN" \
  -d '{"query": "tailscale.*mesh|headscale"}'
```
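
The same requests can be issued from Python with the standard library. This is a minimal sketch under the defaults documented above; the helper names `build_search_request` and `search` are hypothetical, and the base URL is assumed to be the local default.

```python
import json
import urllib.request


def build_search_request(payload, token=None, base_url="http://localhost:8000"):
    """Build a POST /search request without sending it."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def search(payload, token=None):
    """Send the request and decode the SearchResponse JSON body."""
    request = build_search_request(payload, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With a Hub running locally, `search({"query": "memory"})["results"]` would hold the matching text; non-2xx responses raise `urllib.error.HTTPError`.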

---

### GET /status

Detailed vault status including file paths and configuration.

**Auth:** Bearer token (when `ENGRAM_API_TOKEN` is set)

**Response:**

| Field           | Type     | Description                         |
|-----------------|----------|-------------------------------------|
| `service`       | `string` | `"engram-hub"`                      |
| `status`        | `string` | `"online"`                          |
| `entities_path` | `string` | Absolute path to entities directory |
| `index_path`    | `string` | Absolute path to SQLite index       |
| `timestamp`     | `string` | UTC ISO-8601 timestamp              |

**Example:**

```bash
curl http://localhost:8000/status \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN"
```

```json
{
  "service": "engram-hub",
  "status": "online",
  "entities_path": "/path/to/engram/entities",
  "index_path": "/path/to/engram/vault_index.sqlite",
  "timestamp": "2026-04-03T12:00:00.000000"
}
```

---

## Worker API

FastAPI application for managing edge worker fleet heartbeats. Uses a SQLite database (`~/engram/workers.sqlite`) for persistence. The examples below assume the service is reachable at `http://localhost:8001`.

### POST /api/v1/workers/heartbeat

Register or update an edge worker heartbeat. Creates the worker if it does not exist; updates status to `"online"` if it does.

**Auth:** Bearer token (when `ENGRAM_API_TOKEN` is set)

**Request body (`WorkerHeartbeatRequest`):**

| Field          | Type     | Required | Description                          |
|----------------|----------|----------|--------------------------------------|
| `hostname`     | `string` | Yes      | Edge hostname (unique identifier)    |
| `tailscale_ip` | `string` | Yes      | Tailscale IP address                 |
| `local_ip`     | `string` | No       | Optional local network IP            |
| `ram_gb`       | `float`  | Yes      | Total RAM in GB                      |
| `gpu_type`     | `string` | Yes      | GPU type (e.g., `"edge device"`) |

**Response (`HeartbeatResponse`):**

| Field       | Type     | Description                                |
|-------------|----------|--------------------------------------------|
| `status`    | `string` | `"accepted"`                               |
| `worker_id` | `int`    | Database ID of the worker                  |
| `message`   | `string` | Confirmation message                       |

**Example:**

```bash
curl -X POST http://localhost:8001/api/v1/workers/heartbeat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN" \
  -d '{
    "hostname": "edge-nano-01",
    "tailscale_ip": "100.64.0.5",
    "local_ip": "192.168.1.50",
    "ram_gb": 8.0,
    "gpu_type": "edge device"
  }'
```

```json
{
  "status": "accepted",
  "worker_id": 1,
  "message": "Heartbeat received from edge-nano-01"
}
```

---

### GET /api/v1/workers

List all registered workers with their current status.

**Auth:** Bearer token (when `ENGRAM_API_TOKEN` is set)

**Response:** Array of `WorkerResponse` objects:

| Field             | Type       | Description                                |
|-------------------|------------|--------------------------------------------|
| `id`              | `int`      | Database ID                                |
| `hostname`        | `string`   | Edge hostname                              |
| `tailscale_ip`    | `string`   | Tailscale IP address                       |
| `status`          | `string`   | `"online"`, `"offline"`, or `"unhealthy"`  |
| `gpu_type`        | `string`   | GPU type                                   |
| `ram_gb`          | `float`    | Total RAM in GB                            |
| `last_heartbeat`  | `datetime` | Last heartbeat timestamp (UTC)             |
| `created_at`      | `datetime` | Worker registration timestamp (UTC)        |

**Example:**

```bash
curl http://localhost:8001/api/v1/workers \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN"
```

---

### GET /api/v1/workers/{hostname}

Get details for a specific worker by hostname.

**Auth:** Bearer token (when `ENGRAM_API_TOKEN` is set)

**Path parameters:**

| Parameter  | Type     | Description        |
|------------|----------|--------------------|
| `hostname` | `string` | Worker hostname    |

**Response:** Single `WorkerResponse` object (same schema as list endpoint).

**Example:**

```bash
curl http://localhost:8001/api/v1/workers/edge-nano-01 \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN"
```

Returns `404` if the hostname is not found.

---

### POST /api/v1/workers/cleanup

Mark workers as offline if their last heartbeat exceeds a staleness threshold.

**Auth:** Bearer token (when `ENGRAM_API_TOKEN` is set)

**Query parameters:**

| Parameter | Type  | Default | Description                              |
|-----------|-------|---------|------------------------------------------|
| `minutes` | `int` | `60`    | Staleness threshold in minutes           |

**Response (`CleanupResponse`):**

| Field                   | Type     | Description                        |
|-------------------------|----------|------------------------------------|
| `status`                | `string` | `"completed"`                      |
| `workers_marked_offline`| `int`    | Number of workers marked offline   |

**Example:**

```bash
# Mark workers offline if no heartbeat in 30 minutes
curl -X POST "http://localhost:8001/api/v1/workers/cleanup?minutes=30" \
  -H "Authorization: Bearer $ENGRAM_API_TOKEN"
```

```json
{
  "status": "completed",
  "workers_marked_offline": 2
}
```
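
The staleness rule can be sketched as a pure function over worker records. This is an illustration, not the Worker API's implementation; `stale_hostnames` is a hypothetical name, and it assumes that only workers currently marked `"online"` are candidates.

```python
from datetime import datetime, timedelta, timezone


def stale_hostnames(workers, minutes=60, now=None):
    """Return hostnames whose last heartbeat is older than `minutes`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=minutes)
    return [
        worker["hostname"]
        for worker in workers
        # Assumption: workers that are already offline are not counted again.
        if worker["status"] == "online" and worker["last_heartbeat"] < cutoff
    ]
```

Passing `now` explicitly keeps the function deterministic, which makes the threshold easy to test.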

---

### GET /health (Worker API)

Basic health check for the Worker API.

**Auth:** None

**Response:**

```json
{
  "status": "healthy"
}
```

**Example:**

```bash
curl http://localhost:8001/health
```

---

## WebSocket Gateway

Real-time bidirectional communication for the OpenClaw Nerve Dashboard. Uses the `websockets` library (not FastAPI).

**Default port:** 8003 (configurable via `GATEWAY_PORT`)

### Configuration

| Variable        | Default                   | Required | Description                        |
|-----------------|---------------------------|----------|------------------------------------|
| `GATEWAY_TOKEN` | *(none)*                  | Yes      | Authentication token for handshake |
| `GATEWAY_PORT`  | `8003`                    | No       | WebSocket listen port              |
| `HUB_URL`       | `http://localhost:8000`   | No       | Engram Hub URL for proxying        |

### Connection Protocol

**Step 1: Connect**

```
ws://localhost:8003/
```

**Step 2: Authenticate (within 5 seconds)**

Send a JSON handshake message:

```json
{
  "token": "<GATEWAY_TOKEN value>"
}
```

**Step 3: Receive handshake response**

On success:

```json
{
  "type": "handshake",
  "status": "connected",
  "timestamp": "2026-04-03T12:00:00.000000",
  "agents": {
    "librarian": "online",
    "consolidator": "online",
    "hub": "online"
  },
  "gateway_version": "1.0.0"
}
```

On failure (invalid token):

```json
{
  "type": "error",
  "message": "Invalid authentication token"
}
```

On timeout (no handshake within 5s):

```json
{
  "type": "error",
  "message": "Handshake timeout"
}
```
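
A client can turn the three handshake outcomes into either a usable result or an exception. The sketch below works on the raw JSON text shown above; `parse_handshake` and `HandshakeError` are hypothetical names, not part of the gateway.

```python
import json


class HandshakeError(Exception):
    """Raised when the gateway rejects or times out the handshake."""


def parse_handshake(raw):
    """Return the `agents` map from a successful handshake, else raise."""
    message = json.loads(raw)
    if message.get("type") == "handshake" and message.get("status") == "connected":
        return message.get("agents", {})
    # Error replies carry a human-readable `message` field.
    raise HandshakeError(message.get("message", "unexpected handshake response"))
```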

### Message Types

All messages are JSON objects with a `type` field.

#### Client to Server

**ping** -- Heartbeat check:

```json
{"type": "ping"}
```

Response:

```json
{
  "type": "pong",
  "timestamp": "2026-04-03T12:00:00.000000"
}
```

**status** -- Get agent and connection status:

```json
{"type": "status"}
```

Response:

```json
{
  "type": "status",
  "agents": {
    "librarian": "online",
    "consolidator": "online",
    "hub": "online"
  },
  "connected_clients": 3,
  "timestamp": "2026-04-03T12:00:00.000000"
}
```

**agent_command** -- Send a command to a specific agent:

```json
{
  "type": "agent_command",
  "agent": "librarian",
  "command": "reindex"
}
```

Response:

```json
{
  "type": "agent_response",
  "agent": "librarian",
  "command": "reindex",
  "status": "executed",
  "result": {"message": "Command 'reindex' sent to librarian"},
  "timestamp": "2026-04-03T12:00:00.000000"
}
```

#### Server to Client (Errors)

Unknown message types return:

```json
{
  "type": "error",
  "message": "Unknown message type: <type>"
}
```

Invalid JSON returns:

```json
{
  "type": "error",
  "message": "Invalid JSON"
}
```
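
The request/response pairs above amount to a small dispatch on the `type` field. The following sketch mimics that behavior for `ping`, `status`, and the two error cases, which can be handy for a test double; it is not the gateway's code, `reply_to` is a hypothetical name, `agent_command` is omitted, and the timestamp here includes a UTC offset, unlike the samples above.

```python
import json
from datetime import datetime, timezone


def reply_to(raw, agents, connected_clients):
    """Build the reply a gateway-like server would send for one client frame."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return {"type": "error", "message": "Invalid JSON"}
    now = datetime.now(timezone.utc).isoformat()
    kind = message.get("type") if isinstance(message, dict) else None
    if kind == "ping":
        return {"type": "pong", "timestamp": now}
    if kind == "status":
        return {"type": "status", "agents": agents,
                "connected_clients": connected_clients, "timestamp": now}
    return {"type": "error", "message": f"Unknown message type: {kind}"}
```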

### WebSocket Server Parameters

| Parameter        | Value | Description                             |
|------------------|-------|-----------------------------------------|
| `ping_interval`  | 20s   | Server sends WebSocket ping frames      |
| `ping_timeout`   | 10s   | Close connection if pong not received   |

### Example (Python)

```python
import asyncio
import json
import websockets

async def connect():
    async with websockets.connect("ws://localhost:8003/") as ws:
        # Authenticate
        await ws.send(json.dumps({"token": "your-gateway-token"}))
        handshake = json.loads(await ws.recv())
        print("Connected:", handshake)

        # Ping
        await ws.send(json.dumps({"type": "ping"}))
        pong = json.loads(await ws.recv())
        print("Pong:", pong)

        # Get status
        await ws.send(json.dumps({"type": "status"}))
        status = json.loads(await ws.recv())
        print("Status:", status)

asyncio.run(connect())
```

---

## Error Codes

### HTTP Status Codes (Hub API and Worker API)

| Code | Meaning                | When Returned                                                        |
|------|------------------------|----------------------------------------------------------------------|
| 200  | OK                     | Successful request                                                   |
| 400  | Bad Request            | Invalid regex in query; query too long; invalid `session_id` format; invalid date format; missing all filter fields |
| 401  | Unauthorized           | Missing or invalid Bearer token (when `ENGRAM_API_TOKEN` is set)     |
| 404  | Not Found              | No matching sessions (metadata-only query); worker hostname not found |
| 422  | Unprocessable Entity   | Pydantic validation failure on request body (FastAPI default)        |
| 500  | Internal Server Error  | ripgrep not installed; database error during heartbeat processing    |
| 503  | Service Unavailable    | Metadata index (`vault_index.sqlite`) not available for metadata-only queries |

### WebSocket Close Codes

| Code | Meaning              | When Used                           |
|------|----------------------|-------------------------------------|
| 1008 | Policy Violation     | Invalid JSON in handshake message   |

### MCP Tool Error Responses

MCP tools do not use HTTP status codes. Errors are returned as `TextContent` in the `CallToolResult`; the patterns below apply to `get_session_context`:

| Error Pattern                            | Cause                                     |
|------------------------------------------|-------------------------------------------|
| `"Validation error: ..."`               | Pydantic validation failure on input      |
| `"No match for '<query>'. ..."`         | ripgrep returned no results               |
| `"ripgrep (rg) is not installed..."`    | `rg` binary not found on PATH             |
| `"Unknown tool"`                         | Tool name is not a registered tool        |
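
Because errors arrive as plain text, a caller has to recognize them by prefix. The sketch below maps the patterns in the table to short labels; `classify_tool_text` and the label names are assumptions, not part of Engram.

```python
def classify_tool_text(text):
    """Map the text of a tool result to a coarse outcome label."""
    prefixes = (
        ("Validation error:", "validation_error"),
        ("No match for ", "no_match"),
        ("ripgrep (rg) is not installed", "ripgrep_missing"),
        ("Unknown tool", "unknown_tool"),
    )
    for prefix, label in prefixes:
        if text.startswith(prefix):
            return label
    # Anything else is treated as ordinary search output.
    return "ok"
```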

---

## Data Limits

| Limit                  | Value  | Configured In       |
|------------------------|--------|---------------------|
| Max query length       | 500    | `engram.config`     |
| Max response chars     | 4000   | `engram.config`     |
| Max HTTP query length  | 1000   | `server.py`         |
| Session ID pattern     | `^[A-Za-z0-9_.-]+$` | `engram.models` |
| Date format            | `YYYY-MM-DD`         | `engram.models` |
| Metadata snippet size  | 500 chars per file   | `server.py`     |

---

## MCP Tools (Self-Editing Memory, Semantic Search, Knowledge Graph)

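In addition to `get_session_context`, Engram exposes nine MCP tools across three phases: six in Phase 1 and three in Phase 3. Phase 2 adds a semantic mode to `search_memory` and a CLI rather than a new tool. All tools use the same stdio transport as `get_session_context` above.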

---

### Phase 1 — Self-Editing Memory

#### Tool: `save_memory`

Save a new entity to the vault.

**Parameters:**

| Name         | Type     | Required | Default | Description                                        |
|--------------|----------|----------|---------|----------------------------------------------------|
| `content`    | `string` | Yes      | —       | Full text content of the memory entity             |
| `topics`     | `list[string]` | Yes | —       | Topic tags for indexing and retrieval              |
| `importance` | `float`  | No       | *auto*  | Override importance score (0.0–1.0). When omitted, computed automatically via `compute_importance_score`. |
| `summary`    | `string` | No       | `""`    | One-line summary used in search result previews    |

**Return:** JSON object:

```json
{
  "path": "entities/<generated-filename>.md",
  "importance": 0.72,
  "topics": ["edge", "deployment"],
  "summary": "edge device provisioning steps"
}
```

---

#### Tool: `update_memory`

Update an existing entity in-place. Only the fields you provide are changed; omitted fields are preserved.

**Parameters:**

| Name         | Type     | Required | Default | Description                                        |
|--------------|----------|----------|---------|----------------------------------------------------|
| `path`       | `string` | Yes      | —       | Relative path to entity file (e.g., `entities/foo.md`) |
| `content`    | `string` | No       | *keep*  | Replacement content                                |
| `topics`     | `list[string]` | No | *keep*  | Replacement topic tags                             |
| `importance` | `float`  | No       | *keep*  | Override importance score (0.0–1.0)                |
| `summary`    | `string` | No       | *keep*  | Replacement summary                                |

**Return:** JSON object:

```json
{
  "path": "entities/foo.md",
  "updated_fields": ["content", "topics"],
  "importance": 0.72
}
```

Returns an error if `path` does not exist.
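
The partial-update rule (only provided fields change) can be sketched on an in-memory record. This is an illustration of the semantics, not the tool's implementation; `apply_update` is a hypothetical name, and treating `None` as "not provided" is an assumption.

```python
def apply_update(entity, content=None, topics=None, importance=None, summary=None):
    """Return (updated_entity, updated_fields) without mutating `entity`."""
    changes = {"content": content, "topics": topics,
               "importance": importance, "summary": summary}
    # Keep only the fields the caller actually supplied.
    provided = {name: value for name, value in changes.items() if value is not None}
    if "importance" in provided and not 0.0 <= provided["importance"] <= 1.0:
        raise ValueError("importance must be between 0.0 and 1.0")
    return {**entity, **provided}, list(provided)
```

For example, updating only `content` and `topics` leaves the stored `importance` and `summary` untouched and yields `["content", "topics"]` as the list of updated fields.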

---

#### Tool: `delete_memory`

Soft-delete (archive) or hard-delete an entity.

**Parameters:**

| Name      | Type     | Required | Default | Description                                              |
|-----------|----------|----------|---------|----------------------------------------------------------|
| `path`    | `string` | Yes      | —       | Relative path to entity file                             |
| `archive` | `bool`   | No       | `True`  | When `True`, moves to `entities/.archive/`. When `False`, permanently deletes the file. |

**Return:** JSON object:

```json
{
  "path": "entities/foo.md",
  "action": "archived",
  "archive_path": "entities/.archive/foo.md"
}
```

Or for hard delete:

```json
{
  "path": "entities/foo.md",
  "action": "deleted"
}
```
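
The difference between the two modes can be sketched with the standard library. This is an illustration of the documented behavior, not Engram's implementation; `delete_entity` and the `vault_root` parameter are hypothetical.

```python
import shutil
from pathlib import Path


def delete_entity(vault_root, rel_path, archive=True):
    """Archive (default) or permanently delete an entity file."""
    source = Path(vault_root) / rel_path
    if not source.is_file():
        raise FileNotFoundError(rel_path)
    if not archive:
        source.unlink()  # hard delete: the file is gone for good
        return {"path": rel_path, "action": "deleted"}
    # Soft delete: move the file into a sibling .archive directory.
    archive_dir = source.parent / ".archive"
    archive_dir.mkdir(exist_ok=True)
    target = archive_dir / source.name
    shutil.move(str(source), str(target))
    return {"path": rel_path, "action": "archived",
            "archive_path": target.relative_to(vault_root).as_posix()}
```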

---

#### Tool: `search_memory`

Unified search across the vault. Supports text, tag filtering, and optional semantic (vector) search.

**Parameters:**

| Name       | Type            | Required | Default | Description                                              |
|------------|-----------------|----------|---------|----------------------------------------------------------|
| `query`    | `string`        | Yes      | —       | Search query (text or natural language for semantic mode) |
| `tags`     | `list[string]`  | No       | `null`  | Filter results to entities matching these topic tags     |
| `limit`    | `int`           | No       | `10`    | Maximum number of results to return                      |
| `semantic` | `bool`          | No       | `False` | When `True`, uses vector KNN search (Phase 2). When `False`, uses ripgrep text search. |

**Return:** JSON object:

```json
{
  "query": "edge deployment",
  "mode": "semantic",
  "results": [
    {
      "path": "entities/Chatrecall_Session_42.md",
      "score": 0.91,
      "summary": "edge device provisioning steps",
      "snippet": "First 200 chars of matching content..."
    }
  ],
  "count": 1,
  "elapsed_ms": 12.4
}
```

The `score` field is a cosine similarity (0.0–1.0) in semantic mode, or a relevance rank in text mode.

---

#### Tool: `consolidate_memory`

Trigger a Dream Cycle: merge, deduplicate, and re-score entities in the vault.

**Parameters:**

| Name           | Type           | Required | Default | Description                                              |
|----------------|----------------|----------|---------|----------------------------------------------------------|
| `tags`         | `list[string]` | No       | `null`  | Scope consolidation to entities matching these tags. When `null`, consolidates the entire vault. |
| `max_entities` | `int`          | No       | `100`   | Maximum number of entities to process in one cycle       |

**Return:** JSON object:

```json
{
  "processed": 47,
  "merged": 5,
  "archived": 3,
  "rescored": 47,
  "elapsed_ms": 2340.0
}
```

---

#### Tool: `get_memory_stats`

Return vault statistics.

**Parameters:** None.

**Return:** JSON object:

```json
{
  "total_entities": 312,
  "total_archived": 28,
  "by_topic": {
    "edge": 45,
    "deployment": 38,
    "memory": 22
  },
  "avg_importance": 0.54,
  "vault_size_bytes": 1048576,
  "last_consolidation": "2026-04-03T08:00:00.000000"
}
```

---

### Phase 2 — Semantic Search

Semantic search is accessed through the `search_memory` tool (above) by setting `semantic=True`. Under the hood this uses a vector index built from entity embeddings.
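
To make the idea concrete, here is a brute-force cosine-similarity KNN over a tiny in-memory index. It is a conceptual sketch only: the embedding model and vector store Engram actually uses are not described here, and `cosine_similarity` and `knn` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn(query_vec, index, limit=10):
    """Return the `limit` most similar (path, score) pairs, best first.

    `index` maps an entity path to its embedding vector.
    """
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]
```

A real vector index avoids scanning every embedding, but the ranking criterion is the same.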

#### CLI: `engram-vectorize`

Bulk-index all entities into the vector store. Run this after importing entities or when the index is stale.

```bash
# Index all entities
engram-vectorize

# Re-index from scratch (drops existing index)
engram-vectorize --rebuild

# Index only entities matching a glob
engram-vectorize --glob "entities/Chatrecall_*.md"
```

| Flag        | Default | Description                                      |
|-------------|---------|--------------------------------------------------|
| `--rebuild` | `False` | Drop and recreate the vector index               |
| `--glob`    | `"entities/*.md"` | Glob pattern to select which entities to index |

Output:

```
Indexed 312 entities in 4.2s
Vector store: /path/to/engram/vectors/
```

---

### Phase 3 — Knowledge Graph

#### Tool: `graph_query`

Query the knowledge graph using natural language. Returns entities and relationships matching the question.

**Parameters:**

| Name       | Type     | Required | Default | Description                                              |
|------------|----------|----------|---------|----------------------------------------------------------|
| `question` | `string` | Yes      | —       | Natural language question (e.g., "What connects edge to Tailscale?") |
| `limit`    | `int`    | No       | `10`    | Maximum number of result triples to return               |

**Return:** JSON object:

```json
{
  "question": "What connects edge to Tailscale?",
  "triples": [
    {
      "subject": "edge device",
      "predicate": "connects_via",
      "object": "Tailscale VPN",
      "weight": 0.88,
      "source": "entities/Chatrecall_Session_42.md"
    }
  ],
  "count": 1,
  "elapsed_ms": 45.2
}
```

---

#### Tool: `graph_ingest`

Extract entities and relationships from a memory file and add them to the knowledge graph.

**Parameters:**

| Name   | Type     | Required | Default | Description                                              |
|--------|----------|----------|---------|----------------------------------------------------------|
| `path` | `string` | Yes      | —       | Relative path to entity file (e.g., `entities/foo.md`)   |

**Return:** JSON object:

```json
{
  "path": "entities/foo.md",
  "entities_extracted": 4,
  "relationships_extracted": 6,
  "elapsed_ms": 120.5
}
```

Returns an error if `path` does not exist.

---

#### Tool: `graph_relationships`

Get all relationships for a named entity in the knowledge graph.

**Parameters:**

| Name          | Type     | Required | Default | Description                                              |
|---------------|----------|----------|---------|----------------------------------------------------------|
| `entity_name` | `string` | Yes      | —       | Name of the entity to look up (case-insensitive)         |

**Return:** JSON object:

```json
{
  "entity": "edge device",
  "relationships": [
    {
      "predicate": "connects_via",
      "target": "Tailscale VPN",
      "weight": 0.88,
      "source": "entities/Chatrecall_Session_42.md"
    },
    {
      "predicate": "runs_on",
      "target": "Ubuntu 22.04",
      "weight": 0.95,
      "source": "entities/Chatrecall_Session_15.md"
    }
  ],
  "count": 2
}
```

Returns an empty `relationships` array if the entity is not found in the graph.

---

## Importance Scoring

Engram automatically scores entity importance using `compute_importance_score`. The score is a float from 0.0 to 1.0.

### Formula

```
importance = clamp(
    w_length   * length_signal
  + w_topics   * topic_signal
  + w_links    * link_signal
  + w_recency  * recency_signal
  + w_access   * access_signal,
  0.0, 1.0
)
```

### Signal Components

| Signal           | Weight (`w_*`) | Calculation                                                                 |
|------------------|----------------|-----------------------------------------------------------------------------|
| `length_signal`  | 0.15           | `min(char_count / 2000, 1.0)` — longer content scores higher, capped at 2000 chars |
| `topic_signal`   | 0.25           | `min(topic_count / 5, 1.0)` — more topic tags indicate higher cross-referenceability |
| `link_signal`    | 0.20           | `min(internal_link_count / 3, 1.0)` — entities referencing other entities score higher |
| `recency_signal` | 0.25           | `max(1.0 - (days_since_modified / 90), 0.0)` — decays linearly over 90 days         |
| `access_signal`  | 0.15           | `min(access_count / 10, 1.0)` — frequently accessed entities score higher            |
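
The formula and signal table translate directly into code. The sketch below is a transcription of the documented weights and calculations, not the source of `compute_importance_score`; the function name `importance_sketch` and its parameter names are assumptions.

```python
WEIGHTS = {"length": 0.15, "topics": 0.25, "links": 0.20,
           "recency": 0.25, "access": 0.15}


def importance_sketch(char_count, topic_count, internal_link_count,
                      days_since_modified, access_count):
    """Weighted sum of the five signals, clamped to [0.0, 1.0]."""
    signals = {
        "length": min(char_count / 2000, 1.0),
        "topics": min(topic_count / 5, 1.0),
        "links": min(internal_link_count / 3, 1.0),
        "recency": max(1.0 - days_since_modified / 90, 0.0),
        "access": min(access_count / 10, 1.0),
    }
    raw = sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
    return max(0.0, min(raw, 1.0))
```

As a worked case, a 1000-character entity with 2 topics, no internal links, modified 9 days ago and accessed 5 times scores 0.15 × 0.5 + 0.25 × 0.4 + 0.20 × 0 + 0.25 × 0.9 + 0.15 × 0.5 = 0.475. Because the weights sum to 1.0 and each signal is capped at 1.0, the clamp only matters for out-of-range inputs such as a negative `days_since_modified`.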

### How It Is Used

- **save_memory**: Auto-computes importance when `importance` is not provided.
- **consolidate_memory** (Dream Cycle): Re-scores all processed entities using current signal values.
- **search_memory**: Results are ranked by `score * importance` in semantic mode, and by `importance` as a tiebreaker in text mode.
- **delete_memory**: Entities with importance below a configurable threshold (default `0.1`) may be auto-archived during consolidation.
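
The semantic-mode ranking rule can be sketched as a sort on `score * importance`. This is an illustration only: `rank_semantic` is a hypothetical name, and it assumes each result carries an `importance` value alongside its `score`, which the `search_memory` return example does not show. Text-mode tiebreaking is left out.

```python
def rank_semantic(results):
    """Order results by similarity weighted by importance, best first."""
    return sorted(results,
                  key=lambda result: result["score"] * result["importance"],
                  reverse=True)
```

A highly similar but low-importance entity (0.91 × 0.30 = 0.273) therefore ranks below a slightly less similar but more important one (0.80 × 0.72 = 0.576).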

### Overriding

Pass an explicit `importance` value (0.0–1.0) to `save_memory` or `update_memory` to override the automatic score. The override persists until the next Dream Cycle re-scores the entity.
