# Engram: Evolution of a Complete Memory System

**Last updated:** 2026-04-04

This document traces how Engram evolved from a simple session search
tool into a comprehensive memory system for autonomous AI agents.
Each section describes what was lacking, what we studied, what we
built, and why.

---

## Chapter 1: Where Engram Started

Engram began as a **session recovery tool**. After a catastrophic
memory exhaustion event corrupted 49,000+ session files belonging
to an AI agent, Engram was built to recover, index, and search
those files.

**Original capabilities (v0.1.0, March 2026):**
- ripgrep full-text search across markdown files
- SQLite FTS5 metadata index (topics, dates, summaries)
- YAML frontmatter for structured metadata
- Generator-based librarian for batch indexing
- Anti-hallucination guardrail (verifies topic tags against file content)
- Dream Cycle consolidator (merges duplicate files, prunes old ones)

**What it was good at:**
- Fast keyword search (~45ms across 49K files)
- Recovery and ingestion of corrupted session data
- Deduplication of overlapping session transcripts

**What it lacked:**
- No way for agents to save/update/delete memories (read-only)
- No semantic search (couldn't find "edge memory issues" if
  those exact words weren't in the file)
- No knowledge graph (no entity relationships)
- No versioning (updates destroyed previous state)
- No encryption (everything plaintext)
- No biological memory patterns (all memories treated equally)
- No reflection or knowledge generation
- No automated research monitoring

---

## Chapter 2: Security Hardening (2026-04-03)

**Problem identified:** Code review revealed multiple security
vulnerabilities including ripgrep flag injection, timing-unsafe
token comparison, and servers binding to 0.0.0.0.

**What we did:**
- Added `--` separator before ripgrep patterns (prevents flag injection)
- Subprocess timeouts (30s server, 10s guardrail)
- Timing-safe token comparison everywhere (hmac.compare_digest)
- WebSocket gateway + worker API bound to 127.0.0.1
- SQLite busy timeout for concurrent access
- Credential rotation (Matrix secret, API tokens)
- Bearer token authentication on all API endpoints
- CORS middleware with configurable origins
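Two of the fixes above can be sketched in a few lines. This is an illustrative sketch, not Engram's actual code — the function names and the 30s timeout default are assumptions for the example:

```python
import hmac
import subprocess

def check_token(provided: str, expected: str) -> bool:
    # hmac.compare_digest compares in constant time, so a failed
    # match does not leak how many leading characters were right.
    return hmac.compare_digest(provided.encode(), expected.encode())

def safe_ripgrep(pattern: str, root: str) -> str:
    # "--" tells ripgrep to stop parsing flags, so a user-supplied
    # pattern like "--pre=sh" cannot be smuggled in as an option;
    # the timeout bounds runaway searches.
    result = subprocess.run(
        ["rg", "--no-heading", "--", pattern, root],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout
```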

**Principle established:** Security is non-negotiable. Every
subprocess call gets a timeout, every comparison is timing-safe,
every binding is loopback unless explicitly configured otherwise.

---

## Chapter 3: Self-Editing Memory (Phase 1, 2026-04-03)

**Problem:** Agents could search Engram but couldn't write to it.
Memory was a passive archive, not an active workspace. Agents had
no way to save insights, update facts, or clean up stale data.

**What we studied:** Letta/MemGPT's self-editing memory pattern
where agents explicitly manage their own context via tool calls.
The key insight: agents should be active managers of their memory,
not passive consumers.

**What we built (zero new dependencies):**
- `save_memory` — write new entity with topics, importance, summary
- `update_memory` — merge-update existing entity fields
- `delete_memory` — soft-delete to archive with audit trail
- `search_memory` — unified keyword + optional semantic search
- `consolidate_memory` — trigger Dream Cycle on demand
- `get_memory_stats` — vault health metrics
- Importance scoring with time-decay (importance.py)

**Design decision:** MCP tools using FastMCP, not a custom API.
This means any MCP-compatible agent (Claude, AI agent, Gemini CLI)
can manage Engram's memory without custom integration code.
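The soft-delete pattern behind `delete_memory` can be sketched with nothing but the standard library. The folder layout (`.archive/`, `audit.jsonl`) and field names here are hypothetical, chosen only to show the idea that nothing is destroyed outright:

```python
import json
import shutil
import tempfile
import time
from pathlib import Path

def delete_memory(vault: Path, name: str) -> None:
    # Soft-delete: move the entity into an archive folder and append
    # a JSONL audit record, so the deletion is reversible and logged.
    archive = vault / ".archive"
    archive.mkdir(exist_ok=True)
    shutil.move(str(vault / name), str(archive / name))
    with (vault / "audit.jsonl").open("a") as fh:
        fh.write(json.dumps({"op": "delete", "entity": name,
                             "ts": time.time()}) + "\n")

vault = Path(tempfile.mkdtemp())
(vault / "note.md").write_text("stale fact")
delete_memory(vault, "note.md")
```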

---

## Chapter 4: Semantic Search (Phase 2, 2026-04-03)

**Problem:** ripgrep finds exact text matches. But "conversations
about edge memory issues" wouldn't match a file containing
"Orin Nano RAM constraints" even though they're about the same
thing. Keyword search misses conceptual connections.

**What we studied:** The agent memory ecosystem — Mem0, Hindsight,
ChromaDB, sqlite-vec, FastEmbed. The conclusion: sqlite-vec adds
vector search to our existing SQLite database without introducing a
second datastore, and FastEmbed generates embeddings locally without
API keys.

**What we built:**
- sqlite-vec virtual table in existing vault_index.sqlite
- FastEmbed ONNX embeddings (BAAI/bge-small-en-v1.5, 384 dims)
- `semantic_search` via `search_memory(semantic=True)`
- `engram-vectorize` CLI for bulk indexing all 49K entities
- ~75MB vector data, <100ms KNN search, 100% local

**Design decision:** sqlite-vec over ChromaDB because it extends
our existing database rather than adding a parallel one. FastEmbed
over sentence-transformers because ONNX Runtime is lighter than
PyTorch (~150MB vs ~700MB). Both optional via `pip install engram[vector]`.
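The core idea — rank stored embeddings by similarity to a query vector — can be illustrated with the standard library alone. This sketch uses tiny 2-dim vectors and brute-force cosine ranking in Python; in Engram the vectors are 384-dim FastEmbed outputs and the ranking happens inside SQL via sqlite-vec's `vec0` virtual table:

```python
import math
import sqlite3
import struct

def to_blob(vec):
    # Embeddings stored as packed little-endian float32, the same
    # general wire format vector extensions use for BLOB columns.
    return struct.pack(f"<{len(vec)}f", *vec)

def from_blob(blob):
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

def knn(db, query, k):
    # Brute-force cosine ranking; a vec0 table would express this as
    # roughly: SELECT id FROM idx WHERE embedding MATCH ? LIMIT k.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        zero = [0.0] * len(a)
        return dot / (math.dist(a, zero) * math.dist(b, zero))
    rows = db.execute("SELECT id, embedding FROM vectors").fetchall()
    return sorted(rows, key=lambda r: -cos(query, from_blob(r[1])))[:k]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vectors (id TEXT, embedding BLOB)")
db.execute("INSERT INTO vectors VALUES (?, ?)",
           ("orin-nano-ram", to_blob([1.0, 0.1])))
db.execute("INSERT INTO vectors VALUES (?, ?)",
           ("dns-outage", to_blob([0.0, 1.0])))
nearest = knn(db, [0.9, 0.2], k=1)
```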

---

## Chapter 5: Knowledge Graph — First Attempt and Rebuild (Phase 3 + Phase 0, 2026-04-03/04)

**Problem:** Engram had no concept of entity relationships. "Alice
works on Engram" and "Engram runs on the edge" were just text —
no structured understanding that Alice → works_on → Engram →
deployed_on → edge.

**First attempt:** Integrated Graphiti (Zep's temporal knowledge
graph) with Kuzu embedded database. Worked, but pulled in 15+
transitive dependencies including posthog (telemetry that phones
home to an external server).

**Supply chain concern:** A recent npm supply chain attack made us
reconsider. Graphiti-core could inject arbitrary code via PyPI if
compromised. Any of its 15+ transitive deps could too.

**Principle established:** Understand what a library does, then
build it ourselves. No blind dependency adoption. This became the
foundational rule for all future development.

**What Graphiti actually did (that we needed):**
1. Entity extraction — LLM prompt asking "extract entities and
   relationships from this text, return JSON"
2. Graph storage — nodes and edges with timestamps
3. Graph search — find entities by name or similarity
4. Temporal awareness — edges have valid_from/valid_to

**What we rebuilt from scratch (~500 lines):**
- `kg_entities` and `kg_edges` tables in existing memory_store.db
  (shared with Hermes holographic memory)
- Direct Anthropic API calls for entity extraction
- SQL queries for graph search and relationship traversal
- No graphiti-core, no kuzu, no neo4j, no posthog

**Dependencies removed:** graphiti-core, kuzu (and all their
transitive deps). Dependencies kept: anthropic SDK (already
approved, already used by Claude Code itself).
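A two-table graph like this fits naturally in plain SQLite. The column names below are illustrative (the real schema may differ); the traversal query shows how "find Alice's current relationships" becomes two joins plus a `valid_to IS NULL` filter for temporal awareness:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE kg_entities (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE kg_edges (
    src INTEGER REFERENCES kg_entities(id),
    relation TEXT,
    dst INTEGER REFERENCES kg_entities(id),
    valid_from TEXT,
    valid_to TEXT            -- NULL means the edge is still valid
);
""")
for name in ("Alice", "Engram", "edge"):
    db.execute("INSERT INTO kg_entities (name) VALUES (?)", (name,))
db.execute("""INSERT INTO kg_edges
    SELECT a.id, 'works_on', b.id, '2026-04-03', NULL
    FROM kg_entities a, kg_entities b
    WHERE a.name = 'Alice' AND b.name = 'Engram'""")

# One-hop traversal: what does Alice currently relate to?
rows = db.execute("""
    SELECT e.relation, d.name
    FROM kg_edges e
    JOIN kg_entities s ON s.id = e.src
    JOIN kg_entities d ON d.id = e.dst
    WHERE s.name = ? AND e.valid_to IS NULL
""", ("Alice",)).fetchall()
```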

---

## Chapter 6: Time-Travel Memory (2026-04-04)

**Problem:** When a memory was updated, the previous version was
destroyed. When consolidated, fragments were archived but the
pre-merge state was gone. There was no way to answer "what did I
know about X last Tuesday?"

**What we studied:** Memvid's Smart Frame architecture — append-only
immutable frames inspired by video encoding. Every state is a
snapshot that can be rewound. The key insight: memory should be an
append-only timeline, not a mutable document.

**What we built:**
- `entity_versions` table in vault_index.sqlite
- Version snapshots stored in `entities/.versions/{id}.md`
- Every `save_memory` creates an initial version
- Every `update_memory` snapshots the OLD content before overwriting
- Every `delete_memory` captures the final state
- `memory_at(path, timestamp)` — time-travel query
- `memory_history(path)` — full version timeline
- `prune_versions(path, max=50)` — bounded storage

**Design decision:** Snapshots stored as files (not SQLite BLOBs)
so they're independently readable, greppable, and recoverable.
Metadata in SQLite for fast timestamp queries. Content hashes for
integrity verification.
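The time-travel query reduces to "latest version at or before the timestamp." A minimal sketch, with the snapshot text inlined where Engram would store a pointer to a file under `entities/.versions/`:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE entity_versions (
    path TEXT, version INTEGER, created_at TEXT, snapshot TEXT)""")
db.executemany("INSERT INTO entity_versions VALUES (?, ?, ?, ?)", [
    ("alice.md", 1, "2026-04-01T09:00:00", "v1: draft"),
    ("alice.md", 2, "2026-04-03T12:00:00", "v2: updated"),
])

def memory_at(db, path, timestamp):
    # Latest snapshot created at or before the given timestamp.
    # ISO-8601 strings compare correctly as plain text.
    row = db.execute("""
        SELECT snapshot FROM entity_versions
        WHERE path = ? AND created_at <= ?
        ORDER BY created_at DESC LIMIT 1
    """, (path, timestamp)).fetchone()
    return row[0] if row else None
```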

---

## Chapter 7: Encryption at Rest (2026-04-04)

**Problem:** All memories stored as plaintext markdown. If the
machine is compromised, every memory is immediately readable —
session transcripts, project details, credentials references, etc.

**What we studied:** Memvid's built-in encryption and the Python
cryptography library (Fernet — AES-128-CBC + HMAC-SHA256).

**What we built:**
- Optional Fernet encryption for version snapshots
- `ENGRAM_ENCRYPTION_KEY` env var or key file activation
- `engram-keygen` CLI generates key with mode 600
- Encrypted snapshots stored as `.enc` (binary, unreadable)
- Transparent decrypt on read via `get_version()`
- `encrypt_file` / `decrypt_file` for SQLite databases at rest

**Design decision:** Encryption is opt-in via feature flag. Without
a key, Engram works exactly as before. The `cryptography` package
was accepted despite being a third-party dep because it's the most
audited crypto library in Python (50K+ stars, used by pip itself).
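The Fernet pattern is small enough to show whole. This is a sketch of the general approach, not Engram's code — the `write_keyfile` helper is hypothetical, mirroring what an `engram-keygen` style CLI would do:

```python
import os
from cryptography.fernet import Fernet  # pip install cryptography

def write_keyfile(path: str) -> bytes:
    # Generate a Fernet key and store it readable only by its owner
    # (mode 600), so other local users cannot read the key.
    key = Fernet.generate_key()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(key)
    return key

# Transparent encrypt/decrypt round-trip for a snapshot.
f = Fernet(Fernet.generate_key())
token = f.encrypt(b"snapshot: Orin Nano RAM constraints")
plain = f.decrypt(token)
```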

---

## Chapter 8: Biological Memory Patterns (Planned, 2026-04-04)

**Problem:** All memories decay at the same rate regardless of type.
A quick conversation and a hard-won insight both fade equally. There's
no mechanism for the agent to generate NEW knowledge from existing
memories — consolidation only compresses, never creates.

**What we studied:** CLUDE.io's five-type memory architecture with
per-type decay rates and automated reflection cycles. The key
insights: (1) different memory types should persist for different
durations, (2) consolidation should generate knowledge, not just
compress it.

**What we plan to build:**
- Typed memory decay: episodic (10d half-life), semantic (60d),
  procedural (45d), project (90d), self_model (365d)
- Reflection phase in Dream Cycle — LLM generates patterns,
  insights, and questions from recent memories
- Self-model auto-generation — persistent description of the
  agent's understanding of itself and the user
- Conflict resolution — detect contradictory memories, resolve
  by keeping the stronger/more recent one

**Status:** Plan written, 8 atomic tasks defined, not yet built.
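Since this is still a plan, the following is purely a sketch of what typed decay could look like: a half-life table (values from the plan above) and an exponential retention curve per type:

```python
# Half-lives from the plan; a memory's strength halves once per
# half-life, so episodic memories fade ~36x faster than self_model.
HALF_LIFE_DAYS = {
    "episodic": 10, "procedural": 45, "semantic": 60,
    "project": 90, "self_model": 365,
}

def retention(memory_type: str, age_days: float) -> float:
    # Fraction of original strength remaining after age_days.
    return 0.5 ** (age_days / HALF_LIFE_DAYS[memory_type])
```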

---

## Chapter 9: Research Monitoring (Forge, 2026-04-04)

**Problem:** The AI landscape moves fast. Papers like Google's
TurboQuant (3-bit KV cache, 6x memory reduction) can be directly
applicable to our system, but without active monitoring we'd miss them.

**What we built (separate project at ~/forge):**
- Sentinel agent: scans arXiv, HuggingFace, RSS feeds, X/Twitter
  every 4 hours for new research across 15 topics
- Scout Agent: deep research using Scout MCP (citation verification,
  novelty checking, structured reports)
- Architect: Claude API planning with atomic task decomposition
- Builder: Ollama local execution for code changes
- Auditor: security + quality review before merge
- Chronicler: 12-hour digest summaries
- Morning briefing dashboard at localhost:8080
- File-based queue (JSON, no database, debuggable with `ls`)
- Supply chain security gate (approved_deps.yaml)
- Cron scheduling: Sentinel 4h, Chronicler 12h, Runner 30min
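A file-based queue of this kind needs very little code. This sketch assumes one JSON file per task and a `done/` folder for completed work — the layout and field names are illustrative, not Forge's actual format:

```python
import json
import tempfile
import uuid
from pathlib import Path

def enqueue(queue_dir: Path, task: dict) -> Path:
    # One JSON file per task: `ls queue/` shows the backlog, and a
    # task is claimed by renaming its file out of the queue.
    queue_dir.mkdir(parents=True, exist_ok=True)
    path = queue_dir / f"{uuid.uuid4().hex}.json"
    path.write_text(json.dumps(task))
    return path

def dequeue(queue_dir: Path, done_dir: Path):
    # Take the oldest pending task (sorted filename order) and move
    # its file to done/ so the queue state is always visible on disk.
    done_dir.mkdir(parents=True, exist_ok=True)
    pending = sorted(queue_dir.glob("*.json"))
    if not pending:
        return None
    task = json.loads(pending[0].read_text())
    pending[0].rename(done_dir / pending[0].name)
    return task

root = Path(tempfile.mkdtemp())
enqueue(root / "queue", {"agent": "builder", "paper": "TurboQuant"})
task = dequeue(root / "queue", root / "done")
```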

**Design decision:** Forge is a separate project, not inside Engram.
Engram = memory, Scout = research tools, Forge = the pipeline that
orchestrates both. They communicate via MCP and file I/O, not
shared code. If any one breaks, the others still work.

---

## Architecture Timeline

```
March 2026 (v0.1.0):
  ripgrep + SQLite FTS5 + YAML frontmatter
  = Session search tool

April 3, 2026 (v0.2.0):
  + Security hardening
  + Self-editing memory (6 MCP tools)
  + Semantic search (sqlite-vec + FastEmbed)
  + Knowledge graph (custom SQLite, no Graphiti)
  = Active memory system with three search modes

April 4, 2026 (v0.3.0):
  + Time-travel (version snapshots)
  + Encryption at rest (Fernet)
  + Forge research pipeline (6 agents)
  = Complete memory system with history, security,
    and autonomous research monitoring

Planned (v0.4.0):
  + Typed memory decay
  + Reflection cycle (knowledge generation)
  + Self-model
  + Conflict resolution
  = Biologically-inspired memory system
```

---

## Guiding Principles (Established Through This Evolution)

1. **Own every line.** Understand what a library does, build it
   ourselves. No blind dependency adoption. (Chapter 5)

2. **Security is non-negotiable.** Timeouts, timing-safe comparison,
   loopback binding, input validation. Always. (Chapter 2)

3. **Optional layers, graceful degradation.** Vector search, graph,
   encryption are all opt-in. Core works without them. (Chapters 4-7)

4. **Worktrees for safety.** Never modify the live system directly.
   All feature work in isolated git worktrees. (Chapters 5-8)

5. **Files over databases for portability.** Markdown + YAML
   frontmatter is universal. SQLite for indexes, not primary
   storage. (All chapters)

6. **MCP for interop.** Standard protocol, not proprietary SDK.
   Any agent can use Engram without custom integration. (Chapter 3)

7. **Test everything.** 148+ tests. Every feature has tests. Every
   worktree passes the full suite before merge. (All chapters)

8. **Separate concerns.** Engram = memory. Scout = research.
   Forge = orchestration. Three small projects > one monolith.
   (Chapter 9)
