# Engram Data Separation Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Move all runtime data out of `~/engram/` (code repo) into `~/.local/share/engram/` (data directory) so code and data are fully separated.

**Architecture:** Update the central `config.py` to define `ENGRAM_DATA` pointing at `~/.local/share/engram/`, update the modules that reference `ENGRAM_ROOT` by name, fix the 7 files with hardcoded paths, write a one-time migration script, and update the docs.

**Tech Stack:** Python (pathlib), Bash (migration script)

**Spec:** `docs/superpowers/specs/2026-04-07-engram-data-separation-design.md`

---

### File Map

| Action | File | Responsibility |
|--------|------|----------------|
| Modify | `src/engram/config.py` | Central path constants — rename ENGRAM_ROOT → ENGRAM_DATA, update path |
| Modify | `src/engram/db_models.py` | Fix hardcoded workers.sqlite path |
| Modify | `src/engram/run_worker_api.py` | Fix hardcoded workers.sqlite path |
| Modify | `scripts/engram_watchdog.py` | Fix local ENGRAM_ROOT definition |
| Modify | `scripts/interview.py` | Fix local ENGRAM_ROOT — split code vs data paths |
| Modify | `scripts/hyperloop.py` | Fix local ENGRAM_ROOT definition |
| Modify | `scripts/find_dupes.py` | Fix hardcoded absolute paths |
| Modify | `scripts/persistent_ingest.py` | Fix hardcoded source default |
| Modify | `CONFIGURATION.md` | Update path documentation |
| Modify | `ARCHITECTURE.md` | Update path documentation |
| Create | `scripts/migrate-data.sh` | One-time migration script |

---

### Task 1: Update config.py — Central Path Constants

**Files:**
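- Modify: `src/engram/config.py:12-17,71,110,124,132-135`
- Modify: `src/engram/encryption.py`, `src/engram/consolidator.py`, `src/engram/graph.py`, `src/engram/server.py`, `src/engram/metrics.py`, `src/engram/lint.py` (Step 3)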

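This is the highest-impact change. 12+ modules import from here; those that use the derived constants pick up the new paths automatically, while modules that reference `ENGRAM_ROOT` by name will break until Step 3 is done.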

- [ ] **Step 1: Rename ENGRAM_ROOT to ENGRAM_DATA and update the base path**

In `src/engram/config.py`, replace lines 12-17:

```python
# Before:
ENGRAM_ROOT: Path = Path.home() / "engram"
ENTITIES_DIR: Path = ENGRAM_ROOT / "entities"
TELEMETRY_DIR: Path = ENGRAM_ROOT / "telemetry"
AGENT_MEMORY_DIR: Path = ENGRAM_ROOT / "agent_memory"
LOG_DIR: Path = ENGRAM_ROOT / "logs"
INDEX_PATH: Path = ENGRAM_ROOT / "vault_index.sqlite"

# After:
ENGRAM_DATA: Path = Path.home() / ".local" / "share" / "engram"
ENTITIES_DIR: Path = ENGRAM_DATA / "entities"
TELEMETRY_DIR: Path = ENGRAM_DATA / "telemetry"
AGENT_MEMORY_DIR: Path = ENGRAM_DATA / "agent_memory"
LOG_DIR: Path = ENGRAM_DATA / "logs"
INDEX_PATH: Path = ENGRAM_DATA / "vault_index.sqlite"
```
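A quick sanity check for this step is to confirm that every derived constant still resolves under the new base directory. The sketch below is illustrative only: it redefines the constants locally instead of importing the real `engram.config`, the helper name `paths_outside_base` is hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Local stand-ins for the constants in src/engram/config.py (illustrative only).
ENGRAM_DATA: Path = Path.home() / ".local" / "share" / "engram"
DERIVED = {
    "ENTITIES_DIR": ENGRAM_DATA / "entities",
    "TELEMETRY_DIR": ENGRAM_DATA / "telemetry",
    "AGENT_MEMORY_DIR": ENGRAM_DATA / "agent_memory",
    "LOG_DIR": ENGRAM_DATA / "logs",
    "INDEX_PATH": ENGRAM_DATA / "vault_index.sqlite",
}


def paths_outside_base(base: Path, paths: dict) -> list:
    """Return the names of constants that do not live under `base`."""
    return sorted(name for name, p in paths.items() if not p.is_relative_to(base))


# An empty list means every constant hangs off the data directory.
print(paths_outside_base(ENGRAM_DATA, DERIVED))
```

Any name that appears in the result would indicate a constant that still points at the code repo.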

- [ ] **Step 2: Update all remaining ENGRAM_ROOT references in config.py**

Replace lines 71 and 110 (comments):
```python
# Before:
# Pipeline state files (use ENGRAM_ROOT, not CWD)

# After:
# Pipeline state files (use ENGRAM_DATA, not CWD)
```

Replace line 124:
```python
# Before:
ENCRYPTION_KEY_DIR: Path = ENGRAM_ROOT / ".encryption"

# After:
ENCRYPTION_KEY_DIR: Path = ENGRAM_DATA / ".encryption"
```

Replace lines 132-135:
```python
# Before:
LIBRARIAN_STATE: Path = ENGRAM_ROOT / "librarian_state.json"
SEGREGATION_STATE: Path = ENGRAM_ROOT / "segregation_state.json"
PRUNED_DICTIONARY: Path = ENGRAM_ROOT / "PRUNED_DICTIONARY.md"
MASTER_PROFILE: Path = ENGRAM_ROOT / "MASTER_PROFILE.md"

# After:
LIBRARIAN_STATE: Path = ENGRAM_DATA / "librarian_state.json"
SEGREGATION_STATE: Path = ENGRAM_DATA / "segregation_state.json"
PRUNED_DICTIONARY: Path = ENGRAM_DATA / "PRUNED_DICTIONARY.md"
MASTER_PROFILE: Path = ENGRAM_DATA / "MASTER_PROFILE.md"
```

- [ ] **Step 3: Fix all imports across src/engram/ that reference ENGRAM_ROOT**

These files reference `ENGRAM_ROOT` by name (via import or as `config.ENGRAM_ROOT`) and need each reference updated to `ENGRAM_DATA`:

`src/engram/encryption.py` line 14:
```python
# Before:
from engram.config import ENGRAM_ROOT
# After:
from engram.config import ENGRAM_DATA
```
And line 18:
```python
# Before:
KEY_DIR = ENGRAM_ROOT / ".encryption"
# After:
KEY_DIR = ENGRAM_DATA / ".encryption"
```

`src/engram/consolidator.py` line 21:
```python
# Before:
    ENGRAM_ROOT,
# After:
    ENGRAM_DATA,
```
And line 493:
```python
# Before:
        self_model_path = ENGRAM_ROOT / "SELF_MODEL.md"
# After:
        self_model_path = ENGRAM_DATA / "SELF_MODEL.md"
```

`src/engram/graph.py` line 20:
```python
# Before:
    ENGRAM_ROOT,
# After:
    ENGRAM_DATA,
```
And line 32:
```python
# Before:
DEFAULT_DB_PATH = ENGRAM_ROOT / "memory_store.db"
# After:
DEFAULT_DB_PATH = ENGRAM_DATA / "memory_store.db"
```

`src/engram/server.py` line 23:
```python
# Before:
    ENGRAM_ROOT,
# After:
    ENGRAM_DATA,
```
And line 213:
```python
# Before:
        usage = shutil.disk_usage(str(ENGRAM_ROOT))
# After:
        usage = shutil.disk_usage(str(ENGRAM_DATA))
```

`src/engram/metrics.py` line 14:
```python
# Before:
from engram.config import ENGRAM_ROOT, ENTITIES_DIR, INDEX_PATH, LOG_DIR
# After:
from engram.config import ENGRAM_DATA, ENTITIES_DIR, INDEX_PATH, LOG_DIR
```
And line 152:
```python
# Before:
        usage = shutil.disk_usage(str(ENGRAM_ROOT))
# After:
        usage = shutil.disk_usage(str(ENGRAM_DATA))
```
And line 212:
```python
# Before:
        from engram.config import ENGRAM_ROOT as _root
# After:
        from engram.config import ENGRAM_DATA as _root
```

`src/engram/lint.py` lines 391 and 467:
```python
# Before:
    root = engram_root or Path(config.ENGRAM_ROOT)
# After:
    root = engram_root or Path(config.ENGRAM_DATA)
```

- [ ] **Step 4: Verify no ENGRAM_ROOT references remain in src/engram/**

Run: `grep -r "ENGRAM_ROOT" ~/engram/src/engram/`
Expected: No output (zero matches)
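Where a scripted check is preferable to `grep`, the same verification could be sketched in Python. This is a minimal sketch, not part of the plan: `find_references` is a hypothetical helper, and unlike `grep -r` it only looks at `*.py` files.

```python
from pathlib import Path


def find_references(root: Path, needle: str = "ENGRAM_ROOT") -> list:
    """Return (path, line_number, line) for every .py line under `root` containing `needle`."""
    hits = []
    for py_file in sorted(root.rglob("*.py")):
        text = py_file.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((py_file, lineno, line.strip()))
    return hits
```

Calling `find_references(Path.home() / "engram" / "src" / "engram")` should return an empty list once Steps 1-3 are complete.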

- [ ] **Step 5: Commit**

```bash
cd ~/engram
git add src/engram/config.py src/engram/encryption.py src/engram/consolidator.py \
  src/engram/graph.py src/engram/server.py src/engram/metrics.py src/engram/lint.py
git commit -m "refactor: rename ENGRAM_ROOT to ENGRAM_DATA, point at ~/.local/share/engram"
```

---

### Task 2: Fix Hardcoded workers.sqlite Paths

**Files:**
- Modify: `src/engram/db_models.py:50`
- Modify: `src/engram/run_worker_api.py:44`

- [ ] **Step 1: Fix db_models.py**

In `src/engram/db_models.py`, add the import at the top and fix line 50 (line numbers refer to the file before the import is added):

Add to imports:
```python
from engram.config import ENGRAM_DATA
```

Replace line 50:
```python
# Before:
            db_path = str(Path.home() / "engram" / "workers.sqlite")
# After:
            db_path = str(ENGRAM_DATA / "workers.sqlite")
```

- [ ] **Step 2: Fix run_worker_api.py**

In `src/engram/run_worker_api.py`, add the import at the top and fix line 44 (line numbers refer to the file before the import is added):

Add to imports:
```python
from engram.config import ENGRAM_DATA
```

Replace line 44:
```python
# Before:
        db_path = Path.home() / "engram" / "workers.sqlite"
# After:
        db_path = ENGRAM_DATA / "workers.sqlite"
```

- [ ] **Step 3: Verify no hardcoded "engram" / "workers" paths remain**

Run: `grep -r "workers.sqlite" ~/engram/src/`
Expected: Only the two files above, now using ENGRAM_DATA
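The indentation of the replaced lines suggests they sit inside a fallback branch. Assuming that is the case (the surrounding code is not shown in this plan), the resulting pattern might look like the sketch below. `resolve_db_path` is a hypothetical name, and `ENGRAM_DATA` is redefined locally so the example runs on its own.

```python
from pathlib import Path
from typing import Optional

# Stand-in for `from engram.config import ENGRAM_DATA` (illustrative only).
ENGRAM_DATA: Path = Path.home() / ".local" / "share" / "engram"


def resolve_db_path(db_path: Optional[str] = None) -> str:
    """Use the caller's path if given, otherwise default to the data directory."""
    if db_path is None:
        db_path = str(ENGRAM_DATA / "workers.sqlite")
    return db_path
```

The point of the change is that the default is derived from the single `ENGRAM_DATA` constant, so a later move of the data directory needs no edits in these modules.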

- [ ] **Step 4: Commit**

```bash
cd ~/engram
git add src/engram/db_models.py src/engram/run_worker_api.py
git commit -m "fix: use ENGRAM_DATA for workers.sqlite paths"
```

---

### Task 3: Fix Scripts With Local ENGRAM_ROOT Definitions

**Files:**
- Modify: `scripts/engram_watchdog.py:44-50`
- Modify: `scripts/interview.py:63-69`
- Modify: `scripts/hyperloop.py:44-45`

These scripts define their own `ENGRAM_ROOT` instead of importing from config. Update them to use the new data path.

- [ ] **Step 1: Fix engram_watchdog.py**

Replace lines 44-50:
```python
# Before:
ENGRAM_ROOT = Path.home() / "engram"
ENTITIES_DIR = ENGRAM_ROOT / "entities"
LOG_DIR = ENGRAM_ROOT / "logs"
WATCHDOG_LOG = LOG_DIR / "watchdog.jsonl"
WATCHDOG_STATE = LOG_DIR / "watchdog_state.json"
WATCHER_STATE = LOG_DIR / "watcher_state.json"
INDEX_PATH = ENGRAM_ROOT / "vault_index.sqlite"

# After:
ENGRAM_DATA = Path.home() / ".local" / "share" / "engram"
ENTITIES_DIR = ENGRAM_DATA / "entities"
LOG_DIR = ENGRAM_DATA / "logs"
WATCHDOG_LOG = LOG_DIR / "watchdog.jsonl"
WATCHDOG_STATE = LOG_DIR / "watchdog_state.json"
WATCHER_STATE = LOG_DIR / "watcher_state.json"
INDEX_PATH = ENGRAM_DATA / "vault_index.sqlite"
```

Also update any other references to `ENGRAM_ROOT` in the file (search and replace `ENGRAM_ROOT` → `ENGRAM_DATA`).

- [ ] **Step 2: Fix interview.py**

This script has both code paths and data paths. Split them:

Within lines 63-69, replace these definitions:
```python
# Before:
ENGRAM_ROOT = Path.home() / "engram"
MANIFEST_PATH = ENGRAM_ROOT / "credentials.json"
DISCOVER_SCRIPT = ENGRAM_ROOT / "scripts" / "discover_nodes.py"

# After:
ENGRAM_CODE = Path.home() / "engram"
ENGRAM_DATA = Path.home() / ".local" / "share" / "engram"
MANIFEST_PATH = ENGRAM_DATA / "credentials.json"
DISCOVER_SCRIPT = ENGRAM_CODE / "scripts" / "discover_nodes.py"
```

Replace line 69 (.env path):
```python
# Before:
ENV_PATH = ENGRAM_ROOT / ".env"
# After:
ENV_PATH = ENGRAM_CODE / ".env"
```

Then search the rest of the file for any `ENGRAM_ROOT` references and replace with `ENGRAM_DATA` (for data paths) or `ENGRAM_CODE` (for code paths). Use judgment: if it references entities, logs, databases → `ENGRAM_DATA`. If it references scripts, src → `ENGRAM_CODE`.
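The data-versus-code rule above can be written down as a small classifier. This is a sketch under assumptions: the name sets are illustrative and should be adjusted to the real repo layout, and `base_for` is a hypothetical helper, not something that exists in `interview.py`.

```python
from pathlib import PurePosixPath

# Top-level names treated as runtime data; anything else is assumed to be code.
DATA_NAMES = {"entities", "logs", "telemetry", "agent_memory", "hyperloop", "credentials.json"}
DATA_SUFFIXES = {".sqlite", ".db"}


def base_for(relative_path: str) -> str:
    """Return which constant a path that used to hang off ENGRAM_ROOT should use."""
    first = PurePosixPath(relative_path).parts[0]
    if first in DATA_NAMES or PurePosixPath(first).suffix in DATA_SUFFIXES:
        return "ENGRAM_DATA"
    return "ENGRAM_CODE"
```

For example, `base_for("entities/note.md")` gives `"ENGRAM_DATA"`, while `base_for("scripts/discover_nodes.py")` and `base_for(".env")` give `"ENGRAM_CODE"`, matching the split shown in this step.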

- [ ] **Step 3: Fix hyperloop.py**

Replace lines 44-45:
```python
# Before:
ENGRAM_ROOT = Path.home() / "engram"
HYPERLOOP_DIR = ENGRAM_ROOT / "hyperloop"

# After:
ENGRAM_DATA = Path.home() / ".local" / "share" / "engram"
HYPERLOOP_DIR = ENGRAM_DATA / "hyperloop"
```

Also update any other `ENGRAM_ROOT` references in the file.

- [ ] **Step 4: Verify no ENGRAM_ROOT references remain in scripts/**

Run: `grep -rn --include="*.py" "ENGRAM_ROOT" ~/engram/scripts/`
Expected: Zero matches. The search is limited to Python files because some shell scripts use `$ENGRAM_ROOT` for relative paths derived from the script location; those reference the code repo, resolve correctly, and should be left alone.

- [ ] **Step 5: Commit**

```bash
cd ~/engram
git add scripts/engram_watchdog.py scripts/interview.py scripts/hyperloop.py
git commit -m "fix: update scripts to use ~/.local/share/engram for data paths"
```

---

### Task 4: Fix Hardcoded Absolute Paths

**Files:**
- Modify: `scripts/find_dupes.py:14,16,122`
- Modify: `scripts/persistent_ingest.py:46`

- [ ] **Step 1: Fix find_dupes.py**

Replace line 14:
```python
# Before:
SCAN_ROOT = Path("/home/geodesix/engram/entities")
# After:
SCAN_ROOT = Path.home() / ".local" / "share" / "engram" / "entities"
```

Replace line 16:
```python
# Before:
REPORT_PATH = Path("/home/geodesix/engram/artifacts/superpowers/dedup_report.json")
# After:
REPORT_PATH = Path.home() / "engram" / "artifacts" / "superpowers" / "dedup_report.json"
```
(Note: artifacts is code/repo output, stays in `~/engram/`)

Replace line 122:
```python
# Before:
        keep_short = str(g["keep"]).replace("/home/geodesix/engram/entities/", "")
# After:
        keep_short = str(g["keep"]).replace(str(SCAN_ROOT) + "/", "")
```
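As a possible alternative to the string replacement on line 122, `Path.relative_to` strips the prefix only when it really is the leading part of the path and does not depend on the `/` separator. The sketch below is an option, not what the plan prescribes; `short_name` is a hypothetical helper.

```python
from pathlib import Path

SCAN_ROOT = Path.home() / ".local" / "share" / "engram" / "entities"


def short_name(path, root: Path = SCAN_ROOT) -> str:
    """Return `path` relative to `root`, or the path unchanged if it is not under `root`."""
    p = Path(path)
    try:
        return str(p.relative_to(root))
    except ValueError:
        # relative_to raises ValueError when `p` does not start with `root`.
        return str(p)
```

With this helper, line 122 could read `keep_short = short_name(g["keep"])`.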

- [ ] **Step 2: Fix persistent_ingest.py**

Replace line 46 (add `from pathlib import Path` to the imports if the script does not already have it):
```python
# Before:
    parser.add_argument("--source", default="/home/geodesix/entities", help="Source dir")
# After:
    parser.add_argument("--source", default=str(Path.home() / ".local" / "share" / "engram" / "entities"), help="Source dir")
```
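If the long inline default feels unwieldy, one option is to lift it into a named constant and let `argparse` handle the conversion. This is a sketch only: `build_parser` and `DEFAULT_SOURCE` are hypothetical names, and it changes `args.source` from a string to a `Path`, which the rest of `persistent_ingest.py` may or may not expect.

```python
import argparse
from pathlib import Path

DEFAULT_SOURCE = Path.home() / ".local" / "share" / "engram" / "entities"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative parser, not the real persistent_ingest CLI")
    # type=Path converts a user-supplied string; the Path default is used as-is.
    parser.add_argument("--source", type=Path, default=DEFAULT_SOURCE, help="Source dir")
    return parser
```

Running with no arguments yields `DEFAULT_SOURCE`; passing `--source /some/dir` overrides it.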

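- [ ] **Step 3: Verify no hardcoded /home/geodesix paths remain**

Run: `grep -rn "/home/geodesix" ~/engram/scripts/`
Expected: Zero matches. The pattern is deliberately broader than `/home/geodesix/engram` so that it also catches the old `/home/geodesix/entities` default in `persistent_ingest.py`.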

- [ ] **Step 4: Commit**

```bash
cd ~/engram
git add scripts/find_dupes.py scripts/persistent_ingest.py
git commit -m "fix: remove hardcoded absolute paths from scripts"
```

---

### Task 5: Write Migration Script

**Files:**
- Create: `scripts/migrate-data.sh`

- [ ] **Step 1: Create the migration script**

Create `scripts/migrate-data.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Engram data migration: ~/engram/ → ~/.local/share/engram/
# Run once manually. Moves data files, leaves code in place.

SRC="$HOME/engram"
DST="$HOME/.local/share/engram"

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

echo "════════════════════════════════════════════════════════════"
echo "  Engram Data Migration"
echo "  From: $SRC"
echo "  To:   $DST"
echo "════════════════════════════════════════════════════════════"
echo ""

# Pre-flight: warn if the engram service is still running (this script does not stop it)
if systemctl is-active --quiet engram-server 2>/dev/null; then
    echo -e "${YELLOW}WARNING: engram-server is running. Stop it first:${NC}"
    echo "  sudo systemctl stop engram-server"
    echo ""
    read -p "Continue anyway? (y/N) " -r
    [[ $REPLY =~ ^[Yy]$ ]] || exit 1
fi

# Create destination
mkdir -p "$DST"

# Items to move (directories)
DIRS=(entities logs telemetry synapse_data agent_memory .encryption hyperloop engram.db)

# Items to move (files)
FILES=(vault_index.sqlite memory_store.db workers.sqlite
       librarian_state.json segregation_state.json
       PRUNED_DICTIONARY.md MASTER_PROFILE.md SELF_MODEL.md
       credentials.json)

moved=0
skipped=0
failed=0

move_item() {
    local name="$1"
    local src_path="$SRC/$name"
    local dst_path="$DST/$name"

    # Counters use var=$((var + 1)): ((var++)) returns non-zero when var is 0,
    # which would abort the script under `set -e`.
    if [ ! -e "$src_path" ]; then
        echo -e "  ${YELLOW}SKIP${NC} $name (not found in source)"
        skipped=$((skipped + 1))
        return
    fi

    if [ -e "$dst_path" ]; then
        echo -e "  ${YELLOW}SKIP${NC} $name (already exists at destination)"
        skipped=$((skipped + 1))
        return
    fi

    # Test mv directly so a failure is counted instead of aborting under `set -e`.
    if mv "$src_path" "$dst_path"; then
        echo -e "  ${GREEN}MOVED${NC} $name"
        moved=$((moved + 1))
    else
        echo -e "  ${RED}FAILED${NC} $name"
        failed=$((failed + 1))
    fi
}

echo "Moving directories..."
for d in "${DIRS[@]}"; do
    move_item "$d"
done

echo ""
echo "Moving files..."
for f in "${FILES[@]}"; do
    move_item "$f"
done

echo ""
echo "════════════════════════════════════════════════════════════"
echo "  Results: $moved moved, $skipped skipped, $failed failed"
echo "════════════════════════════════════════════════════════════"

# Verify
echo ""
echo "Destination contents:"
ls -la "$DST/" 2>/dev/null || echo "(empty)"

echo ""
if [ -d "$DST/entities" ]; then
    entity_count=$(find "$DST/entities" -name "*.md" 2>/dev/null | wc -l)
    echo "Entities: $entity_count markdown files"
fi

if [ -f "$DST/vault_index.sqlite" ]; then
    idx_size=$(du -h "$DST/vault_index.sqlite" | cut -f1)
    echo "Index: $idx_size"
fi

echo ""
echo "Remaining data in source repo:"
for item in "${DIRS[@]}" "${FILES[@]}"; do
    [ -e "$SRC/$item" ] && echo "  ⚠ $item still in $SRC"
done

if [ $failed -eq 0 ]; then
    echo ""
    echo -e "${GREEN}Migration complete.${NC} Restart engram-server when ready:"
    echo "  sudo systemctl start engram-server"
else
    echo ""
    echo -e "${RED}Some items failed to move. Check errors above.${NC}"
    exit 1
fi
```
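Before running the script for real, it can help to preview what it would do. The sketch below mirrors the decision order of `move_item` (missing source first, then existing destination) without touching the disk. It is a hypothetical helper written for illustration, not part of the migration script.

```python
from pathlib import Path


def plan_migration(src: Path, dst: Path, names: list) -> dict:
    """Map each item name to 'move', 'skip-missing', or 'skip-exists' without touching disk."""
    plan = {}
    for name in names:
        if not (src / name).exists():
            plan[name] = "skip-missing"
        elif (dst / name).exists():
            plan[name] = "skip-exists"
        else:
            plan[name] = "move"
    return plan
```

For instance, `plan_migration(Path.home() / "engram", Path.home() / ".local" / "share" / "engram", ["entities", "logs", "vault_index.sqlite"])` reports which of those three items the script would move.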

- [ ] **Step 2: Make executable**

Run: `chmod +x ~/engram/scripts/migrate-data.sh`

- [ ] **Step 3: Commit**

```bash
cd ~/engram
git add scripts/migrate-data.sh
git commit -m "feat: add one-time data migration script"
```

---

### Task 6: Update Documentation

**Files:**
- Modify: `CONFIGURATION.md`
- Modify: `ARCHITECTURE.md`

- [ ] **Step 1: Update CONFIGURATION.md**

Find the path reference table (around line 63) and update:

```markdown
| Variable | Default | Description |
|----------|---------|-------------|
| ENGRAM_DATA | ~/.local/share/engram/ | Runtime data directory (entities, databases, logs) |
```

Add a new section explaining the separation:

```markdown
## Data Directory

Engram separates code from data:

- **Code:** `~/engram/` — the git repository (src, scripts, tests, docs)
- **Data:** `~/.local/share/engram/` — runtime data (entities, databases, logs, telemetry)

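You can safely delete and re-clone `~/engram/` without losing runtime data. Files that stay in the repo directory, such as `.env` and `artifacts/`, are not protected by this separation.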
```

- [ ] **Step 2: Update ARCHITECTURE.md**

Find the ENGRAM_ROOT reference (around line 261) and replace with:

```markdown
| ENGRAM_DATA | ~/.local/share/engram | Runtime data root (entities, databases, logs) |
```

- [ ] **Step 3: Commit**

```bash
cd ~/engram
git add CONFIGURATION.md ARCHITECTURE.md
git commit -m "docs: update path references for data separation"
```

---

### Task 7: Verify Everything Works

- [ ] **Step 1: Run the migration script**

```bash
cd ~/engram
sudo systemctl stop engram-server
bash scripts/migrate-data.sh
```

Expected: All items show MOVED or SKIP (if not present). Zero FAILED.

- [ ] **Step 2: Verify data directory exists and has content**

```bash
ls -la ~/.local/share/engram/
ls ~/.local/share/engram/entities/ | head -5
du -sh ~/.local/share/engram/
```

Expected: Entities, databases, and logs are present at new location.

- [ ] **Step 3: Verify no data remains in repo**

```bash
ls ~/engram/entities/ 2>/dev/null && echo "FAIL: entities still in repo" || echo "OK"
ls ~/engram/vault_index.sqlite 2>/dev/null && echo "FAIL: index still in repo" || echo "OK"
ls ~/engram/logs/ 2>/dev/null && echo "FAIL: logs still in repo" || echo "OK"
```

Expected: All show "OK"

- [ ] **Step 4: Verify no ENGRAM_ROOT references remain anywhere**

```bash
grep -rn --include="*.py" "ENGRAM_ROOT" ~/engram/src/ ~/engram/scripts/
```

Expected: Zero matches. The search is limited to Python files because shell scripts that derive `$ENGRAM_ROOT` from the script location are acceptable.

- [ ] **Step 5: Restart engram-server and verify it works**

```bash
sudo systemctl start engram-server
sleep 2
curl -s http://localhost:8001/health | head -5
```

Expected: Server starts, health endpoint responds.
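A fixed `sleep 2` may be too short if the server starts slowly. As a hedged alternative, the sketch below polls the health endpoint until it answers. It assumes a healthy server returns HTTP 200, which this plan does not state, and `wait_for_health` is a hypothetical helper. Proxies are bypassed because the check targets localhost.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, attempts: int = 10, delay: float = 0.5, timeout: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or the attempts run out."""
    # An empty ProxyHandler mapping disables any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for attempt in range(attempts):
        try:
            with opener.open(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Server not accepting connections yet, or it returned an error status.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Usage would be `wait_for_health("http://localhost:8001/health")`, treating a `False` result as a failed restart.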

- [ ] **Step 6: Test MCP memory search**

Use engram MCP tools to search for a known memory. Verify it finds results from the new data location.

- [ ] **Step 7: Final commit if any fixups needed**

```bash
cd ~/engram
git status
# If any fixes were needed, commit them
```
