# Enterprise Hardening Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add Pydantic validation, observability metrics, advanced test coverage, and a performance benchmark to bring Engram up to an enterprise-grade standard.

**Architecture:** Pydantic models validate MCP tool inputs at the boundary. A timing decorator on `call_tool` emits structured log metrics. Tests expand to cover malformed inputs, concurrent requests, and guardrail edge cases. A standalone benchmark script measures retrieval latency at scale.

**Tech Stack:** Python 3.10+, Pydantic v2, pytest, pytest-asyncio, `time.perf_counter`

---

## Task 1: Pydantic models for MCP tool input validation

**Files:**
- Create: `src/engram/models.py`
- Modify: `src/engram/server.py:60-89`
- Modify: `pyproject.toml` (add the `pydantic` dependency)

## Task 2: Observability — tool execution timing

**Files:**
- Modify: `src/engram/server.py` (add timing to `call_tool`)
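
One possible shape for the timing decorator is sketched below, using only the standard library. The logger name `engram.metrics`, the JSON field names, and the `(name, arguments)` signature of `call_tool` are assumptions made for illustration, and the `call_tool` body here is a stand-in rather than the real handler.

```python
import asyncio
import functools
import json
import logging
import time

logger = logging.getLogger("engram.metrics")  # assumed logger name


def timed_tool(func):
    """Wrap an async tool handler and log one JSON metric line per call."""

    @functools.wraps(func)
    async def wrapper(name, arguments):
        start = time.perf_counter()
        status = "ok"
        try:
            return await func(name, arguments)
        except Exception:
            status = "error"
            raise
        finally:
            # Runs on both success and failure, so every call is measured.
            duration_ms = (time.perf_counter() - start) * 1000
            logger.info(json.dumps({
                "event": "tool_call",
                "tool": name,
                "status": status,
                "duration_ms": round(duration_ms, 3),
            }))

    return wrapper


@timed_tool
async def call_tool(name: str, arguments: dict):
    """Stand-in handler; the real one lives in src/engram/server.py."""
    await asyncio.sleep(0)
    return {"tool": name, "echo": arguments}
```

Emitting the metric in `finally` means failed calls are timed too, which tends to matter most when diagnosing slow error paths.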

## Task 3: Advanced test coverage

**Files:**
- Modify: `tests/test_server.py` (add Pydantic validation tests, concurrent requests, malformed UTF-8)
- Modify: `tests/test_guardrail.py` (add malformed YAML, empty file, empty tags, PRUNED_DICTIONARY accumulation)
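
The server-side cases might be sketched roughly as follows. The `call_tool` function here is a hypothetical stand-in with invented validation rules, not the real handler, and the tests use plain `asyncio.run` so the sketch runs without any pytest plugin; in the real suite they would more likely be `pytest-asyncio` tests against the actual server.

```python
import asyncio


async def call_tool(name: str, arguments: dict) -> str:
    """Hypothetical stand-in for the server handler under test."""
    await asyncio.sleep(0)  # yield control so concurrent calls interleave
    query = arguments.get("query")
    if not isinstance(query, str) or not query:
        raise ValueError("query must be a non-empty string")
    return f"{name}:{query}"


def test_concurrent_requests_do_not_interfere():
    async def run():
        calls = (call_tool("search", {"query": f"q{i}"}) for i in range(20))
        return await asyncio.gather(*calls)

    # gather preserves argument order, so results can be compared directly.
    assert asyncio.run(run()) == [f"search:q{i}" for i in range(20)]


def test_malformed_inputs_are_rejected():
    # Missing field, empty string, and raw bytes that are not valid UTF-8.
    for bad in ({}, {"query": ""}, {"query": b"\xff\xfe"}):
        try:
            asyncio.run(call_tool("search", bad))
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")
```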

## Task 4: Performance benchmark script

**Files:**
- Create: `scripts/benchmark.py`
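
A minimal sketch of what the benchmark script could look like is given below. The synthetic corpus, the linear-scan `retrieve` function, and the corpus sizes are placeholders chosen for illustration; the real script would call Engram's own retrieval path instead.

```python
import random
import statistics
import time


def build_corpus(n: int, seed: int = 0) -> list[str]:
    """Generate n synthetic documents, each carrying one of ten tags."""
    rng = random.Random(seed)
    return [f"doc-{i} tag-{rng.randrange(10)}" for i in range(n)]


def retrieve(corpus: list[str], term: str) -> list[str]:
    """Placeholder retrieval: a linear substring scan (assumption)."""
    return [doc for doc in corpus if term in doc]


def benchmark(sizes: list[int], repeats: int = 20) -> dict[int, dict[str, float]]:
    """Measure retrieval latency in milliseconds for each corpus size."""
    results = {}
    for n in sizes:
        corpus = build_corpus(n)
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            retrieve(corpus, "tag-7")
            samples.append((time.perf_counter() - start) * 1000)
        samples.sort()
        results[n] = {
            "p50_ms": statistics.median(samples),
            # Nearest-rank style index into the sorted samples.
            "p95_ms": samples[int(0.95 * (len(samples) - 1))],
            "max_ms": samples[-1],
        }
    return results


if __name__ == "__main__":
    for size, stats in benchmark([100, 1_000, 10_000]).items():
        print(f"n={size:>6}  p50={stats['p50_ms']:.3f} ms  "
              f"p95={stats['p95_ms']:.3f} ms  max={stats['max_ms']:.3f} ms")
```

Reporting percentiles rather than a mean keeps occasional slow runs (garbage collection, scheduler noise) from hiding the typical latency.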
