# Scout MCP Server — Design Specification

**Date:** 2026-04-03
**Status:** Approved
**Location:** ~/scout

## Overview

Scout is an MCP server that provides academic research intelligence to AI agents.
It fills a gap in the MCP ecosystem: agents can find web pages, but they cannot
reliably work with academic papers. They hallucinate citations, cannot check if
an idea already exists, and have no way to traverse citation networks.

Scout solves this with 6 focused tools built on real API infrastructure
(arXiv, Semantic Scholar, OpenAlex, CrossRef, DataCite).

**Identity:** Citation intelligence for AI agents.
**Tagline:** Your agent can find papers. Scout makes sure they're real.

## Tools

### 1. scout_search

Search academic papers across arXiv, Semantic Scholar, and OpenAlex.
Results are deduplicated, cached, and sorted by citation count.

**Input:**
- `query` (string, required) — search terms
- `limit` (int, default 20, max 100) — max results
- `year_min` (int, optional) — exclude papers published before this year
- `sources` (list[string], default ["arxiv", "semantic_scholar", "openalex"]) — which APIs to query

**Output:** List of papers with: title, authors, year, abstract, venue,
citation_count, doi, arxiv_id, url, source.

**Infrastructure:** 3-source search with DOI/arXiv-ID/fuzzy-title deduplication.
Circuit breakers per source. Cached results (arXiv 24h, S2 3d, OpenAlex 3d).
Rate limiting per API spec. Graceful degradation if a source is down.
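A minimal sketch of the deduplication step described above (field names are illustrative; the actual `search.py` implementation may differ):

```python
import re

def dedup_key(paper: dict) -> str:
    """Pick the strongest available identifier for deduplication.

    Preference order: DOI, then arXiv ID, then a fuzzy title key
    (lowercased, alphanumerics only) so near-identical titles collide.
    """
    if paper.get("doi"):
        return "doi:" + paper["doi"].lower()
    if paper.get("arxiv_id"):
        return "arxiv:" + paper["arxiv_id"]
    title = re.sub(r"[^a-z0-9]", "", paper.get("title", "").lower())
    return "title:" + title

def deduplicate(papers: list[dict]) -> list[dict]:
    seen: dict[str, dict] = {}
    for p in papers:
        seen.setdefault(dedup_key(p), p)  # keep the first occurrence
    return list(seen.values())
```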

### 2. scout_paper_detail

Fetch full metadata for a specific paper by DOI, arXiv ID, or Semantic Scholar ID.

**Input:**
- `paper_id` (string, required) — DOI, arXiv ID (e.g., "2301.07041"), or S2 ID
- `include_references` (bool, default false) — include papers this paper cites
- `include_citations` (bool, default false) — include papers that cite this paper

**Output:** Full paper metadata plus optional reference/citation lists.

**Infrastructure:** Resolves ID type automatically. Queries appropriate API.
Merges metadata from multiple sources when available.
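The automatic ID resolution could use heuristics along these lines (a sketch — old-style arXiv IDs such as `hep-th/9901001` would need additional handling):

```python
import re

def classify_paper_id(paper_id: str) -> str:
    """Guess which identifier scheme a paper_id uses.

    Returns one of "doi", "arxiv", or "s2".
    """
    if paper_id.lower().startswith(("10.", "doi:")):
        return "doi"
    # Modern arXiv IDs look like 2301.07041, optionally with a version suffix.
    if re.fullmatch(r"\d{4}\.\d{4,5}(v\d+)?", paper_id):
        return "arxiv"
    # Semantic Scholar paper IDs are 40-character hex strings.
    if re.fullmatch(r"[0-9a-f]{40}", paper_id):
        return "s2"
    return "s2"  # fall back to S2, which accepts several schemes
```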

### 3. scout_verify

Verify whether the citations in a list are real or hallucinated. This is Scout's
killer feature — LLMs routinely fabricate academic references.

**Input:**
- `references` (list[object], required) — each with: title (required),
  authors (optional), year (optional), doi (optional), arxiv_id (optional)

**Output:** Per-reference verdict:
- `status`: "verified" | "suspicious" | "hallucinated" | "skipped"
- `confidence`: 0.0-1.0
- `method`: which verification layer succeeded
- `matched_paper`: real paper metadata if found
- Summary: total verified/suspicious/hallucinated counts.

**Infrastructure:** 4-layer verification cascade:
1. arXiv ID direct lookup (if provided)
2. CrossRef DOI lookup + DataCite fallback
3. OpenAlex title search
4. Semantic Scholar title search (last resort)

Title similarity via word-overlap Jaccard metric. Thresholds:
verified >= 0.80, suspicious 0.50 <= s < 0.80, hallucinated < 0.50.
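The similarity metric and threshold mapping can be sketched as follows (function names are illustrative):

```python
def title_jaccard(a: str, b: str) -> float:
    """Word-overlap Jaccard similarity between two titles."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def verdict(similarity: float) -> str:
    """Map a title similarity score to a verification status."""
    if similarity >= 0.80:
        return "verified"
    if similarity >= 0.50:
        return "suspicious"
    return "hallucinated"
```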

### 4. scout_novelty

Score how novel a research idea is against existing published work.

**Input:**
- `idea` (string, required) — the research idea or hypothesis
- `context` (string, optional) — additional context about the approach
- `threshold` (float, default 0.25) — similarity threshold for flagging

**Output:**
- `novelty_score`: 0.0-1.0 (higher = more novel)
- `assessment`: "high" | "moderate" | "low" | "critical"
- `recommendation`: "proceed" | "differentiate" | "pivot" | "proceed_with_caution"
- `similar_papers`: list of papers with similarity scores
- `gaps`: identified gaps in existing work (what's NOT covered)

**Infrastructure:** Keyword extraction from idea, multi-source search,
blended similarity scoring (70% keyword overlap + 30% title sequence matching).
High-citation papers weighted more heavily in risk assessment.
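The blended scoring could look roughly like this — a sketch using the 70/30 weights from the spec, with keyword extraction assumed to happen upstream:

```python
from difflib import SequenceMatcher

def blended_similarity(idea_keywords: set[str],
                       paper_title: str,
                       idea_text: str) -> float:
    """Blend keyword overlap (70%) with title sequence matching (30%)."""
    title_words = set(paper_title.lower().split())
    if idea_keywords:
        keyword_score = len(idea_keywords & title_words) / len(idea_keywords)
    else:
        keyword_score = 0.0
    # SequenceMatcher gives a 0.0-1.0 similarity over raw character sequences.
    seq_score = SequenceMatcher(None, idea_text.lower(),
                                paper_title.lower()).ratio()
    return 0.7 * keyword_score + 0.3 * seq_score
```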

### 5. scout_citations

Get the citation graph for a paper — what it cites and what cites it.

**Input:**
- `paper_id` (string, required) — DOI, arXiv ID, or S2 ID
- `direction` (string, default "both") — "references" | "citations" | "both"
- `limit` (int, default 50, max 500) — max results per direction

**Output:**
- `references`: papers this paper cites
- `citations`: papers that cite this paper
- `stats`: total reference/citation counts

**Infrastructure:** Semantic Scholar batch API for efficient retrieval.
Results include full paper metadata. Sorted by citation count.
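For illustration, a batch request might be built as below. The endpoint and field names follow the public Semantic Scholar Graph API, but the per-call ID limit (around 500) and accepted ID prefixes should be confirmed against current S2 docs:

```python
import json
import urllib.request

S2_BATCH_URL = "https://api.semanticscholar.org/graph/v1/paper/batch"

def build_batch_request(paper_ids: list[str],
                        fields: str = "title,year,citationCount"
                        ) -> urllib.request.Request:
    """Build (but do not send) an S2 batch-lookup POST request."""
    body = json.dumps({"ids": paper_ids}).encode()
    return urllib.request.Request(
        f"{S2_BATCH_URL}?fields={fields}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```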

### 6. scout_bibtex

Generate properly formatted BibTeX entries from paper metadata.

**Input:**
- `papers` (list[object], required) — paper metadata (from any Scout tool)

**Output:** Valid BibTeX string with normalized cite keys (`lastname<year>keyword` format, e.g. `smith2024attention`).

**Infrastructure:** Direct conversion from Paper dataclass. No LLM involved —
pure deterministic formatting.
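The cite-key normalization might look like this (stopword list and field names are illustrative):

```python
import re

def cite_key(paper: dict) -> str:
    """Build a lastname<year>keyword cite key from paper metadata."""
    authors = paper.get("authors") or []
    last = authors[0].split()[-1].lower() if authors else "anon"
    year = str(paper.get("year", ""))
    # First "significant" title word: skip common stopwords.
    stop = {"a", "an", "the", "on", "of", "for", "and", "in", "to"}
    words = [w for w in re.findall(r"[a-z]+", paper.get("title", "").lower())
             if w not in stop]
    keyword = words[0] if words else "paper"
    return f"{last}{year}{keyword}"
```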

## Architecture

```
~/scout/
├── pyproject.toml              # Package: "scout-mcp"
├── src/scout/
│   ├── __init__.py
│   ├── server.py               # MCP server (stdio transport)
│   ├── tools.py                # Tool definitions and handlers
│   ├── config.py               # Configuration
│   ├── models.py               # Paper, Author, CitationResult dataclasses
│   ├── clients/
│   │   ├── __init__.py
│   │   ├── arxiv.py            # arXiv API client
│   │   ├── semantic_scholar.py # Semantic Scholar API client
│   │   ├── openalex.py         # OpenAlex API client
│   │   ├── crossref.py         # CrossRef/DataCite client
│   │   └── cache.py            # Shared caching layer
│   ├── search.py               # Unified search with dedup
│   ├── verify.py               # 4-layer citation verification
│   └── novelty.py              # Novelty scoring
├── tests/
│   ├── test_search.py
│   ├── test_verify.py
│   ├── test_novelty.py
│   ├── test_citations.py
│   ├── test_clients.py
│   └── test_server.py
├── README.md
├── LICENSE
├── ARCHITECTURE.md
├── API_REFERENCE.md
└── CONFIGURATION.md
```

## Design Decisions

1. **stdio transport only** — Standard MCP pattern. No HTTP server needed.
   Clients (Claude Desktop, Claude Code, Goose) connect via stdin/stdout.

2. **Zero heavy dependencies** — All API clients use stdlib only (urllib, json,
   xml.etree). Only dependencies: `mcp` SDK and `pyyaml` for config.

3. **No LLM calls** — Scout does not call any LLM. All logic is deterministic
   (search, dedup, verify, novelty scoring). The agent calling Scout IS the LLM.

4. **Caching by default** — All API responses cached to disk with TTLs.
   Reduces API load, speeds up repeated queries, works offline for cached data.

5. **Circuit breakers** — Each API client has a 3-state circuit breaker
   (CLOSED -> OPEN -> HALF_OPEN). If an API is down, Scout degrades gracefully
   instead of blocking.
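A minimal sketch of such a 3-state breaker — the failure threshold and reset timeout here are illustrative, not Scout's actual defaults:

```python
import time

class CircuitBreaker:
    """CLOSED -> OPEN -> HALF_OPEN circuit breaker.

    After max_failures consecutive failures the breaker opens and
    rejects calls for reset_after seconds, then permits one trial
    call (HALF_OPEN). A success closes it again.
    """

    def __init__(self, max_failures: int = 3, reset_after: float = 60.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when opened

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "CLOSED"
        if time.monotonic() - self.opened_at >= self.reset_after:
            return "HALF_OPEN"
        return "OPEN"

    def allow(self) -> bool:
        return self.state != "OPEN"

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()
```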

6. **No authentication required for basic use** — arXiv and OpenAlex are free.
   Semantic Scholar works without a key (1 req/sec). Optional S2 API key
   for higher rate limits.

## Configuration

Environment variables:
- `SCOUT_S2_API_KEY` — Semantic Scholar API key (optional, higher rate limits)
- `SCOUT_CACHE_DIR` — Cache directory (default: ~/.cache/scout/)
- `SCOUT_CACHE_TTL` — Default cache TTL in seconds (default: 86400)
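These variables could be read as follows (a sketch; the actual `config.py` may differ):

```python
import os
from pathlib import Path

def load_config() -> dict:
    """Read Scout's environment variables with their documented defaults."""
    default_cache = Path.home() / ".cache" / "scout"
    return {
        "s2_api_key": os.environ.get("SCOUT_S2_API_KEY"),  # optional
        "cache_dir": Path(os.environ.get("SCOUT_CACHE_DIR", str(default_cache))),
        "cache_ttl": int(os.environ.get("SCOUT_CACHE_TTL", "86400")),
    }
```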

MCP client configuration (Claude Desktop example):
```json
{
  "mcpServers": {
    "scout": {
      "command": "scout-mcp",
      "env": {
        "SCOUT_S2_API_KEY": "your-key-here"
      }
    }
  }
}
```

## What Scout Does NOT Do

- **Run code** — Use E2B or code-sandbox-mcp for that.
- **Store memories** — Use basic-memory or mcp-memory-service.
- **Generate LaTeX** — Use mcp-latex-server or the agent itself.
- **Summarize papers** — The agent can read abstracts and do this.
- **Write papers** — The agent handles all writing.

Scout is infrastructure, not intelligence. The agent is the brain.

## Porting Strategy

The core infrastructure comes from AutoResearchClaw's literature module
(3,068 LOC, battle-tested). Changes needed:

1. Strip ResearchClaw-specific config, replace with Scout config
2. Rename modules to match Scout's structure
3. Add MCP tool wrappers (thin layer over existing functions)
4. Add citation graph tool (new, uses S2 batch API)
5. Write new tests for MCP protocol layer
6. Port existing unit tests for search/verify/novelty

Estimated reuse: ~80% of literature module code ports directly.
