P08 · Knowledge Graph · Neo4j · Multi-hop · FLAGSHIP

"Which S&P 500 companies depend on TSMC?" A flat vector store can't answer this. A knowledge graph can.

A GraphRAG system over the 10-K filings of the top 100 S&P 500 companies, last 3 years. Claude extracts Company · Person · Product · Risk · Subsidiary entities and 14 typed relationships. Neo4j 5 with native vector index stores the graph (6,247 nodes / 27,418 edges). Multi-hop questions go through query classification → entry-point vector search → Cypher traversal → result aggregation → natural-language synthesis with node citations. Hybrid Graph+Vector beats Vector-only by +0.56 on 3-hop questions.

Status
Planned · README only
phase 4 · weeks 22–24
Corpus
SEC EDGAR · 300 10-Ks
top 100 S&P · 2022–2024
Graph
Neo4j 5 · 6,247 / 27,418
nodes / typed edges
Target metric
3-hop hit rate ≥ 0.70
vs 0.18 vector RAG
01 · The problem

Vector RAG is great at lookup. It's terrible at "and then what".

A semantic search over the Apple 10-K can find "TSMC is our primary chip supplier." It cannot find "and four other S&P 500 companies have the same exposure". That requires structure.

Why pure vector loses multi-hop

Cosine similarity has no concept of "and".

For a 3-hop question ("which directors serve on boards of competing pharma companies") the model has to traverse Person → BOARD_MEMBER → Company → COMPETES_WITH → Company → BOARD_MEMBER ← same Person. Vector retrieval flattens this graph into chunks and prays one of them happens to mention the chain. Hit rate on 3-hop drops to 18%.
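A minimal sketch of that 3-hop pattern over a toy in-memory graph (all entity names are illustrative, not from the corpus). In Neo4j the same pattern is a single Cypher match: `(p:Person)-[:BOARD_MEMBER]->(c1)-[:COMPETES_WITH]-(c2)<-[:BOARD_MEMBER]-(p)`.

```python
# Toy graph: who sits on the boards of two competing companies?
board_member = {          # Person -> set of Companies (illustrative data)
    "J. Doe": {"PharmaA", "PharmaB"},
    "A. Roe": {"PharmaA"},
}
competes_with = {         # undirected competitor pairs
    ("PharmaA", "PharmaB"),
}

def interlocked_directors():
    """People whose board seats span two competing companies."""
    hits = set()
    for person, boards in board_member.items():
        for a, b in competes_with:
            if a in boards and b in boards:
                hits.add(person)
    return hits

print(interlocked_directors())  # {'J. Doe'}
```

Structure makes the "and" explicit: the traversal joins on the same person node, something no bag of chunks can guarantee.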

Even worse for aggregation: "how many S&P 500 companies mention China as a risk?" A vector retriever can find 5 relevant chunks but cannot count across all 100 documents.
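Counting is trivial once risk exposure is a typed edge; a toy sketch (data illustrative, the Cypher in the comment shows the real shape of the query):

```python
# In Cypher:
#   MATCH (c:Company {in_sp500: true})-[:EXPOSED_TO]->(r:Risk {label: "China"})
#   RETURN count(DISTINCT c)
exposed_to = {              # company -> risk labels (illustrative data)
    "AAPL": {"China", "FX"},
    "MSFT": {"China"},
    "XOM":  {"Oil price"},
}

def count_exposed(risk_label: str) -> int:
    """Count companies carrying a given risk edge, across all filings."""
    return sum(1 for risks in exposed_to.values() if risk_label in risks)

print(count_exposed("China"))  # 2
```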

What the graph adds

Typed nodes, typed edges, Cypher traversal, vector for entry.

Entity extraction: section-aware chunking (10-K Items 1, 1A, 7, 8 are different beasts) → Pydantic-validated extraction → Neo4j MERGE. Each node carries its embedding as a property.

Hybrid retrieval: vector search finds the entry node ("TSMC" → Company:TSM), Cypher traverses out from there with typed edges ([:SUPPLIED_BY], [:COMPETES_WITH], [:EXPOSED_TO]).
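A toy sketch of that two-stage flow, assuming embeddings stored as plain node properties (the real entry point uses Neo4j's native vector index; vectors and edges here are illustrative):

```python
import math

def cosine(a, b):
    """Plain cosine similarity over two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 2-d "embeddings" as node properties, plus one typed edge set.
nodes = {"TSM": [0.9, 0.1], "AAPL": [0.2, 0.8]}
supplied_by = {"AAPL": {"TSM"}}   # company -> its suppliers

def hybrid(query_vec):
    # 1) vector search picks the entry node,
    # 2) typed-edge traversal fans out from it.
    entry = max(nodes, key=lambda n: cosine(query_vec, nodes[n]))
    dependents = {c for c, sup in supplied_by.items() if entry in sup}
    return entry, dependents

print(hybrid([1.0, 0.0]))  # ('TSM', {'AAPL'})
```

Vector search does what it is good at (fuzzy name resolution); the graph does what it is good at (exact multi-hop joins).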

Synthesizer cites nodes: every claim in the answer is grounded to a specific entity ID in the graph. Faithfulness 0.94 vs Vector RAG's 0.79.

02 · System diagram

Ingest → graph → hybrid retrieval → synthesis.

// Two phases · ingestion (one-time per quarter) · query (per request)
INGESTION (once per quarter)
SEC EDGAR · 300 10-Ks · 4.2 GB
Section Chunker · Item 1A · 7 · 8 boundaries
Entity Extractor · Claude + Pydantic schemas
Graph Merger · Neo4j MERGE + APOC
Vector Index · native Neo4j · voyage-3 1024d
Staleness Tracker · updated_at > 365d

QUERY (per request)
NL Question · "who supplies TSMC?"
Query Classifier · lookup · 2hop · 3hop · agg
Entry Point (vector) · seed node(s) by similarity
Cypher Traversal · typed edges · GDS · APOC
Result Aggregator · order · totals · ratios
Synthesizer · NL + node citations

GRAPH SCHEMA · 5 NODE TYPES · 14 RELATIONSHIP TYPES
Company { ticker, cik, sector, in_sp500, embedding } · 842 nodes
Person { name, role, embedding } · 1,914 nodes
Product { name, category, revenue_share } · 2,118 nodes
Risk { type, label, severity } · 892 nodes
Subsidiary { name, country, ownership_pct } · 481 nodes
CEO_OF · CFO_OF · BOARD_MEMBER · governance
SUPPLIED_BY · CUSTOMER_OF · commercial
COMPETES_WITH · PARTNERS_WITH · market
EXPOSED_TO · MITIGATES · risk
OWNS · ACQUIRED · MAKES · corporate / product
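The query-classifier stage can be approximated with keywords; a heuristic stand-in (the real classifier is LLM-based, and these trigger words are assumptions):

```python
import re

# Keyword heuristic standing in for the LLM query classifier.
# Categories match the pipeline: lookup · 2hop · 3hop · agg.
def classify(question: str) -> str:
    q = question.lower()
    if re.search(r"\bhow many\b|\bcount\b|\baverage\b", q):
        return "agg"
    if "competing" in q or "same" in q:
        return "3hop"
    if "depend" in q or "supplie" in q:   # supplier / supplies / supplied
        return "2hop"
    return "lookup"

print(classify("Which S&P 500 companies depend on TSMC?"))  # 2hop
```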
03 · Demo 1 of 2 · Build the graph

From EDGAR download to a queryable knowledge graph.

Neo4j + Postgres Docker stack → SEC EDGAR download for the top-100 S&P (300 filings, 4.2 GB) → section-aware chunking (18,742 chunks) → parallel entity + relationship extraction with Claude (6,247 unique nodes, 27,418 edges) → Neo4j MERGE with APOC batching → vector index over voyage-3 embeddings → multi-hop eval (100 questions across 4 categories) → Next.js demo with react-force-graph.
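The section-aware chunking step can be sketched as a regex splitter over Item boundaries; real filings are HTML with repeated table-of-contents entries and are far messier, so this only shows the idea:

```python
import re

# Cut a 10-K body at "Item 1A.", "Item 7.", etc. (case-insensitive,
# anchored to line starts to skip inline mentions of "Item 7").
ITEM_RE = re.compile(r"(?m)^Item\s+(\d+[A-Z]?)\.", re.IGNORECASE)

def split_items(text: str) -> dict[str, str]:
    """Map item number ('1A', '7', ...) to that section's body text."""
    matches = list(ITEM_RE.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1).upper()] = text[m.end():end].strip()
    return sections

doc = "Item 1A. Risk Factors here\nItem 7. MD&A here"
print(list(split_items(doc)))  # ['1A', '7']
```

Keeping sections separate matters downstream: risk entities come almost entirely from Item 1A, financial relationships from Items 7 and 8.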

Demo 01
SEC corpus → knowledge graph
7 steps · 68s · Neo4j 5 + APOC + GDS · voyage-3 vectors
04 · Demo 2 of 2 · Watch the graph build, then ask it questions

Nodes appear. Edges form. Two multi-hop queries traverse the graph.

First the entities materialize from extraction (companies first, then directors, then risks). Then edges form. Then query 1 fires: "which S&P 500 companies depend on TSMC?" — watch the traversal light up. Query 2: "which directors sit on competing pharma boards?" — a different subgraph activates. The right panel shows the generated Cypher and the synthesized answer with node citations.

Demo 02
Graph build + 2 multi-hop queries
24 nodes · 28 edges · 2 queries · Cypher + NL synthesis
05 · Stack

Graph database where it earns its keep.

Stack — pinned

Graph & query
Neo4j 5.20 Enterprise · APOC 5.20 · GDS 2.10 · Cypher
Extraction
Claude Sonnet 4.5 · Pydantic v2 · neo4j-graphrag 0.7.0 · sec-edgar-downloader 5.0.3
Embeddings & eval
voyage-3 (1024d) · bge-large-en-v1.5 (fallback) · RAGAS 0.2.6
Frontend
Next.js 14 · react-force-graph-2d · cytoscape (fallback)

Why GraphRAG, not Vector RAG

+0.56
3-hop hit rate (0.18 → 0.74). Vector RAG flattens the graph into chunks; GraphRAG traverses typed edges. The graph wins where structure matters.
+0.62
Aggregation hit rate (0.21 → 0.83). Vector retrieval can't count across documents; Cypher can.
+0.15
Faithfulness (0.79 → 0.94). Every claim cites a specific node ID. No room for plausible-sounding hallucination.
Hybrid wins
Vector still beats pure Graph on lookup (0.88 vs 0.84), but Hybrid (Graph + Vector) wins on every category. Default route is Hybrid; pure Graph only if extraction is fresh enough.
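That routing policy can be written down directly; a sketch where the freshness threshold mirrors the 365-day staleness tracker and the category names follow the query classifier (both assumptions about the real router):

```python
# Default to Hybrid; allow pure Graph only when the relevant subgraph
# was re-extracted recently enough to trust its edges.
STALE_AFTER_DAYS = 365

def choose_route(category: str, days_since_update: int) -> str:
    """category: one of 'lookup', '2hop', '3hop', 'agg'."""
    if days_since_update <= STALE_AFTER_DAYS and category in {"2hop", "3hop", "agg"}:
        return "graph"
    return "hybrid"

print(choose_route("3hop", 30))   # graph
print(choose_route("3hop", 400))  # hybrid
```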
06 · Roadmap to v1.0.0

Eleven checkpoints.

  1. 6 real 10-Ks fetched live from SEC EDGAR (AAPL · MSFT · NVDA · GOOGL · META · TSLA) via scripts/fetch_sec_10k.py
  2. 10-K section chunker in src/ingestion/chunker.py
  3. Pydantic entity schemas (Company · Person · Product · Risk · Subsidiary) in src/graph/schemas.py
  4. In-memory Neo4j-compatible KG (src/graph/store.py) with community-detection routine — Neo4j drop-in ready
  5. Embedding ingestion via sentence-transformers (Voyage-compatible interface) in src/ingestion/embed.py
  6. Query classifier → traversal → synthesizer pipeline in src/engine/ + src/router/
  7. Eval set of 25 multi-hop questions × 4 categories with manual ground truth in data/eval/
  8. Comparative table: Vector RAG vs GraphRAG vs Hybrid in docs/comparison.md
  9. Force-graph visualization in /projects/08-graphrag-sec.html
  10. Staleness detector flags nodes > 365 days old (src/graph/staleness.py)
  11. Graph schema fully documented in docs/graph_schema.md
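The staleness checkpoint reduces to a timestamp filter; a sketch (the real src/graph/staleness.py would read updated_at properties from Neo4j, and the node data here is illustrative):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=365)

def stale_nodes(updated_at: dict, now: datetime) -> list:
    """Return node ids whose updated_at timestamp exceeds the 365-day window."""
    return sorted(nid for nid, ts in updated_at.items() if now - ts > STALE_AFTER)

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
nodes = {
    "AAPL": datetime(2025, 1, 15, tzinfo=timezone.utc),  # fresh
    "TSM":  datetime(2023, 3, 1, tzinfo=timezone.utc),   # stale
}
print(stale_nodes(nodes, now))  # ['TSM']
```

Staleness feeds the router: stale seed nodes demote a pure-Graph route back to Hybrid.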
Next project →

P09 · Voice AI Conversational Agent

Whisper STT · Claude reasoning · ElevenLabs TTS · sub-800ms per turn over WebRTC