Context engine

Last updated 5/3/2026

DataZoom Context Engine

Document ID: TECH-CTX-001
Version: 1.0
Product: DataZoom (midwestco/datazoom)
Classification: Internal Engineering - AI Systems Architecture

Context Architecture Overview
Domain Profile
Knowledge Graph Structure
Prompt Engineering Patterns
Memory Management
Context Density Strategies
Agent Context Protocols
RAG Pipeline as Context Infrastructure
Cross-Document ID Reference Map

1. Context Architecture Overview

DataZoom manages AI context across three distinct layers: vector-indexed document memory (pgvector), structured relational state (PostgreSQL + Supabase), and session-scoped conversation threads stored in the conversations and messages tables. These layers are not isolated — they compose at query time into a unified context payload that is passed to the LLM.

1.1 Layer Map

┌────────────────────────────────────────────────────────────────┐
│  LAYER 1 — Session Context (Ephemeral)                        │
│  Tables: conversations, messages                               │
│  Scope: Per-user, per-organization, per-thread                │
│  Retrieval: Direct DB lookup by conversation_id               │
└────────────────────────┬───────────────────────────────────────┘
                         │ combined at query time
┌────────────────────────▼───────────────────────────────────────┐
│  LAYER 2 — Semantic Document Memory (Persistent)              │
│  Tables: document_chunks (embedding VECTOR(384))              │
│  Index: ivfflat, cosine ops, 100 lists                        │
│  Retrieval: pgvector similarity search via /api/.../query     │
└────────────────────────┬───────────────────────────────────────┘
                         │ filtered by org + document scope
┌────────────────────────▼───────────────────────────────────────┐
│  LAYER 3 — Structured Knowledge (Persistent)                  │
│  Tables: documents, timeline_events, cap_table transactions   │
│  Retrieval: SQL join, metadata filter, GIN index on parties[] │
└────────────────────────────────────────────────────────────────┘

1.2 Context Assembly Flow

When a user submits a query through the chat interface (product/app/(app)/context/page.tsx), the following sequence assembles context before any LLM call is made:

Thread resolution: The active conversation thread is identified via the thread selector (product/app/(app)/context/components/thread-selector.tsx). Prior messages in the thread constitute the short-term conversational context.
Semantic retrieval: The query is embedded using sentence-transformers (all-MiniLM-L6-v2, 384-dimensional vectors). The embedding is used to perform a cosine similarity search against document_chunks.embedding within the organization's document scope.
Structured fact injection: For queries involving dates, parties, or equity events, the timeline_events table and documents.parties GIN index augment the semantic results with structured facts. This reduces the risk of the model hallucinating details that already exist as structured records.
Context packet construction: Retrieved chunks, structured facts, and the conversation thread are concatenated into a single context payload. The model router (exercised by product/lib/__tests__/model-router.test.ts) selects the appropriate LLM backend (Modal/Qwen2.5:32B for cloud; Ollama locally) based on query complexity and GPU availability.
LLM dispatch: The assembled packet is sent to either the Modal cloud endpoint or the local Ollama service (port 11434). The LLM proxy service runs on port 8001 inside the datazoom-base container and mediates all model calls.

1.3 Tenant Isolation in Context

Every context retrieval operation is scoped to the authenticated organization via Clerk auth. The Clerk proxy route (product/app/api/clerk/proxy/route.ts) validates org membership before any document or chunk access. Two tenants that upload identically named documents therefore do not receive each other's context, because the pgvector query filters on organization_id before computing similarity.
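The org-scoped similarity query described above might look roughly like the following sketch. It only builds the SQL and parameters; the function name `build_chunk_query`, the presence of an `organization_id` column directly on `document_chunks`, and the psycopg-style `%s` placeholders are assumptions for illustration. The `<=>` operator is pgvector's cosine distance operator.

```python
from typing import Sequence, Tuple


def build_chunk_query(
    org_id: str, query_embedding: Sequence[float], top_k: int = 5
) -> Tuple[str, tuple]:
    """Return (sql, params) for an org-scoped cosine-distance search.

    Assumes document_chunks carries organization_id (hypothetical layout).
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # pgvector accepts a '[x,y,z]' text literal that is cast to vector.
    vector_literal = "[" + ",".join(f"{x:.6f}" for x in query_embedding) + "]"
    sql = (
        "SELECT id, document_id, chunk_index, content, "
        "embedding <=> %s::vector AS distance "
        "FROM document_chunks "
        "WHERE organization_id = %s "       # tenant filter is part of the query
        "ORDER BY embedding <=> %s::vector "  # smaller distance = more similar
        "LIMIT %s"
    )
    return sql, (vector_literal, org_id, vector_literal, top_k)
```

Passing the organization ID as a bound parameter rather than interpolating it keeps the tenant filter out of reach of user-supplied query text.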

2. Domain Profile

2.1 Product Domain

DataZoom operates in the legal and business due diligence domain. The primary use case is natural language analysis of legal documents — equity agreements, IP assignments, healthcare contracts, financial instruments — to surface facts, risks, and strategic options without requiring legal expertise from the user.

2.2 Key Entities

Entity ID	Entity Name	Storage Location	Description
ENT-001	Document	`documents` table	Root entity. Carries `document_type`, `parties[]`, `key_terms[]`, `summary`, `full_text`
ENT-002	Document Chunk	`document_chunks` table	Segmented unit of a document with a 384-dim embedding
ENT-003	Timeline Event	`timeline_events` table	Extracted date-anchored event with `impact` classification
ENT-004	Conversation Thread	`conversations` table	A named context thread scoped to an org and topic
ENT-005	Message	`messages` table	Individual turn within a conversation, user or assistant
ENT-006	Organization	Clerk + Supabase RLS	Tenant boundary; all entities are org-scoped
ENT-007	Cap Table Transaction	`cap_table/transactions`	Equity ownership record with approval workflow
ENT-008	Due Diligence Checklist	`/api/business-types/[typeKey]/checklist`	Business-type-specific document requirement list
ENT-009	Finding	`finding-handler.tsx`, `finding-view.tsx`	AI-surfaced risk or notable clause from document analysis
ENT-010	Strategic Option	`/api/advisor/strategic-options`	LLM-generated strategic recommendation with risk memo

2.3 Document Type Taxonomy

The documents.document_type field constrains classification to a controlled vocabulary defined in docs/MASTER_DOCUMENT_TYPES_CATALOG.md. Valid values observed in the schema:

equity — Equity agreements, stock purchase agreements, SAFEs
ip_assignment — Intellectual property transfer documents
financial — Financial statements, term sheets
healthcare — Healthcare-specific compliance and service agreements
agreement — General commercial contracts

Document type drives checklist selection (/api/business-types/[typeKey]/checklist/route.ts) and influences which prompt template is applied during analysis.

2.4 Domain Terminology Glossary

Term	Definition in DataZoom Context
RAG	Retrieval-Augmented Generation — the core pattern for grounding LLM answers in uploaded documents
Chunk	A semantically bounded segment of a document, stored in `document_chunks` with an embedding
Embedding	A 384-dimensional float vector produced by `all-MiniLM-L6-v2` representing semantic content
Citation	A traceable reference linking an LLM answer back to a specific chunk; enforced by `citation-system.test.ts`
Due Diligence (DD)	Structured review process using business-type-specific checklists
Finding	An AI-identified risk, gap, or notable clause surfaced during document analysis
Cap Table	Capitalization table tracking equity ownership; managed through a dedicated extraction and review pipeline
Thread	A named conversation context scoping related questions to a topic or document set
Strategic Option	An LLM-generated recommendation produced by the advisor pipeline (`/api/advisor/strategic-options`)
Risk Memo	Formal risk summary generated by `/api/advisor/risk-memo`
ivfflat	PostgreSQL index type (Inverted File Flat) used for approximate nearest-neighbor vector search

3. Knowledge Graph Structure

DataZoom's knowledge graph is implicit — encoded in relational foreign keys, GIN-indexed arrays, and vector similarity — rather than a dedicated graph database. The following describes how documents, IDs, and cross-references form a queryable knowledge structure.

3.1 Entity Relationship Graph

Organization (Clerk org_id)
    │
    ├── Document [ENT-001] (UUID)
    │       ├── document_type (controlled vocab)
    │       ├── parties[] (GIN-indexed TEXT[])
    │       ├── key_terms[] (GIN-indexed TEXT[])
    │       ├── summary (LLM-generated)
    │       │
    │       ├── Document Chunk [ENT-002] (UUID, FK → document_id)
    │       │       ├── content (TEXT)
    │       │       ├── embedding (VECTOR(384))
    │       │       └── metadata (JSONB)
    │       │
    │       └── Timeline Event [ENT-003] (UUID, FK → document_id)
    │               ├── event_date (DATE)
    │               ├── event_type (controlled vocab)
    │               ├── parties_involved (TEXT[])
    │               └── impact (critical|high|medium|low)
    │
    ├── Conversation Thread [ENT-004]
    │       └── Message [ENT-005] (ordered turns)
    │
    ├── Cap Table Transaction [ENT-007]
    │       └── Review Record (approve/reject workflow)
    │
    └── Due Diligence Checklist [ENT-008]
            └── Finding [ENT-009] (linked to document chunks)

3.2 Cross-Reference Mechanisms

Party-based cross-document linking. The documents.parties field is a GIN-indexed TEXT[]. A query like WHERE parties && ARRAY['Acme Corp'] retrieves all documents mentioning a given party, enabling cross-document analysis of a single entity's obligations and history without an explicit graph join.

Date-anchored event correlation. The timeline_events table, indexed by event_date, allows temporal correlation across documents. The idx_timeline_type index supports filtering by event_type, enabling queries such as "all equity changes in Q1 2024 across all uploaded documents."

Semantic proximity graph (implicit). Documents whose chunks have high cosine similarity to a query vector form an implicit neighborhood. The ivfflat index with 100 lists provides approximate retrieval across this neighborhood at query time. This is the primary mechanism for multi-document synthesis.

Checklist-to-document linkage. The /api/business-types/[typeKey]/checklist endpoint maps a typeKey (e.g., equity, healthcare) to a structured requirement list. Findings (ENT-009) produced during analysis are linked back to specific document chunks via the citation system (tested in product/lib/__tests__/citation-system.test.ts), completing the chain from checklist requirement → document evidence → cited finding.

3.3 Documentation Knowledge Graph

The docs/ directory itself constitutes a human-readable knowledge graph with structured cross-references. Key nodes:

Node	Path	Function
Documentation Index	`docs/DOCUMENTATION_INDEX.md`	Root index; entry point for all documentation
RAG System	`docs/RAG_SYSTEM.md`	Canonical reference for retrieval architecture
AI Services	`docs/AI_SERVICES.md`	Model configuration, Modal vs. Ollama routing
Master Document Types	`docs/MASTER_DOCUMENT_TYPES_CATALOG.md`	Controlled vocabulary for document classification
Action Plans	`docs/action_plans/`	Structured implementation plans with status tracking
Archive	`docs/archive/`	Superseded documents preserved for historical context

Action plans follow a numbered sequence (00_, 01_, ...) within each feature area (e.g., docs/cap-table/action_plans/, docs/activity_page/action_plans/) and cross-reference each other via START_HERE.md files that serve as context entrypoints for any agent beginning work in that domain.

4. Prompt Engineering Patterns

4.1 RAG Grounding Pattern

The foundational prompt pattern enforces that every LLM response is grounded in retrieved document content. Based on the citation enforcement work documented in docs/archive/completed_action_plans/document_refinement_v1/07_citation_enforcement_COMPLETE.md, prompts follow this structure:

SYSTEM:
You are a legal document analysis assistant. Answer questions using ONLY the 
provided document context. For every factual claim, cite the source chunk using 
[Document: {filename}, Chunk: {chunk_index}] notation. If the answer cannot be 
found in the provided context, state that explicitly — do not speculate.

CONTEXT:
[Retrieved chunks injected here — ordered by cosine similarity score]

CONVERSATION HISTORY:
[Prior messages in the active thread]

USER QUERY:
{user_question}

Citation enforcement is validated by product/lib/__tests__/citation-system.test.ts, ensuring responses that lack source references are flagged rather than surfaced to users.
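As a rough illustration of how the template above could be filled, the following sketch orders retrieved chunks by similarity and labels each with the citation notation the system prompt asks for. `RetrievedChunk`, `build_rag_prompt`, and the abbreviated `SYSTEM_PROMPT` string are hypothetical names, not the production implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Abbreviated stand-in for the system prompt shown above.
SYSTEM_PROMPT = (
    "You are a legal document analysis assistant. Answer questions using ONLY "
    "the provided document context and cite every factual claim."
)


@dataclass
class RetrievedChunk:
    filename: str
    chunk_index: int
    content: str
    similarity: float  # cosine similarity to the query, higher is closer


def build_rag_prompt(
    chunks: List[RetrievedChunk],
    history: List[Tuple[str, str]],  # (role, text) pairs, oldest first
    question: str,
) -> str:
    # Most similar chunks first, each prefixed with its citation label.
    ordered = sorted(chunks, key=lambda c: c.similarity, reverse=True)
    context = "\n\n".join(
        f"[Document: {c.filename}, Chunk: {c.chunk_index}]\n{c.content}"
        for c in ordered
    )
    history_text = "\n".join(f"{role}: {text}" for role, text in history)
    return (
        f"SYSTEM:\n{SYSTEM_PROMPT}\n\n"
        f"CONTEXT:\n{context}\n\n"
        f"CONVERSATION HISTORY:\n{history_text}\n\n"
        f"USER QUERY:\n{question}"
    )
```

Labeling each chunk with the same notation the model is asked to emit makes citations easy to resolve back to a chunk afterwards.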

4.2 Document Summary Pattern

During ingestion, each document receives an LLM-generated summary stored in documents.summary. This is a one-shot extraction prompt (not RAG), applied to the full document text:

SYSTEM:
Extract a structured summary from the following legal/business document. 
Identify: document type, all named parties, effective date, key terms, 
and 3-5 material obligations or provisions.

DOCUMENT TEXT:
{full_text}

OUTPUT FORMAT: JSON with fields: type, parties[], effective_date, key_terms[], summary_prose

The structured output populates documents.document_type, documents.parties, documents.key_terms, and documents.summary atomically.
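A minimal sketch of the mapping from the model's JSON output to the four document columns, assuming the output format above. The function name `summary_to_document_fields` is hypothetical, and the set of valid types simply mirrors the taxonomy in section 2.3.

```python
import json

REQUIRED_FIELDS = ("type", "parties", "effective_date", "key_terms", "summary_prose")
VALID_TYPES = {"equity", "ip_assignment", "financial", "healthcare", "agreement"}


def summary_to_document_fields(raw_json: str) -> dict:
    """Validate the LLM summary output and map it to documents columns."""
    data = json.loads(raw_json)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"summary output is missing fields: {missing}")
    if data["type"] not in VALID_TYPES:
        raise ValueError(f"unknown document type: {data['type']!r}")
    # All four values are returned together so the caller can write them
    # in a single UPDATE, which keeps the row consistent.
    return {
        "document_type": data["type"],
        "parties": list(data["parties"]),
        "key_terms": list(data["key_terms"]),
        "summary": data["summary_prose"],
    }
```

Rejecting incomplete output before any write is one way to keep the update all-or-nothing.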

4.3 Timeline Extraction Pattern

The timeline_events table (indexed by idx_timeline_type and idx_timeline_date) is populated by a structured extraction prompt:

SYSTEM:
You are extracting a chronological event list from a legal document.
For each event you identify, output: date (ISO 8601), event_type 
(equity_change|agreement_signed|ip_assignment|other), description, 
parties_involved[], and impact (critical|high|medium|low).

DOCUMENT:
{full_text}

Return a JSON array of events. If a date is approximate, use the first day 
of the applicable month or year.
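The extraction output could be validated before insertion along these lines. This is a sketch under assumptions: the function names are hypothetical, and the handling of partial dates such as "2024-03" or "2024" is a defensive fallback for cases where the model does not follow the first-day rule itself.

```python
import json
import re
from datetime import date

EVENT_TYPES = {"equity_change", "agreement_signed", "ip_assignment", "other"}
IMPACTS = {"critical", "high", "medium", "low"}


def normalize_date(value: str) -> str:
    """Pad year-only or year-month values to the first day, then validate."""
    if re.fullmatch(r"\d{4}", value):
        value += "-01-01"
    elif re.fullmatch(r"\d{4}-\d{2}", value):
        value += "-01"
    return date.fromisoformat(value).isoformat()  # raises ValueError if invalid


def parse_timeline_events(raw_json: str) -> list:
    events = []
    for item in json.loads(raw_json):
        if item["event_type"] not in EVENT_TYPES:
            raise ValueError(f"unknown event_type: {item['event_type']!r}")
        if item["impact"] not in IMPACTS:
            raise ValueError(f"unknown impact: {item['impact']!r}")
        events.append({
            "event_date": normalize_date(item["date"]),
            "event_type": item["event_type"],
            "description": item["description"],
            "parties_involved": list(item["parties_involved"]),
            "impact": item["impact"],
        })
    # ISO 8601 strings sort chronologically, so a plain sort is enough.
    return sorted(events, key=lambda e: e["event_date"])
```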

4.4 Strategic Advisor Pattern

The advisor pipeline (product/app/api/advisor/route.ts, /api/advisor/strategic-options/route.ts, /api/advisor/risk-memo/route.ts) uses a multi-stage pattern documented in docs/archive/completed_action_plans/modal_migration_optimization/03_single_stage_strategic_options.md:

STAGE 1 — Risk Identification:
Given the following document findings and timeline, identify the top risks 
facing this organization. Classify each risk by: category, severity, 
likelihood, and affected parties.

STAGE 2 — Strategic Options Generation:
For each identified risk, generate 2-3 strategic options. For each option, 
specify: action, owner, timeline, cost estimate, and expected outcome.

STAGE 3 — Risk Memo Synthesis:
Synthesize the risks and strategic options into an executive risk memo 
suitable for a board or investor audience.

The batch endpoint (/api/advisor/batch/route.ts) and queue processor (/api/advisor/process-queue/route.ts) manage parallel execution of this pipeline across multiple documents.
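One way to picture the three-stage chaining is the sketch below, where each stage sees the original findings plus every earlier stage's output. The `llm` callable, the stage keys, and the shortened instructions are illustrative assumptions; the actual routes may structure this differently.

```python
from typing import Callable, Dict

# (stage key, instruction) pairs in execution order; instructions are
# shortened versions of the stage prompts above.
STAGES = [
    ("risk_identification", "Identify the top risks facing this organization."),
    ("strategic_options", "For each identified risk, generate 2-3 strategic options."),
    ("risk_memo", "Synthesize the risks and options into an executive risk memo."),
]


def run_advisor_pipeline(findings: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run the stages in order, feeding earlier outputs into later prompts."""
    outputs: Dict[str, str] = {}
    context = f"FINDINGS:\n{findings}"
    for name, instruction in STAGES:
        result = llm(f"{instruction}\n\n{context}")
        outputs[name] = result
        # Accumulate so stage 3 can see both the risks and the options.
        context += f"\n\n{name.upper()} OUTPUT:\n{result}"
    return outputs
```

Taking the model call as a plain callable keeps the sketch independent of whether the request goes to Modal or Ollama.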

4.5 Due Diligence Matching Pattern

The DD match panel (product/app/(app)/[type]/[id]/views/dd-match-panel.tsx) uses a checklist-grounded prompt:

SYSTEM:
You are performing due diligence against the following checklist requirements.
For each requirement, assess: present (yes/no/partial), evidence (chunk citation), 
and gap_severity (critical|major|minor|none).

CHECKLIST:
{business_type_checklist}

DOCUMENT CONTEXT:
{retrieved_chunks}

4.6 Clause Comparison Pattern

The clause comparison endpoints (/api/clauses/compare/route.ts, /api/clauses/compare/[id]/route.ts) apply a diff-style prompt:

Compare the following two clause texts. Identify: 
(1) substantive differences in obligations or rights,
(2) missing protections present in Clause A but absent in Clause B,
(3) risk delta — which version is more favorable to {party} and why.

CLAUSE A: {clause_a_text}
CLAUSE B: {clause_b_text}

4.7 Question Templates

Pre-built question templates (product/app/(app)/context/components/question-templates.tsx) lower the prompt engineering burden for end users by providing domain-appropriate starting queries:

"What are the key obligations of [Party] under this agreement?"
"Identify all equity transfer events and their effective dates."
"Are there any non-compete clauses? What are their terms?"
"Summarize the IP assignment provisions across all uploaded documents."

These templates are not static strings — they are parameterized by the current document scope and active parties, injecting entity context before the user even types.

5. Memory Management

5.1 Session Context Window

DataZoom manages LLM context windows explicitly because the Qwen2.5:32B model (deployed via Modal) and local Ollama models have bounded context sizes. The conversation thread in product/app/(app)/context/components/conversation-panel.tsx does not naively append all prior messages. Instead, the following strategy applies:

Recent-first truncation. The context assembly includes the most recent N messages from the thread, where N is bounded by a token budget. Older messages are excluded once the budget is exceeded.

Decision log persistence. The product/app/(app)/context/components/decision-log.tsx and save-decision-form.tsx components allow users to explicitly persist key decisions from a conversation. These saved decisions are injected into future context packets as high-priority facts, surviving beyond the rolling message window. This is the primary cross-session memory mechanism.

Suggested follow-ups. After each LLM response, the product/app/(app)/context/components/suggested-followups.tsx component surfaces continuations. These are generated with awareness of the current thread context and help maintain coherent inquiry chains without requiring the user to re-establish context manually.
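The recent-first truncation described above can be sketched as follows. The four-characters-per-token estimate is a crude assumption used only for illustration (a real implementation would use the model's tokenizer), and `select_recent_messages` is a hypothetical name.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_recent_messages(
    messages: List[Dict[str, str]], token_budget: int
) -> List[Dict[str, str]]:
    """Keep the newest messages that fit the budget, in chronological order."""
    selected: List[Dict[str, str]] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if used + cost > token_budget:
            break  # everything older than this point is dropped
        selected.append(message)
        used += cost
    selected.reverse()  # restore oldest-to-newest order for the prompt
    return selected
```

Stopping at the first message that does not fit, rather than skipping it and continuing, keeps the retained history contiguous.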

5.2 Cross-Session Persistence

Mechanism	Storage	Persistence	Notes
Conversation threads	`conversations` table	Indefinite	Named, retrievable by thread selector
Saved decisions	`decision_log` (inferred)	Indefinite	Explicitly preserved by user action
Document summaries	`documents.summary`	Indefinite	Generated once at ingestion; reused in all subsequent context
Timeline events	`timeline_events` table	Indefinite	Extracted once; available as structured facts in all queries
Chunk embeddings	`document_chunks.embedding`	Indefinite	Persisted vectors; no re-embedding needed per query

5.3 Summarization Strategy

DataZoom avoids re-summarizing documents on every query. The documents.summary field is populated once during the ingestion pipeline and reused. This summary, combined with key_terms[] and parties[], constitutes a compressed document representation that can be included in context without loading the entire full_text field.

For long conversation threads that exceed the context window, the system relies on the Decision Log as a manual summarization mechanism rather than automatic thread compression. This trades automation for accuracy — users confirm what is worth preserving rather than trusting an automated summarizer to select salient points.

5.4 Embedding Lifecycle

Embeddings are generated by sentence-transformers (all-MiniLM-L6-v2) during document ingestion by the worker service (docker/Dockerfile.worker). The 384-dimensional vectors are stored in document_chunks.embedding and indexed via ivfflat. Re-embedding is triggered only when document content changes, controlled by the analysis regeneration endpoint (/api/analysis/regenerate/route.ts). Embedding status monitoring was historically tracked in docs/archive/check_embedding_status.sql.

6. Context Density Strategies

Context density — the ratio of useful signal to total tokens in an LLM context packet — is a first-class concern in DataZoom's architecture, given the cost and latency of operating Qwen2.5:32B via Modal.

6.1 Chunk-Level Density

Document chunks in document_chunks are sized to balance retrieval precision against context efficiency, with a target range of 128–512 tokens. Chunks that are too short lack semantic coherence; chunks that are too long dilute the embedding's specificity. Note that all-MiniLM-L6-v2 truncates input longer than 256 word pieces by default, so text beyond that limit in a longer chunk does not influence its embedding. The chunk_index integer tracks position within the document, enabling the retrieval layer to include adjacent chunks (context window expansion) when a single chunk is insufficient.

The product/lib/__tests__/rag-retrieval-enhanced.test.ts and product/lib/__tests__/rag-retrieval.test.ts test files validate that retrieval returns the highest-signal chunks for a given query, not merely the most similar ones by raw cosine distance.
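Adjacent-chunk expansion via chunk_index could be sketched like this. The function name, the `window` parameter, and the in-memory dictionary standing in for the database lookup are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

ChunkKey = Tuple[str, int]  # (document_id, chunk_index)


def expand_with_neighbors(
    hits: List[ChunkKey],
    chunks_by_position: Dict[ChunkKey, str],
    window: int = 1,
) -> List[ChunkKey]:
    """Add up to `window` chunks on each side of every hit, without duplicates."""
    seen = set()
    expanded: List[ChunkKey] = []
    for document_id, chunk_index in hits:
        for neighbor in range(chunk_index - window, chunk_index + window + 1):
            key = (document_id, neighbor)
            # Skip positions past either end of the document and repeats.
            if key in chunks_by_position and key not in seen:
                seen.add(key)
                expanded.append(key)
    return expanded
```

For example, hits on chunks 2 and 3 of the same document with `window=1` yield chunks 1 through 4, each included once.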

6.2 Metadata Pre-filtering

Before cosine similarity search, the vector query is pre-filtered using structured metadata:

Document type filter — If the query context implies a specific document type (e.g., cap table questions filter to document_type = 'equity'), only chunks from matching documents are searched.
Party filter — Queries mentioning a specific party name can pre-filter on documents.parties using the GIN index, reducing the search space before vector comparison.
Date range filter — Timeline-anchored queries can restrict to documents with effective_date within a range.

This pre-filtering is a critical density strategy: it ensures the top-K retrieved chunks are drawn from the most relevant document subset rather than the entire corpus, preventing the context window from being filled with topically adjacent but contextually irrelevant content.
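The effect of filtering before ranking can be shown with a small in-memory sketch. In production the filter and similarity search run inside PostgreSQL; here plain Python lists stand in for the tables, and `filtered_top_k` and the chunk dictionary layout are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_top_k(
    query_vec: Sequence[float],
    chunks: List[dict],
    top_k: int = 3,
    document_type: Optional[str] = None,
    party: Optional[str] = None,
) -> List[dict]:
    """Apply metadata filters first, then rank the survivors by similarity."""
    candidates = [
        c for c in chunks
        if (document_type is None or c["document_type"] == document_type)
        and (party is None or party in c["parties"])
    ]
    candidates.sort(
        key=lambda c: cosine_similarity(query_vec, c["embedding"]), reverse=True
    )
    return candidates[:top_k]
```

A chunk from the wrong document type is excluded even if it is the closest vector in the corpus, which is the point of filtering first.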

6.3 Structured Fact Injection vs. Raw Text

Where a structured fact is available (e.g., a cap table transaction record, a timeline event with explicit event_date and parties_involved), DataZoom injects the structured record rather than the raw chunk text. A structured fact like:

{
  "event_date": "2024-03-15",
  "event_type": "equity_change",
  "description": "Series A closing: 2,000,000 shares issued to Acme Ventures",
  "parties_involved": ["Acme Ventures", "DataZoom Inc."],
  "impact": "critical"
}

...consumes far fewer tokens than the equivalent clause in the original agreement, while conveying the same factual content. The LLM's task becomes synthesis and explanation rather than extraction, improving both efficiency and accuracy.

6.4 Summary-First Context

When a query is broad (e.g., "Give me an overview of all documents"), the context packet leads with documents.summary fields rather than raw chunks. This allows the LLM to synthesize across many documents without exhausting the context window on full-text retrieval. The strategic overview component (product/app/(app)/context/components/strategic-overview.tsx) specifically uses this pattern.

6.5 Model Routing for Density Efficiency

The model router (exercised by product/lib/__tests__/model-router.test.ts) selects between the cloud LLM (Modal/Qwen2.5:32B) and local Ollama based on query complexity. Simple factual queries (date lookups, party identification) are routed to lighter local models with smaller context requirements. Complex multi-document synthesis or strategic analysis is routed to the 32B cloud model. This ensures context density optimization is paired with appropriate model capacity.
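A routing decision of this kind might be sketched as below. The complexity heuristics (keyword hints, word count, document count) and the `"cloud"` / `"local"` labels are invented for illustration; the source does not specify how the real router scores complexity.

```python
# Hypothetical keywords that suggest synthesis rather than a simple lookup.
COMPLEX_HINTS = ("compare", "summarize", "across", "strategic", "risk")


def route_model(query: str, document_count: int, gpu_available: bool) -> str:
    """Return 'cloud' for complex work, 'local' for simple lookups."""
    lowered = query.lower()
    is_complex = (
        document_count > 1                       # multi-document synthesis
        or len(lowered.split()) > 25             # long, compound questions
        or any(hint in lowered for hint in COMPLEX_HINTS)
    )
    if is_complex:
        return "cloud"
    # Simple queries stay local only when a local GPU can serve them.
    return "local" if gpu_available else "cloud"
```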

7. Agent Context Protocols

DataZoom operates multiple specialized agents, each receiving a different context protocol.

7.1 Agent Inventory

Agent ID	Agent Name	Entry Point	Context Protocol
AGT-001	Document Ingestion Agent	`docker/Dockerfile.worker`	Full document text + no prior context
AGT-002	Embedding Agent	Worker service, GPU profile	Raw chunk text → vector output
AGT-003	Chat Agent	`/api/context/` (inferred)	Thread history + retrieved chunks + structured facts
AGT-004	Advisor Agent	`/api/advisor/route.ts`	Document summaries + findings + checklist gaps
AGT-005	Cap Table Extraction Agent	`/api/cap-table/extract/route.ts`	Equity document chunks + extraction schema
AGT-006	Timeline Extraction Agent	`timeline_events` pipeline	Full document text + temporal extraction prompt
AGT-007	Clause Comparison Agent	`/api/clauses/compare/route.ts`	Two clause texts + party context
AGT-008	Due Diligence Agent	`/api/business-types/[typeKey]/checklist`	Checklist + document chunks
AGT-009	Analysis/Party Agent	`/api/analysis/party/route.ts`	Party-filtered document set + analysis prompt

7.2 Context Handoff Between Agents

Ingestion → Embedding (AGT-001 → AGT-002). The ingestion worker processes a raw uploaded file into cleaned text, splits it into chunks, and writes chunk records to document_chunks with embedding = NULL. The embedding agent reads unembedded chunks and populates the vector column. This is a one-way handoff via database state.

Ingestion → Chat (AGT-001 → AGT-003). Once embeddings are populated, the chat agent can retrieve chunks via similarity search. No direct handoff occurs — the database acts as the shared memory. The documents.summary generated during ingestion is immediately available to the chat agent.

Chat → Advisor (AGT-003 → AGT-004). When a user escalates from chat to advisory analysis, the conversation thread context (key decisions, identified risks from prior chat) can be included in the advisor's context packet. The strategic input component (product/app/(app)/context/components/strategic-input.tsx) provides the UI surface for this escalation, allowing users to annotate what from the chat session is relevant to carry forward.

Cap Table Extraction → Review Workflow (AGT-005 → human). The extraction agent (/api/cap-table/extract/route.ts) outputs candidate transactions that enter a human review queue (/api/cap-table/review/route.ts). The agent's context is preserved in the candidate record — the reviewer sees the source document chunk, the extracted fields, and the agent's confidence signal. Approved records (/api/cap-table/review/[id]/approve/route.ts) become canonical cap_table/transactions entries.
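The Ingestion → Embedding handoff via database state, described at the start of this section, can be illustrated with a small runnable sketch. SQLite stands in for Supabase PostgreSQL, the embedding is stored as JSON text instead of VECTOR(384), and the function names and the `embed` callable are assumptions.

```python
import json
import sqlite3
from typing import Callable, List


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE document_chunks ("
        "id INTEGER PRIMARY KEY, document_id TEXT, chunk_index INTEGER, "
        "content TEXT, embedding TEXT)"  # TEXT stands in for VECTOR(384)
    )


def ingest_chunks(conn: sqlite3.Connection, document_id: str, chunks: List[str]) -> None:
    """Ingestion side: write chunk rows with embedding left NULL."""
    conn.executemany(
        "INSERT INTO document_chunks (document_id, chunk_index, content, embedding) "
        "VALUES (?, ?, ?, NULL)",
        [(document_id, i, text) for i, text in enumerate(chunks)],
    )


def embed_pending(conn: sqlite3.Connection, embed: Callable[[str], List[float]]) -> int:
    """Embedding side: fill in every row whose embedding is still NULL."""
    rows = conn.execute(
        "SELECT id, content FROM document_chunks WHERE embedding IS NULL"
    ).fetchall()
    for chunk_id, content in rows:
        conn.execute(
            "UPDATE document_chunks SET embedding = ? WHERE id = ?",
            (json.dumps(embed(content)), chunk_id),
        )
    return len(rows)  # number of chunks embedded in this pass
```

Because the only coupling is the NULL column, the embedding pass is safe to re-run: a second call finds no pending rows and does nothing.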

7.3 Shared Context Infrastructure

All agents share the following context infrastructure:

Supabase PostgreSQL — Authoritative state store; the single source of truth for all persistent context
pgvector index — Shared semantic retrieval layer available to any agent that needs document-grounded context
Clerk organization scope — Every agent call is bound to an org_id, ensuring no cross-tenant context contamination
BullMQ + Redis (Upstash) — The async job queue (docker/cloud-worker/) coordinates agent execution order and passes job payloads (which include document IDs and context pointers) between pipeline stages

7.4 Cloud vs. Local Agent Context

The docker-compose.yml defines two GPU profiles:

full profile — Runs ollama (local LLM), llm proxy (port 8001), embed service, and reranker locally. Agents in this mode operate with lower latency but constrained model capacity.
gpu profile — Runs embed + reranker + ollama. Cloud LLM calls go to Modal's Qwen2.5:32B endpoint. The admin pipeline routing stats endpoint (/api/admin/routing-stats/route.ts) tracks which agents are dispatching to which backend.

The cloud worker (docker/cloud-worker/Dockerfile, deployed via Fly.io) handles asynchronous heavy analysis tasks (advisor pipeline, batch extraction) independently from the web process, preventing context-heavy operations from blocking the interactive chat experience.

8. RAG Pipeline as Context Infrastructure

The RAG system is the central context infrastructure for DataZoom. docs/RAG_SYSTEM.md is the canonical reference. The following summarizes the pipeline's role as context management:

8.1 Ingestion Phase (Context Construction)

Upload → Storage (Supabase Storage)
      → Text Extraction (Python worker)
      → Chunking (fixed-size with overlap)
      → Embedding (all-MiniLM-L6-v2 → 384-dim)
      → document_chunks INSERT (content + embedding + metadata JSONB)
      → documents.summary UPDATE (LLM extraction)
      → timeline_events INSERT (temporal extraction)

Each step builds a persistent context artifact. The metadata JSONB column on document_chunks carries per-chunk signals (page number, section header, chunk type) that the retrieval layer uses for re-ranking.
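Fixed-size chunking with overlap, the third step in the flow above, can be sketched as follows. Whitespace-separated words stand in for model tokens, and the function name and parameters are hypothetical; the actual worker's sizes are not stated in this document.

```python
from typing import List


def chunk_words(text: str, size: int, overlap: int) -> List[dict]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each window advances
    chunks: List[dict] = []
    for start in range(0, len(words), step):
        window = words[start:start + size]
        chunks.append({"chunk_index": len(chunks), "content": " ".join(window)})
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a boundary still appears intact in at least one chunk, at the cost of some duplicated tokens.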

8.2 Retrieval Phase (Context Assembly)

User Query
    → Embed query (all-MiniLM-L6-v2)
    → Pre-filter (document_type, parties[], effective_date)
    → pgvector cosine similarity search (ivfflat, top-K)
    → Re-ranking (if reranker service active)
    → Structured fact augmentation (timeline_events JOIN)
    → Thread history prepend
    → Context packet → LLM

The product/lib/__tests__/rag-retrieval-enhanced.test.ts test covers the full retrieval-plus-reranking path. The standard rag-retrieval.test.ts covers the base similarity search path.

8.3 Reranker Service

The GPU profile in docker-compose.yml includes a reranker service. Reranking applies a cross-encoder model to the top-K cosine-similar chunks, reordering them by relevance to the specific query (not just semantic proximity). This is a high-value context density mechanism: it promotes the most answer-relevant chunks to the top of the context packet, where LLMs tend to pay more attention.
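The reordering step can be sketched with a pluggable scoring function. The token-overlap scorer below is only a stand-in so the example runs without a model; it is not a cross-encoder, and the names `rerank` and `token_overlap_score` are hypothetical.

```python
from typing import Callable, List, Optional


def token_overlap_score(query: str, chunk: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    chunks: List[str],
    score: Callable[[str, str], float] = token_overlap_score,
    top_n: Optional[int] = None,
) -> List[str]:
    """Reorder candidate chunks by a query-specific relevance score."""
    ordered = sorted(chunks, key=lambda chunk: score(query, chunk), reverse=True)
    return ordered if top_n is None else ordered[:top_n]
```

In a real deployment the `score` argument would wrap the reranker service's cross-encoder, which scores each query and chunk pair jointly.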

8.4 Citation System

Every chunk included in a context packet carries a traceable identity: document_id (UUID) → documents.filename and chunk_index. The LLM is instructed to cite using this identity. The citation system (product/lib/__tests__/citation-system.test.ts) validates that:

LLM responses include citation markers for factual claims
Citation markers resolve to real chunks in the database
Cited chunks fall within the organization's document scope

This closes the context loop: users can trace every AI-generated claim back to its source document and chunk, making the context chain auditable.
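The three checks above could be approximated with a sketch like the following, which parses the `[Document: ..., Chunk: ...]` notation and resolves each citation against the set of chunks in the organization's scope. The regular expression, function names, and the in-memory set standing in for a database lookup are assumptions; filenames containing commas would need a stricter format.

```python
import re
from typing import List, Set, Tuple

CITATION_RE = re.compile(
    r"\[Document:\s*(?P<filename>[^,\]]+),\s*Chunk:\s*(?P<index>\d+)\]"
)


def extract_citations(response: str) -> List[Tuple[str, int]]:
    """Return (filename, chunk_index) pairs found in the response text."""
    return [
        (match.group("filename").strip(), int(match.group("index")))
        for match in CITATION_RE.finditer(response)
    ]


def validate_citations(response: str, org_chunks: Set[Tuple[str, int]]) -> dict:
    """Check that citations exist and resolve within the org's chunk set."""
    citations = extract_citations(response)
    unresolved = [c for c in citations if c not in org_chunks]
    return {
        "has_citations": bool(citations),
        "unresolved": unresolved,
        # Valid only if at least one citation exists and all of them resolve.
        "valid": bool(citations) and not unresolved,
    }
```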

9. Cross-Document ID Reference Map

The following tables map every major system component to its ID in the DataZoom taxonomy, enabling cross-reference from any documentation node.

9.1 Business Requirements

ID	Requirement	Owner Component
BR-001	Multi-tenant context isolation	Clerk auth + Supabase RLS
BR-002	Every AI response must be source-cited	Citation system (`citation-system.test.ts`)
BR-003	Context must persist across sessions	`conversations`, `decision_log` tables
BR-004	Document analysis must support 5 document types	`documents.document_type` controlled vocab
BR-005	Cap table data requires human review before persistence	Review workflow (`/api/cap-table/review/`)
BR-006	LLM responses must be grounded in uploaded documents only	RAG retrieval + system prompt constraint

9.2 Technical Components

ID	Component	Path
TECH-001	Vector similarity index	`CREATE INDEX ON document_chunks USING ivfflat (embedding vector_cosine_ops)`
TECH-002	Embedding model	`sentence-transformers/all-MiniLM-L6-v2` (384 dimensions)
TECH-003	Primary LLM (cloud)	Modal endpoint, `Qwen2.5:32B`
TECH-004	Primary LLM (local)	Ollama, port `11434`, `docker-compose.yml`
TECH-005	LLM proxy service	Port `8001`, `ghcr.io/midwestco/datazoom-base:latest`
TECH-006	Job queue	BullMQ + Redis (Upstash), TLS via `ssl_cert_reqs=