Context engine
Last updated 5/3/2026
DataZoom Context Engine
Document ID: TECH-CTX-001
Version: 1.0
Product: DataZoom (midwestco/datazoom)
Classification: Internal Engineering - AI Systems Architecture
Table of Contents
- Context Architecture Overview
- Domain Profile
- Knowledge Graph Structure
- Prompt Engineering Patterns
- Memory Management
- Context Density Strategies
- Agent Context Protocols
- RAG Pipeline as Context Infrastructure
- Cross-Document ID Reference Map
1. Context Architecture Overview
DataZoom manages AI context across three distinct layers: vector-indexed document memory (pgvector), structured relational state (PostgreSQL + Supabase), and session-scoped conversation threads stored in the conversations and messages tables. These layers are not isolated — they compose at query time into a unified context payload that is passed to the LLM.
1.1 Layer Map
┌────────────────────────────────────────────────────────────────┐
│ LAYER 1 — Session Context (Ephemeral) │
│ Tables: conversations, messages │
│ Scope: Per-user, per-organization, per-thread │
│ Retrieval: Direct DB lookup by conversation_id │
└────────────────────────┬───────────────────────────────────────┘
│ combined at query time
┌────────────────────────▼───────────────────────────────────────┐
│ LAYER 2 — Semantic Document Memory (Persistent) │
│ Tables: document_chunks (embedding VECTOR(384)) │
│ Index: ivfflat, cosine ops, 100 lists │
│ Retrieval: pgvector similarity search via /api/.../query │
└────────────────────────┬───────────────────────────────────────┘
│ filtered by org + document scope
┌────────────────────────▼───────────────────────────────────────┐
│ LAYER 3 — Structured Knowledge (Persistent) │
│ Tables: documents, timeline_events, cap_table transactions │
│ Retrieval: SQL join, metadata filter, GIN index on parties[] │
└────────────────────────────────────────────────────────────────┘
1.2 Context Assembly Flow
When a user submits a query through the chat interface (product/app/(app)/context/page.tsx), the following sequence assembles context before any LLM call is made:
- Thread resolution: The active conversation thread is identified via the thread selector (product/app/(app)/context/components/thread-selector.tsx). Prior messages in the thread constitute the short-term conversational context.
- Semantic retrieval: The query is embedded using sentence-transformers (all-MiniLM-L6-v2, 384-dimensional vectors). The embedding is used to perform a cosine similarity search against document_chunks.embedding within the organization's document scope.
- Structured fact injection: For queries involving dates, parties, or equity events, the timeline_events table and the documents.parties GIN index augment the semantic results with structured facts. This reduces the risk of the model hallucinating facts that are already on record.
- Context packet construction: Retrieved chunks, structured facts, and the conversation thread are concatenated into a single context payload. The model router (covered by product/lib/__tests__/model-router.test.ts) selects the appropriate LLM backend (Modal/Qwen2.5:32B for cloud; Ollama locally) based on query complexity and GPU availability.
- LLM dispatch: The assembled packet is sent to either the Modal cloud endpoint or the local Ollama service (port 11434). The LLM proxy service runs on port 8001 inside the datazoom-base container and mediates all model calls.
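The first four steps above can be sketched as a small orchestration function. This is only an illustration under stated assumptions: `ContextPacket`, `assemble_context`, and the injected callables are hypothetical names, not identifiers from the DataZoom codebase, and the dispatch step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ContextPacket:
    """Everything handed to the LLM for one query (hypothetical shape)."""
    thread_messages: List[str]
    chunks: List[str]
    facts: List[str]
    backend: str


def assemble_context(
    query: str,
    load_thread: Callable[[], List[str]],
    retrieve_chunks: Callable[[str], List[str]],
    lookup_facts: Callable[[str], List[str]],
    pick_backend: Callable[[str], str],
) -> ContextPacket:
    # Step 1: thread resolution (short-term conversational context).
    messages = load_thread()
    # Step 2: semantic retrieval over the organization's chunks.
    chunks = retrieve_chunks(query)
    # Step 3: structured fact injection from timeline and party records.
    facts = lookup_facts(query)
    # Step 4: packet construction plus backend selection.
    return ContextPacket(messages, chunks, facts, pick_backend(query))


# Stub callables stand in for the database, pgvector, and the model router.
packet = assemble_context(
    "When did the Series A close?",
    load_thread=lambda: ["user: tell me about the Series A"],
    retrieve_chunks=lambda q: ["...Series A closing on March 15, 2024..."],
    lookup_facts=lambda q: ['{"event_date": "2024-03-15", "event_type": "equity_change"}'],
    pick_backend=lambda q: "ollama",
)
```

Passing the stages in as callables keeps the sketch testable without a database; the real system presumably wires these to Supabase and the LLM proxy.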
1.3 Tenant Isolation in Context
Every context retrieval operation is scoped to the authenticated organization via Clerk auth. The /api/clerk/proxy/route.ts endpoint validates org membership before any document or chunk access. As a result, two tenants with identically named documents never receive each other's context: the pgvector query filters on organization_id before computing similarity.
2. Domain Profile
2.1 Product Domain
DataZoom operates in the legal and business due diligence domain. The primary use case is natural language analysis of legal documents — equity agreements, IP assignments, healthcare contracts, financial instruments — to surface facts, risks, and strategic options without requiring legal expertise from the user.
2.2 Key Entities
| Entity ID | Entity Name | Storage Location | Description |
|---|---|---|---|
| ENT-001 | Document | documents table | Root entity. Carries document_type, parties[], key_terms[], summary, full_text |
| ENT-002 | Document Chunk | document_chunks table | Segmented unit of a document with a 384-dim embedding |
| ENT-003 | Timeline Event | timeline_events table | Extracted date-anchored event with impact classification |
| ENT-004 | Conversation Thread | conversations table | A named context thread scoped to an org and topic |
| ENT-005 | Message | messages table | Individual turn within a conversation, user or assistant |
| ENT-006 | Organization | Clerk + Supabase RLS | Tenant boundary; all entities are org-scoped |
| ENT-007 | Cap Table Transaction | cap_table/transactions | Equity ownership record with approval workflow |
| ENT-008 | Due Diligence Checklist | /api/business-types/[typeKey]/checklist | Business-type-specific document requirement list |
| ENT-009 | Finding | finding-handler.tsx, finding-view.tsx | AI-surfaced risk or notable clause from document analysis |
| ENT-010 | Strategic Option | /api/advisor/strategic-options | LLM-generated strategic recommendation with risk memo |
2.3 Document Type Taxonomy
The documents.document_type field constrains classification to a controlled vocabulary defined in docs/MASTER_DOCUMENT_TYPES_CATALOG.md. Valid values observed in the schema:
- equity: Equity agreements, stock purchase agreements, SAFEs
- ip_assignment: Intellectual property transfer documents
- financial: Financial statements, term sheets
- healthcare: Healthcare-specific compliance and service agreements
- agreement: General commercial contracts
Document type drives checklist selection (/api/business-types/[typeKey]/checklist/route.ts) and influences which prompt template is applied during analysis.
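A controlled vocabulary like this is often enforced in application code with an enumeration. The sketch below assumes only the five values listed above; `DocumentType` and `parse_document_type` are illustrative names, and the authoritative list remains docs/MASTER_DOCUMENT_TYPES_CATALOG.md.

```python
from enum import Enum


class DocumentType(str, Enum):
    """The five document_type values listed in this section."""
    EQUITY = "equity"
    IP_ASSIGNMENT = "ip_assignment"
    FINANCIAL = "financial"
    HEALTHCARE = "healthcare"
    AGREEMENT = "agreement"


def parse_document_type(raw: str) -> DocumentType:
    """Normalize and validate a value; raises ValueError if it is not in the vocabulary."""
    return DocumentType(raw.strip().lower())
```

Rejecting unknown values at the boundary keeps checklist selection and prompt-template lookup from silently falling through on a misspelled type.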
2.4 Domain Terminology Glossary
| Term | Definition in DataZoom Context |
|---|---|
| RAG | Retrieval-Augmented Generation — the core pattern for grounding LLM answers in uploaded documents |
| Chunk | A semantically bounded segment of a document, stored in document_chunks with an embedding |
| Embedding | A 384-dimensional float vector produced by all-MiniLM-L6-v2 representing semantic content |
| Citation | A traceable reference linking an LLM answer back to a specific chunk; enforced by citation-system.test.ts |
| Due Diligence (DD) | Structured review process using business-type-specific checklists |
| Finding | An AI-identified risk, gap, or notable clause surfaced during document analysis |
| Cap Table | Capitalization table tracking equity ownership; managed through a dedicated extraction and review pipeline |
| Thread | A named conversation context scoping related questions to a topic or document set |
| Strategic Option | An LLM-generated recommendation produced by the advisor pipeline (/api/advisor/strategic-options) |
| Risk Memo | Formal risk summary generated by /api/advisor/risk-memo |
| ivfflat | PostgreSQL index type (Inverted File Flat) used for approximate nearest-neighbor vector search |
3. Knowledge Graph Structure
DataZoom's knowledge graph is implicit — encoded in relational foreign keys, GIN-indexed arrays, and vector similarity — rather than a dedicated graph database. The following describes how documents, IDs, and cross-references form a queryable knowledge structure.
3.1 Entity Relationship Graph
Organization (Clerk org_id)
│
├── Document [ENT-001] (UUID)
│ ├── document_type (controlled vocab)
│ ├── parties[] (GIN-indexed TEXT[])
│ ├── key_terms[] (GIN-indexed TEXT[])
│ ├── summary (LLM-generated)
│ │
│ ├── Document Chunk [ENT-002] (UUID, FK → document_id)
│ │ ├── content (TEXT)
│ │ ├── embedding (VECTOR(384))
│ │ └── metadata (JSONB)
│ │
│ └── Timeline Event [ENT-003] (UUID, FK → document_id)
│ ├── event_date (DATE)
│ ├── event_type (controlled vocab)
│ ├── parties_involved (TEXT[])
│ └── impact (critical|high|medium|low)
│
├── Conversation Thread [ENT-004]
│ └── Message [ENT-005] (ordered turns)
│
├── Cap Table Transaction [ENT-007]
│ └── Review Record (approve/reject workflow)
│
└── Due Diligence Checklist [ENT-008]
└── Finding [ENT-009] (linked to document chunks)
3.2 Cross-Reference Mechanisms
Party-based cross-document linking. The documents.parties field is a GIN-indexed TEXT[]. A query like WHERE parties && ARRAY['Acme Corp'] retrieves all documents mentioning a given party, enabling cross-document analysis of a single entity's obligations and history without an explicit graph join.
Date-anchored event correlation. The timeline_events table indexed by event_date allows temporal correlation across documents. The idx_timeline_type index supports filtering by event_type, enabling queries such as "all equity changes in Q1 2024 across all uploaded documents."
Semantic proximity graph (implicit). Documents whose chunks have high cosine similarity to a query vector form an implicit neighborhood. The ivfflat index with 100 lists provides approximate retrieval across this neighborhood at query time. This is the primary mechanism for multi-document synthesis.
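For intuition, the exact computation that the ivfflat index approximates is a cosine-similarity ranking over all candidate vectors. The pure-Python sketch below uses toy 2-dimensional vectors instead of 384-dimensional embeddings; `cosine_similarity` and `top_k` are illustrative helpers, not project functions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], chunks: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Exact nearest neighbors: score every chunk, keep the k most similar."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


chunks = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
nearest = top_k([1.0, 0.1], chunks, k=2)  # chunk "a" ranks first, then "b"
```

An ivfflat index avoids scoring every row by probing only some of its lists, which trades a small amount of recall for speed.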
Checklist-to-document linkage. The /api/business-types/[typeKey]/checklist endpoint maps a typeKey (e.g., equity, healthcare) to a structured requirement list. Findings (ENT-009) produced during analysis are linked back to specific document chunks via the citation system (tested in product/lib/__tests__/citation-system.test.ts), completing the chain from checklist requirement → document evidence → cited finding.
3.3 Documentation Knowledge Graph
The docs/ directory itself constitutes a human-readable knowledge graph with structured cross-references. Key nodes:
| Node | Path | Function |
|---|---|---|
| Documentation Index | docs/DOCUMENTATION_INDEX.md | Root index; entry point for all documentation |
| RAG System | docs/RAG_SYSTEM.md | Canonical reference for retrieval architecture |
| AI Services | docs/AI_SERVICES.md | Model configuration, Modal vs. Ollama routing |
| Master Document Types | docs/MASTER_DOCUMENT_TYPES_CATALOG.md | Controlled vocabulary for document classification |
| Action Plans | docs/action_plans/ | Structured implementation plans with status tracking |
| Archive | docs/archive/ | Superseded documents preserved for historical context |
Action plans follow a numbered sequence (00_, 01_, ...) within each feature area (e.g., docs/cap-table/action_plans/, docs/activity_page/action_plans/) and cross-reference each other via START_HERE.md files that serve as context entrypoints for any agent beginning work in that domain.
4. Prompt Engineering Patterns
4.1 RAG Grounding Pattern
The foundational prompt pattern enforces that every LLM response is grounded in retrieved document content. Based on the citation enforcement work documented in docs/archive/completed_action_plans/document_refinement_v1/07_citation_enforcement_COMPLETE.md, prompts follow this structure:
SYSTEM:
You are a legal document analysis assistant. Answer questions using ONLY the
provided document context. For every factual claim, cite the source chunk using
[Document: {filename}, Chunk: {chunk_index}] notation. If the answer cannot be
found in the provided context, state that explicitly — do not speculate.
CONTEXT:
[Retrieved chunks injected here — ordered by cosine similarity score]
CONVERSATION HISTORY:
[Prior messages in the active thread]
USER QUERY:
{user_question}
Citation enforcement is validated by product/lib/__tests__/citation-system.test.ts, ensuring responses that lack source references are flagged rather than surfaced to users.
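One way to assemble a prompt with this layout is sketched below. The function name and the tuple shape `(filename, chunk_index, text, similarity)` are assumptions made for illustration; the section labels and the citation notation follow the template above.

```python
from typing import List, Tuple

Chunk = Tuple[str, int, str, float]  # (filename, chunk_index, text, similarity)


def build_rag_prompt(system: str, chunks: List[Chunk], history: List[str], question: str) -> str:
    """Lay out SYSTEM / CONTEXT / CONVERSATION HISTORY / USER QUERY in that order."""
    # Highest-similarity chunks first, each labeled so the model can cite it.
    ordered = sorted(chunks, key=lambda c: c[3], reverse=True)
    context = "\n".join(
        f"[Document: {filename}, Chunk: {index}] {text}"
        for filename, index, text, _ in ordered
    )
    return "\n".join([
        "SYSTEM:", system,
        "CONTEXT:", context,
        "CONVERSATION HISTORY:", "\n".join(history),
        "USER QUERY:", question,
    ])
```

Labeling each chunk with the same notation the system prompt asks for makes it straightforward for the model to copy a valid citation.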
4.2 Document Summary Pattern
During ingestion, each document receives an LLM-generated summary stored in documents.summary. This is a one-shot extraction prompt (not RAG), applied to the full document text:
SYSTEM:
Extract a structured summary from the following legal/business document.
Identify: document type, all named parties, effective date, key terms,
and 3-5 material obligations or provisions.
DOCUMENT TEXT:
{full_text}
OUTPUT FORMAT: JSON with fields: type, parties[], effective_date, key_terms[], summary_prose
The structured output populates documents.document_type, documents.parties, documents.key_terms, and documents.summary atomically.
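A minimal sketch of mapping that JSON output onto the four documents columns, assuming the model returns exactly the fields named in OUTPUT FORMAT. The helper name is hypothetical, and since the text above names only four target columns, effective_date is checked for presence but not mapped.

```python
import json
from typing import Any, Dict

REQUIRED_FIELDS = ("type", "parties", "effective_date", "key_terms", "summary_prose")


def summary_to_document_row(llm_output: str) -> Dict[str, Any]:
    """Validate the extraction JSON and rename fields to documents columns."""
    data = json.loads(llm_output)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"summary output missing fields: {missing}")
    return {
        "document_type": data["type"],
        "parties": list(data["parties"]),
        "key_terms": list(data["key_terms"]),
        "summary": data["summary_prose"],
    }
```

Building the whole row before writing it is what allows the four columns to be updated in a single statement.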
4.3 Timeline Extraction Pattern
The timeline pipeline (idx_timeline_type, idx_timeline_date indexes) is fed by a structured extraction prompt:
SYSTEM:
You are extracting a chronological event list from a legal document.
For each event you identify, output: date (ISO 8601), event_type
(equity_change|agreement_signed|ip_assignment|other), description,
parties_involved[], and impact (critical|high|medium|low).
DOCUMENT:
{full_text}
Return a JSON array of events. If a date is approximate, use the first day
of the applicable month or year.
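The approximate-date rule in the prompt can also be enforced after the fact, in case the model returns a bare year or year-month. The normalizer below is a sketch; `normalize_event_date` is a hypothetical helper, not part of the documented pipeline.

```python
import re
from datetime import date


def normalize_event_date(raw: str) -> str:
    """Return an ISO 8601 date, padding approximate dates to the first day of the period."""
    raw = raw.strip()
    if re.fullmatch(r"\d{4}", raw):          # year only
        raw = f"{raw}-01-01"
    elif re.fullmatch(r"\d{4}-\d{2}", raw):  # year and month
        raw = f"{raw}-01"
    # fromisoformat raises ValueError for anything that is not a real calendar date.
    return date.fromisoformat(raw).isoformat()
```

For example, `normalize_event_date("2024-03")` yields `"2024-03-01"`, which is safe to store in a DATE column such as timeline_events.event_date.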
4.4 Strategic Advisor Pattern
The advisor pipeline (/api/advisor/route.ts, /api/advisor/strategic-options/route.ts, /api/advisor/risk-memo/route.ts) uses a multi-stage pattern documented in docs/archive/completed_action_plans/modal_migration_optimization/03_single_stage_strategic_options.md:
STAGE 1 — Risk Identification:
Given the following document findings and timeline, identify the top risks
facing this organization. Classify each risk by: category, severity,
likelihood, and affected parties.
STAGE 2 — Strategic Options Generation:
For each identified risk, generate 2-3 strategic options. For each option,
specify: action, owner, timeline, cost estimate, and expected outcome.
STAGE 3 — Risk Memo Synthesis:
Synthesize the risks and strategic options into an executive risk memo
suitable for a board or investor audience.
The batch endpoint (/api/advisor/batch/route.ts) and queue processor (/api/advisor/process-queue/route.ts) manage parallel execution of this pipeline across multiple documents.
4.5 Due Diligence Matching Pattern
The DD match panel (product/app/(app)/[type]/[id]/views/dd-match-panel.tsx) uses a checklist-grounded prompt:
SYSTEM:
You are performing due diligence against the following checklist requirements.
For each requirement, assess: present (yes/no/partial), evidence (chunk citation),
and gap_severity (critical|major|minor|none).
CHECKLIST:
{business_type_checklist}
DOCUMENT CONTEXT:
{retrieved_chunks}
4.6 Clause Comparison Pattern
The clause comparison endpoints (/api/clauses/compare/route.ts, /api/clauses/compare/[id]/route.ts) apply a diff-style prompt:
Compare the following two clause texts. Identify:
(1) substantive differences in obligations or rights,
(2) missing protections present in Clause A but absent in Clause B,
(3) risk delta — which version is more favorable to {party} and why.
CLAUSE A: {clause_a_text}
CLAUSE B: {clause_b_text}
4.7 Question Templates
Pre-built question templates (product/app/(app)/context/components/question-templates.tsx) lower the prompt engineering burden for end users by providing domain-appropriate starting queries:
- "What are the key obligations of [Party] under this agreement?"
- "Identify all equity transfer events and their effective dates."
- "Are there any non-compete clauses? What are their terms?"
- "Summarize the IP assignment provisions across all uploaded documents."
These templates are not static strings — they are parameterized by the current document scope and active parties, injecting entity context before the user even types.
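Parameterization of this kind can be illustrated with the standard library's string templates. The template text is taken from the first example above; `PARTY_TEMPLATE` and `suggest_questions` are hypothetical names, and the actual component is a React file whose logic is not shown in this document.

```python
from string import Template
from typing import List

PARTY_TEMPLATE = Template("What are the key obligations of $party under this agreement?")


def suggest_questions(active_parties: List[str]) -> List[str]:
    """Fill the party placeholder once per party in the current document scope."""
    return [PARTY_TEMPLATE.substitute(party=name) for name in active_parties]
```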
5. Memory Management
5.1 Session Context Window
DataZoom manages LLM context windows explicitly because the Qwen2.5:32B model (deployed via Modal) and local Ollama models have bounded context sizes. The conversation thread in product/app/(app)/context/components/conversation-panel.tsx does not naively append all prior messages. Instead, the following strategy applies:
Recent-first truncation. The context assembly includes the most recent N messages from the thread, where N is bounded by a token budget. Older messages are excluded once the budget is exceeded.
Decision log persistence. The product/app/(app)/context/components/decision-log.tsx and save-decision-form.tsx components allow users to explicitly persist key decisions from a conversation. These saved decisions are injected into future context packets as high-priority facts, surviving beyond the rolling message window. This is the primary cross-session memory mechanism.
Suggested follow-ups. After each LLM response, the product/app/(app)/context/components/suggested-followups.tsx component surfaces continuations. These are generated with awareness of the current thread context and help maintain coherent inquiry chains without requiring the user to re-establish context manually.
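Recent-first truncation combined with pinned decisions might look like the sketch below. Two assumptions are made for illustration: tokens are approximated by whitespace-separated words (a real tokenizer would replace `count_tokens`), and saved decisions are charged against the same budget before any messages are added.

```python
from typing import Callable, List


def select_thread_context(
    messages: List[str],
    decisions: List[str],
    token_budget: int,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> List[str]:
    """Pin saved decisions, then add the newest messages that fit the remaining budget."""
    remaining = token_budget - sum(count_tokens(d) for d in decisions)
    kept: List[str] = []
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > remaining:
            break  # this message and everything older is dropped
        kept.append(message)
        remaining -= cost
    return decisions + list(reversed(kept))  # restore chronological order
```

Stopping at the first message that does not fit, rather than skipping it and continuing, keeps the retained history contiguous.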
5.2 Cross-Session Persistence
| Mechanism | Storage | Persistence | Notes |
|---|---|---|---|
| Conversation threads | conversations table | Indefinite | Named, retrievable by thread selector |
| Saved decisions | decision_log (inferred) | Indefinite | Explicitly preserved by user action |
| Document summaries | documents.summary | Indefinite | Generated once at ingestion; reused in all subsequent context |
| Timeline events | timeline_events table | Indefinite | Extracted once; available as structured facts in all queries |
| Chunk embeddings | document_chunks.embedding | Indefinite | Persisted vectors; no re-embedding needed per query |
5.3 Summarization Strategy
DataZoom avoids re-summarizing documents on every query. The documents.summary field is populated once during the ingestion pipeline and reused. This summary, combined with key_terms[] and parties[], constitutes a compressed document representation that can be included in context without loading the entire full_text field.
For long conversation threads that exceed the context window, the system relies on the Decision Log as a manual summarization mechanism rather than automatic thread compression. This trades automation for accuracy — users confirm what is worth preserving rather than trusting an automated summarizer to select salient points.
5.4 Embedding Lifecycle
Embeddings are generated by sentence-transformers (all-MiniLM-L6-v2) during document ingestion by the worker service (docker/Dockerfile.worker). The 384-dimensional vectors are stored in document_chunks.embedding and indexed via ivfflat. Re-embedding is triggered only when document content changes, controlled by the analysis regeneration endpoint (/api/analysis/regenerate/route.ts). Embedding status monitoring was historically tracked in docs/archive/check_embedding_status.sql.
6. Context Density Strategies
Context density — the ratio of useful signal to total tokens in an LLM context packet — is a first-class concern in DataZoom's architecture, given the cost and latency of operating Qwen2.5:32B via Modal.
6.1 Chunk-Level Density
Document chunks in document_chunks are sized to balance retrieval precision against context efficiency. The all-MiniLM-L6-v2 model produces meaningful embeddings for chunks of 128–512 tokens. Note that the model truncates input beyond 256 word pieces by default, so chunks near the upper end of that range may be only partially represented in their embedding. Chunks that are too short lack semantic coherence; chunks that are too long dilute the embedding's specificity. The chunk_index integer tracks position within the document, enabling the retrieval layer to include adjacent chunks (context window expansion) when a single chunk is insufficient.
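Adjacent-chunk expansion can be sketched as widening each retrieved chunk_index by a fixed window. The sketch assumes chunk_index is zero-based and contiguous within a document, which this document does not state; `expand_with_neighbors` is an illustrative name.

```python
from typing import Iterable, List


def expand_with_neighbors(hit_indexes: Iterable[int], chunk_count: int, window: int = 1) -> List[int]:
    """Add up to `window` chunks on each side of every hit, clamped to the document."""
    expanded = set()
    for idx in hit_indexes:
        lo = max(0, idx - window)
        hi = min(chunk_count - 1, idx + window)
        expanded.update(range(lo, hi + 1))
    return sorted(expanded)  # de-duplicated, in document order
```

With hits at indexes 2 and 3 in a ten-chunk document, the default window returns `[1, 2, 3, 4]`: overlapping neighborhoods are merged rather than repeated.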
The product/lib/__tests__/rag-retrieval-enhanced.test.ts and product/lib/__tests__/rag-retrieval.test.ts test files validate that retrieval returns the highest-signal chunks for a given query, not merely the most similar ones by raw cosine distance.
6.2 Metadata Pre-filtering
Before cosine similarity search, the vector query is pre-filtered using structured metadata:
- Document type filter: If the query context implies a specific document type (e.g., cap table questions filter to document_type = 'equity'), only chunks from matching documents are searched.
- Party filter: Queries mentioning a specific party name can pre-filter on documents.parties using the GIN index, reducing the search space before vector comparison.
- Date range filter: Timeline-anchored queries can restrict to documents with effective_date within a range.
This pre-filtering is a critical density strategy: it ensures the top-K retrieved chunks are drawn from the most relevant document subset rather than the entire corpus, preventing the context window from being filled with topically adjacent but contextually irrelevant content.
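The three filters compose naturally into a parameterized WHERE clause. The builder below is a sketch: the `d` alias for documents, the placement of organization_id on that table, and the psycopg-style `%(name)s` placeholders are assumptions, while `&&` is PostgreSQL's array-overlap operator, the same one used in the party query in section 3.2.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_prefilter(
    org_id: str,
    document_type: Optional[str] = None,
    parties: Optional[List[str]] = None,
    date_range: Optional[Tuple[str, str]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return a WHERE fragment and its parameters; the tenant scope is always applied."""
    clauses = ["d.organization_id = %(org_id)s"]
    params: Dict[str, Any] = {"org_id": org_id}
    if document_type:
        clauses.append("d.document_type = %(document_type)s")
        params["document_type"] = document_type
    if parties:
        # Array overlap: true when the document names any of the given parties.
        clauses.append("d.parties && %(parties)s::text[]")
        params["parties"] = parties
    if date_range:
        clauses.append("d.effective_date BETWEEN %(start)s AND %(end)s")
        params["start"], params["end"] = date_range
    return " AND ".join(clauses), params
```

The fragment would sit in front of an `ORDER BY embedding <=> ...` similarity ordering, so that only rows passing the metadata filters compete for the top-K slots.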
6.3 Structured Fact Injection vs. Raw Text
Where a structured fact is available (e.g., a cap table transaction record, a timeline event with explicit event_date and parties_involved), DataZoom injects the structured record rather than the raw chunk text. A structured fact like:
{
"event_date": "2024-03-15",
"event_type": "equity_change",
"description": "Series A closing: 2,000,000 shares issued to Acme Ventures",
"parties_involved": ["Acme Ventures", "DataZoom Inc."],
"impact": "critical"
}
...consumes far fewer tokens than the equivalent clause in the original agreement, while conveying the same factual content. The LLM's task becomes synthesis and explanation rather than extraction, improving both efficiency and accuracy.
6.4 Summary-First Context
When a query is broad (e.g., "Give me an overview of all documents"), the context packet leads with documents.summary fields rather than raw chunks. This allows the LLM to synthesize across many documents without exhausting the context window on full-text retrieval. The strategic overview component (product/app/(app)/context/components/strategic-overview.tsx) specifically uses this pattern.
6.5 Model Routing for Density Efficiency
The model router (covered by product/lib/__tests__/model-router.test.ts) selects between the cloud LLM (Modal/Qwen2.5:32B) and local Ollama based on query complexity. Simple factual queries (date lookups, party identification) are routed to lighter local models with smaller context requirements. Complex multi-document synthesis or strategic analysis is routed to the 32B cloud model. This pairs context density optimization with appropriate model capacity.
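The routing rule can be illustrated with a toy heuristic. Everything here is hypothetical: the actual complexity signals, thresholds, and backend identifiers are not given in this document, so the sketch encodes only the stated principle that simple queries stay local and complex ones go to the cloud model.

```python
def route_model(document_count: int, needs_synthesis: bool, cloud_available: bool = True) -> str:
    """Pick a backend label; the labels and the complexity test are illustrative only."""
    complex_query = needs_synthesis or document_count > 1
    if complex_query and cloud_available:
        return "modal:qwen2.5-32b"  # hypothetical label for the cloud model
    return "ollama:local"           # hypothetical label for the local model
```

Falling back to the local model when the cloud endpoint is unavailable is a design choice of this sketch, not a documented behavior.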
7. Agent Context Protocols
DataZoom operates multiple specialized agents, each receiving a different context protocol.
7.1 Agent Inventory
| Agent ID | Agent Name | Entry Point | Context Protocol |
|---|---|---|---|
| AGT-001 | Document Ingestion Agent | docker/Dockerfile.worker | Full document text + no prior context |
| AGT-002 | Embedding Agent | Worker service, GPU profile | Raw chunk text → vector output |
| AGT-003 | Chat Agent | /api/context/ (inferred) | Thread history + retrieved chunks + structured facts |
| AGT-004 | Advisor Agent | /api/advisor/route.ts | Document summaries + findings + checklist gaps |
| AGT-005 | Cap Table Extraction Agent | /api/cap-table/extract/route.ts | Equity document chunks + extraction schema |
| AGT-006 | Timeline Extraction Agent | timeline_events pipeline | Full document text + temporal extraction prompt |
| AGT-007 | Clause Comparison Agent | /api/clauses/compare/route.ts | Two clause texts + party context |
| AGT-008 | Due Diligence Agent | /api/business-types/[typeKey]/checklist | Checklist + document chunks |
| AGT-009 | Analysis/Party Agent | /api/analysis/party/route.ts | Party-filtered document set + analysis prompt |
7.2 Context Handoff Between Agents
Ingestion → Embedding (AGT-001 → AGT-002). The ingestion worker processes a raw uploaded file into cleaned text, splits it into chunks, and writes chunk records to document_chunks with embedding = NULL. The embedding agent reads unembedded chunks and populates the vector column. This is a one-way handoff via database state.
Ingestion → Chat (AGT-001 → AGT-003). Once embeddings are populated, the chat agent can retrieve chunks via similarity search. No direct handoff occurs — the database acts as the shared memory. The documents.summary generated during ingestion is immediately available to the chat agent.
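The database-as-handoff idea can be demonstrated end to end with an in-memory SQLite table standing in for PostgreSQL. This is a simulation under assumptions: the embedding is stored as JSON text rather than VECTOR(384), and `fake_embed` is a placeholder for the real all-MiniLM-L6-v2 call.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document_chunks (id INTEGER PRIMARY KEY, content TEXT, embedding TEXT)")
# Ingestion side: write chunks with embedding left as NULL.
conn.executemany(
    "INSERT INTO document_chunks (content, embedding) VALUES (?, NULL)",
    [("Series A closing terms",), ("IP assignment clause",)],
)


def fake_embed(text: str) -> list:
    # Placeholder for the embedding model; returns a tiny deterministic vector.
    return [float(len(text)), float(text.count(" "))]


def embed_pending(db: sqlite3.Connection) -> int:
    """Embedding side: fill in vectors for rows that ingestion left as NULL."""
    pending = db.execute(
        "SELECT id, content FROM document_chunks WHERE embedding IS NULL"
    ).fetchall()
    for chunk_id, content in pending:
        db.execute(
            "UPDATE document_chunks SET embedding = ? WHERE id = ?",
            (json.dumps(fake_embed(content)), chunk_id),
        )
    return len(pending)


first_pass = embed_pending(conn)   # processes both chunks
second_pass = embed_pending(conn)  # nothing left to do
```

Because the NULL check is the only coordination signal, the embedding step is idempotent: running it again finds no pending rows.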
Chat → Advisor (AGT-003 → AGT-004). When a user escalates from chat to advisory analysis, the conversation thread context (key decisions, identified risks from prior chat) can be included in the advisor's context packet. The strategic input component (product/app/(app)/context/components/strategic-input.tsx) provides the UI surface for this escalation, allowing users to annotate what from the chat session is relevant to carry forward.
Cap Table Extraction → Review Workflow (AGT-005 → human). The extraction agent (/api/cap-table/extract/route.ts) outputs candidate transactions that enter a human review queue (/api/cap-table/review/route.ts). The agent's context is preserved in the candidate record — the reviewer sees the source document chunk, the extracted fields, and the agent's confidence signal. Approved records (/api/cap-table/review/[id]/approve/route.ts) become canonical cap_table/transactions entries.
7.3 Shared Context Infrastructure
All agents share the following context infrastructure:
- Supabase PostgreSQL: Authoritative state store; the single source of truth for all persistent context
- pgvector index: Shared semantic retrieval layer available to any agent that needs document-grounded context
- Clerk organization scope: Every agent call is bound to an org_id, ensuring no cross-tenant context contamination
- BullMQ + Redis (Upstash): The async job queue (docker/cloud-worker/) coordinates agent execution order and passes job payloads (which include document IDs and context pointers) between pipeline stages
7.4 Cloud vs. Local Agent Context
The docker-compose.yml defines two GPU profiles:
- full profile: Runs ollama (local LLM), the llm proxy (port 8001), the embed service, and the reranker locally. Agents in this mode operate with lower latency but constrained model capacity.
- gpu profile: Runs embed + reranker + ollama. Cloud LLM calls go to Modal's Qwen2.5:32B endpoint.
The admin pipeline routing stats endpoint (/api/admin/routing-stats/route.ts) tracks which agents are dispatching to which backend.
The cloud worker (docker/cloud-worker/Dockerfile, deployed via Fly.io) handles asynchronous heavy analysis tasks (advisor pipeline, batch extraction) independently from the web process, preventing context-heavy operations from blocking the interactive chat experience.
8. RAG Pipeline as Context Infrastructure
The RAG system is the central context infrastructure for DataZoom. docs/RAG_SYSTEM.md is the canonical reference. The following summarizes the pipeline's role as context management:
8.1 Ingestion Phase (Context Construction)
Upload → Storage (Supabase Storage)
→ Text Extraction (Python worker)
→ Chunking (fixed-size with overlap)
→ Embedding (all-MiniLM-L6-v2 → 384-dim)
→ document_chunks INSERT (content + embedding + metadata JSONB)
→ documents.summary UPDATE (LLM extraction)
→ timeline_events INSERT (temporal extraction)
Each step builds a persistent context artifact. The metadata JSONB column on document_chunks carries per-chunk signals (page number, section header, chunk type) that the retrieval layer uses for re-ranking.
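Fixed-size chunking with overlap, the third step in the pipeline above, can be sketched as a sliding window over tokens. The size and overlap values used by the actual worker are not stated here, so both are parameters; `chunk_tokens` is an illustrative name.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int, overlap: int) -> List[List[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end of the document
    return chunks
```

The early break avoids emitting a trailing chunk that would contain only tokens already covered by the overlap.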
8.2 Retrieval Phase (Context Assembly)
User Query
→ Embed query (all-MiniLM-L6-v2)
→ Pre-filter (document_type, parties[], effective_date)
→ pgvector cosine similarity search (ivfflat, top-K)
→ Re-ranking (if reranker service active)
→ Structured fact augmentation (timeline_events JOIN)
→ Thread history prepend
→ Context packet → LLM
The product/lib/__tests__/rag-retrieval-enhanced.test.ts test covers the full retrieval-plus-reranking path. The standard rag-retrieval.test.ts covers the base similarity search path.
8.3 Reranker Service
The GPU profile in docker-compose.yml includes a reranker service. Reranking applies a cross-encoder model to the top-K cosine-similar chunks, reordering them by relevance to the specific query (not just semantic proximity). This is a high-value context density mechanism: it promotes the most answer-relevant chunks to the top of the context packet, where LLMs tend to pay more attention.
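The reranking step reduces to re-sorting the retrieved candidates with a query-aware scoring function. In the sketch below, `overlap_score` is a toy word-overlap stand-in for a cross-encoder, used only so the example runs without a model; the real reranker service and its model are not specified in this document.

```python
from typing import Callable, List


def rerank(query: str, candidates: List[str], score: Callable[[str, str], float], keep: int) -> List[str]:
    """Reorder candidates by score(query, text), highest first, and keep the top entries."""
    ranked = sorted(candidates, key=lambda text: score(query, text), reverse=True)
    return ranked[:keep]


def overlap_score(query: str, text: str) -> float:
    # Toy stand-in for a cross-encoder: fraction of query words found in the chunk.
    q_words = set(query.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & set(text.lower().split())) / len(q_words)
```

Unlike the embedding search, the scoring function sees the query and the chunk together, which is what lets a cross-encoder judge relevance rather than general similarity.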
8.4 Citation System
Every chunk included in a context packet carries a traceable identity: document_id (UUID) → documents.filename and chunk_index. The LLM is instructed to cite using this identity. The citation system (product/lib/__tests__/citation-system.test.ts) validates that:
- LLM responses include citation markers for factual claims
- Citation markers resolve to real chunks in the database
- Chunks cited exist in the organization's document scope
This closes the context loop: users can trace every AI-generated claim back to its source document and chunk, making the context chain auditable.
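The three checks can be approximated with a regular expression over the citation notation from section 4.1. This is a sketch, not the logic of citation-system.test.ts: the pattern assumes filenames contain no commas or closing brackets, and `org_chunks` is a hypothetical in-memory set standing in for a database lookup scoped to the organization.

```python
import re
from typing import List, Set, Tuple

CITATION_RE = re.compile(r"\[Document: (?P<filename>[^,\]]+), Chunk: (?P<index>\d+)\]")


def extract_citations(answer: str) -> List[Tuple[str, int]]:
    """Pull (filename, chunk_index) pairs out of an LLM answer."""
    return [(m["filename"].strip(), int(m["index"])) for m in CITATION_RE.finditer(answer)]


def unresolved_citations(answer: str, org_chunks: Set[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Citations that do not match any chunk in the organization's scope."""
    return [c for c in extract_citations(answer) if c not in org_chunks]
```

An answer with no extracted citations, or with any unresolved ones, would be a candidate for flagging instead of being shown to the user.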
9. Cross-Document ID Reference Map
The following tables map every major system component to its ID in the DataZoom taxonomy, enabling cross-reference from any documentation node.
9.1 Business Requirements
| ID | Requirement | Owner Component |
|---|---|---|
| BR-001 | Multi-tenant context isolation | Clerk auth + Supabase RLS |
| BR-002 | Every AI response must be source-cited | Citation system (citation-system.test.ts) |
| BR-003 | Context must persist across sessions | conversations, decision_log tables |
| BR-004 | Document analysis must support 5 document types | documents.document_type controlled vocab |
| BR-005 | Cap table data requires human review before persistence | Review workflow (/api/cap-table/review/) |
| BR-006 | LLM responses must be grounded in uploaded documents only | RAG retrieval + system prompt constraint |
9.2 Technical Components
| ID | Component | Path |
|---|---|---|
| TECH-001 | Vector similarity index | CREATE INDEX ON document_chunks USING ivfflat (embedding vector_cosine_ops) |
| TECH-002 | Embedding model | sentence-transformers/all-MiniLM-L6-v2 (384 dimensions) |
| TECH-003 | Primary LLM (cloud) | Modal endpoint, Qwen2.5:32B |
| TECH-004 | Primary LLM (local) | Ollama, port 11434, docker-compose.yml |
| TECH-005 | LLM proxy service | Port 8001, ghcr.io/midwestco/datazoom-base:latest |
| TECH-006 | Job queue | BullMQ + Redis (Upstash), TLS via `ssl_cert_reqs= |