Live project memory
Last updated 5/24/2026
Colony — Live Project Memory
Document version: 2026-04-28 (post Phase 2 wave execution)
Methodology: PRD-driven Context Engineering v0.1→v1.0
Owner: midwestco/colony
1. Project Identity
Colony is the command interface for the ZoomProp GTM Agentic Stack. It is a CRM-plus-agent-control-plane: a single surface where founders orchestrate outbound, content, pipeline management, recording intelligence, onboarding, and analytics through a streaming chat interface backed by specialist AI agents. The canonical spec lives at docs/ZP_GTM_Agentic_Stack_Draft_V1.pdf.
Stage: Active multi-phase build. Phase 1 (foundation + all specialist agents) is structurally complete. Phase 2 (discovery expansion, multi-channel, deliverability hardening, org management) is in execution with test scaffolding written and batch runner artifacts present.
2. Current Project State
2.1 What Has Been Built (Phase 1 — Structurally Complete)
| Area | Status | Key Artifacts |
|---|---|---|
| GCP Infrastructure (Terraform) | ✅ Provisioned | infrastructure/gcp/ — Cloud Run, Cloud SQL Postgres 16, GCS, IAM, Secrets, tfstate committed |
| Cloud SQL Postgres 16 | ✅ Running | colony-39989:…:colony; pgvector (1536-dim), pgcrypto extensions provisioned via infrastructure/postgres/init/02-extensions.sql |
| GCS Asset Bucket | ✅ Active | gs://colony-assets; CMEK + signed URLs; defined in infrastructure/gcp/storage.tf |
| Clerk Auth + RBAC | ✅ Implemented | 3 roles, org-scoped; Svix webhook real-time sync; action plan at docs/phase1/action_plans/02_clerk_auth_and_rbac.md |
| Colony Vault (per-org API keys) | ✅ Implemented | KMS-encrypted rows in api_keys table; action plan at docs/phase1/action_plans/03_api_keys_and_integrations_vault.md |
| Inngest Agent Runtime | ✅ Wired | Durable functions for all agent loops; action plan at docs/phase1/action_plans/04_inngest_agent_runtime.md |
| LLM Clients + Observability | ✅ Implemented | Langfuse tracing, Sentry error tracking; docs/phase1/action_plans/05_llm_clients_and_observability.md |
| Pipedrive Integration | ✅ Bi-sync active | 9-stage pipeline, ICP scoring, custom fields; docs/phase1/action_plans/06_pipedrive_integration.md; runbook at docs/phase1/runbooks/pipedrive-fields.md |
| Unipile Integration | ✅ Implemented | LinkedIn + email outreach channels; docs/phase1/action_plans/07_unipile_integration.md |
| Resend Integration | ✅ Implemented | Email delivery; docs/phase1/action_plans/08_resend_integration.md |
| Google Drive Integration | ✅ Implemented | Gemini Meet Notes ingestion; docs/phase1/action_plans/09_google_drive_integration.md |
| GCS Storage Layer | ✅ Implemented | Signed URL generation, CMEK; docs/phase1/action_plans/10_gcs_storage.md |
| Prospect Agent | ✅ Implemented | docs/phase1/action_plans/11_prospect_agent.md |
| Qualification Agent | ✅ Implemented | ICP scoring pipeline; docs/phase1/action_plans/12_qualification_agent.md |
| Message Generator Agent | ✅ Implemented | 5 angles (signal-, pain-, referral-, pattern-break-, insight-led); docs/phase1/action_plans/13_message_generator_agent.md |
| Recording Intelligence Agent | ✅ Implemented | Signals extracted → CRM fields + Knowledge Core; docs/phase1/action_plans/14_recording_intelligence_agent.md |
| Post-Call Agent | ✅ Implemented | docs/phase1/action_plans/15_post_call_agent.md |
| Content Agent | ✅ Implemented | 6 content pillars, attribution → pipeline influence; docs/phase1/action_plans/16_content_agent.md |
| Onboarding Agent | ✅ Implemented | 8-asset Deployment Kit on Closed-Won; docs/phase1/action_plans/17_onboarding_agent.md |
| Analytics Agent + Daily Brief | ✅ Implemented | Morning brief to inbox + /overview; docs/phase1/action_plans/18_analytics_agent.md |
| GTM Orchestrator + Campaign Orchestrator | ✅ Implemented | Multi-turn tool loop, SSE streaming; docs/phase1/action_plans/19_orchestrators.md |
| Outbound Sequence Engine | ✅ Implemented | T1–T5 sequences, HOT reply queue, per-org circuit breakers; docs/phase1/action_plans/20_outbound_sequence_engine.md |
| Approval Queue | ✅ Implemented | docs/phase1/action_plans/21_approval_queue.md |
| Circuit Breakers | ✅ Implemented | Per-org; docs/phase1/action_plans/22_circuit_breakers.md |
| Daily Briefs | ✅ Implemented | docs/phase1/action_plans/23_daily_briefs.md |
| Knowledge Core Editor | ✅ Implemented | 10-domain, pgvector retrieval; docs/phase1/action_plans/24_knowledge_core_editor.md |
| Command Interface Chat (/overview) | ✅ Implemented | SSE streaming orchestrator; docs/phase1/action_plans/25_command_interface_chat.md |
| Deployment Kit | ✅ Implemented | docs/phase1/action_plans/26_deployment_kit.md |
2.2 Phase 2 — Current Execution State
Phase 2 planning is fully documented across 27 action plans in docs/phase2/action_plans/. Test specs for all 27 items are written in docs/phase2/testing/. Batch runner artifacts in runners/.artifacts/ confirm active execution as of 2026-04-28.
| Phase 2 Item | Action Plan | Test Spec | Runner Evidence |
|---|---|---|---|
| Schema extensions | 01_schema_extensions.md | 01_schema_extensions.spec.ts | — |
| Pipedrive LeadBooster scraper | 02_pipedrive_leadbooster_scraper.md | 02_e2e_pipedrive_scraper.spec.ts | ✅ Multiple batch runs (16:31–18:48 UTC 2026-04-28) |
| Google Places integration | 03_google_places_integration.md | 03_e2e_google_places.spec.ts | — |
| SerpAPI integration | 04_serpapi_integration.md | 04_e2e_serpapi.spec.ts | — |
| Public data adapters | 05_public_data_adapters.md | 05_e2e_public_data.spec.ts | — |
| Unipile search extension | 06_unipile_search_extension.md | 06_e2e_unipile_search.spec.ts | — |
| Playwright generic runner | 07_playwright_generic_runner.md | 07_e2e_playwright_runner.spec.ts | ✅ Session logs, screenshots (dry-run artifacts) |
| Matching algorithm | 08_matching_algorithm.md | 08_e2e_matching_algorithm.spec.ts | — |
| Candidate approval queue | 09_candidate_approval_queue.md | 09_e2e_candidate_queue.spec.ts | — |
| Discovery orchestrator | 10_discovery_orchestrator.md | 10_e2e_discovery_orchestrator.spec.ts | — |
| Circuit breakers Phase 2 | 11_circuit_breakers_phase2.md | 11_breakers_phase2.spec.ts + 11_e2e_breakers.spec.ts | — |
| Google OAuth scope expansion | 12_google_oauth_scope_expansion.md | 12_google_oauth_scope_expansion.spec.ts | — |
| Gmail integration | 13_gmail_integration.md | 13_gmail_e2e.spec.ts | — |
| Calendar integration | 14_calendar_integration.md | 14_calendar_e2e.spec.ts | — |
| Notion KC sync | 15_notion_kc_sync.md | 15_notion_kc_sync.spec.ts | — |
| Anti-ICP hard prefilter | 16_anti_icp_hard_prefilter.md | 16_anti_icp.spec.ts | — |
| Outreach pace auto-approve | 17_outreach_pace_auto_approve.md | 17_auto_approve.spec.ts | — |
| Humanized post-call writer | 18_humanized_post_call_writer.md | 18_humanized_post_call.spec.ts | — |
| Email deliverability + warmup | 19_email_deliverability_and_warmup.md | 19_email_deliverability.spec.ts | — |
| Google Chat push | 20_google_chat_push.md | 20_google_chat.spec.ts | — |
| GCloud operator runbook | 21_gcloud_operator_runbook.md | 21_gcloud_operator_runbook.spec.ts | — |
| Healthcare ICP signal extraction | 22_healthcare_icp_signal_extraction.md | 22_healthcare_signals.spec.ts + 22_contacts_org_detail_api.spec.ts | — |
| Multi-channel sequence engine | 23_multi_channel_sequence_engine.md | 23_multichannel_sequence.spec.ts + 23_pipedrive_reveal_persistence.spec.ts | — |
| Contact list builder | 24_contact_list_builder.md | 24_contact_lists.spec.ts | — |
| Sender mailbox rotation | 25_sender_mailbox_rotation.md | 25_mailbox_rotation.spec.ts | — |
| CSV upload backboard source | 26_csv_upload_backboard_source.md | 26_csv_import.spec.ts | — |
| Org switcher hardening | 27_org_switcher_hardening.md | 27_org_switcher.spec.ts | — |
Key finding: Items 02 (Pipedrive LeadBooster scraper) and 07 (Playwright generic runner) have confirmed active runner execution. The remaining 25 Phase 2 items have test scaffolding written but no runner artifact evidence yet.
2.3 What Is In Progress
- TECH-001 Pipedrive LeadBooster batch scraping: Five batch runs executed on 2026-04-28, timestamps ranging 16:31–18:48 UTC. Artifacts include batch.log, collect-N.log, import-N.log, and summary.json per run. The evolution from dry-run (captures + screenshots) to full batch (import logs) indicates the pipeline matured intraday.
- TECH-002 Playwright generic runner: Dry-run sessions produced candidates.json, captures.json, organizations.json, people.json, and final.png screenshots. Session progressed from partial (16:31, no organizations/people JSON) to full data capture (16:39, all files present).
- TECH-003 Phase 2 wave orchestration: docs/phase2/action_plans/00a_WAVE_ORCHESTRATION.md exists, indicating deliberate sequencing. Items are being executed in waves rather than all at once.
2.4 What Is Blocked
| Block ID | Description | Blocker | Dependent Items |
|---|---|---|---|
| BR-001 | Google OAuth scope expansion | Requires manual Google Workspace admin steps (docs/phase2/runbooks/02_workspace_manual_steps.md) | Gmail (AP-13), Calendar (AP-14), Google Chat (AP-20) |
| BR-002 | Notion KC sync | External Notion API key per-org; org vault configuration required | AP-15 |
| BR-003 | Email deliverability + warmup | Requires DNS record changes (SPF/DKIM/DMARC) outside automated pipeline | AP-19, Mailbox rotation (AP-25) |
| BR-004 | Healthcare ICP signal extraction | Domain-specific signal taxonomy not yet confirmed | AP-22 |
3. Architecture Snapshot
3.1 Current Architecture (as deployed)
Clerk (Auth) ──Svix webhook──► Postgres 16 (colony-39989:…:colony)
                               ├── pgvector (1536-dim embeddings)
                               ├── pgcrypto (vault encryption)
                               └── api_keys (KMS-encrypted per-org)
Next.js 16 / React 19
├── /overview ── SSE streaming ──► GTM Orchestrator (Inngest)
└── App Router                     ├── Campaign Orchestrator
                                   └── Specialist Agents (×8)
GCP Secret Manager (platform secrets)
Colony Vault (per-org: Pipedrive, Unipile, Resend, Google, Notion)
Integrations: Pipedrive · Unipile · Resend · Google Drive · Langfuse · Sentry
Storage: gs://colony-assets (CMEK + signed URLs)
3.2 How Architecture Has Evolved
Epoch 1 (pre-Phase 1): Pure spec. Architecture lived only in docs/ZP_GTM_Agentic_Stack_Draft_V1.pdf.
Epoch 2 (Phase 1 execution): All 26 action plans executed. Infrastructure fully Terraform-managed (infrastructure/gcp/). Two-secret-store pattern locked in: GCP Secret Manager for platform keys, Colony Vault for per-org keys. This was a deliberate security decision to prevent cross-org credential leakage.
Epoch 3 (Phase 2 — current): Discovery stack is being layered on top. Playwright is now a first-class infrastructure component (not a test-only tool), running as a generic scraping runner in runners/. The runners/.artifacts/ directory structure emerging intraday (2026-04-28) represents a new runtime artifact pattern — batch jobs now produce structured logs + summary JSON that feed back into the pipeline.
TECH-ARCH-001 Key architectural decision: Playwright as a production runner (not just CI test harness) was validated on 2026-04-28 with the Pipedrive LeadBooster scraper. The .github/workflows/playwright-runner.yml workflow formalizes this.
3.3 Two-Secret-Store Pattern (Active Decision)
Decision TECH-DEC-001: Maintain two separate secret stores.
- GCP Secret Manager: Clerk, Inngest, bootstrap LLM keys, Langfuse, Sentry. Injected at deploy time via gcloud run deploy --update-secrets=….
- Colony Vault: Per-org credentials (Pipedrive, Unipile, Resend, Google OAuth tokens). Stored as KMS-encrypted rows in the api_keys table.
Rationale: Prevents a compromised org token from escalating to platform-level access. GCP Secret Manager has no concept of org-scoping; the vault pattern makes org isolation enforceable at the data layer.
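The org-isolation argument can be illustrated with a minimal in-memory sketch. It is not the Colony Vault implementation: the `OrgVault` class, the `VaultRow` shape, and the provider names are hypothetical, and the `ciphertext` string only stands in for a KMS-encrypted value. The point it shows is that every lookup is keyed by org ID, so one org's request cannot reach another org's row.

```typescript
// Hypothetical sketch: type and class names are illustrative, not Colony's code.
type Provider = "pipedrive" | "unipile" | "resend" | "google" | "notion";

interface VaultRow {
  orgId: string;
  provider: Provider;
  ciphertext: string; // stands in for a KMS-encrypted value
}

class OrgVault {
  private rows: { [key: string]: VaultRow | undefined } = {};

  // Rows are keyed by org AND provider, so no lookup path exists without an org ID.
  private static key(orgId: string, provider: Provider): string {
    return `${orgId}:${provider}`;
  }

  put(row: VaultRow): void {
    this.rows[OrgVault.key(row.orgId, row.provider)] = row;
  }

  // Returns the row only when it belongs to the requesting org.
  get(orgId: string, provider: Provider): VaultRow | undefined {
    return this.rows[OrgVault.key(orgId, provider)];
  }
}

const vault = new OrgVault();
vault.put({ orgId: "org_a", provider: "pipedrive", ciphertext: "enc(a-token)" });

const ownRow = vault.get("org_a", "pipedrive");   // org_a sees its own row
const otherRow = vault.get("org_b", "pipedrive"); // org_b sees nothing
```

In a real table the same property would come from always filtering on the org column; the sketch only makes the keying explicit.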
Trade-off unresolved: KMS key rotation cadence for Colony Vault rows is not yet documented in any runbook. This is a latent TECH-DEBT item.
4. Active Decisions and Rationale
| Decision ID | Decision | Rationale | Status |
|---|---|---|---|
| TECH-DEC-001 | Two-secret-store pattern | Org isolation; see §3.3 | Locked |
| TECH-DEC-002 | Inngest for agent durable functions | Handles multi-turn tool loops with retries, concurrency limits, and replay without custom queue infrastructure | Locked |
| TECH-DEC-003 | SSE (not WebSockets) for chat streaming | Next.js App Router native support; simpler infrastructure; acceptable for unidirectional agent→client stream | Locked |
| TECH-DEC-004 | pgvector at 1536 dimensions | OpenAI text-embedding-3-small / text-embedding-ada-002 output dimension; enables drop-in compatibility | Locked |
| TECH-DEC-005 | Playwright as production runner (not only test tool) | Discovery use cases require real browser execution against authenticated SaaS (Pipedrive LeadBooster); headless API alternatives don't exist | Active — validated 2026-04-28 |
| TECH-DEC-006 | Terraform for all GCP infra | Reproducible environments; tfstate committed (NOTE: this is a debt item — see §7) | Locked |
| TECH-DEC-007 | Wave orchestration for Phase 2 | Documented in 00a_WAVE_ORCHESTRATION.md; sequencing items by dependency prevents partial states, e.g., schema extensions (AP-01) must land before the discovery orchestrator (AP-10) | Active |
| TECH-DEC-008 | Codacy for static analysis | .codacy/ config present with ESLint, Semgrep, Trivy, Lizard; CI-enforced | Active |
| TECH-DEC-009 | RBAC enforcement in CI | .github/workflows/rbac-check.yml runs on every push; prevents role regression | Locked |
| TECH-DEC-010 | Terraform provider lock at google/google-beta 7.29.0 | .terraform.lock.hcl pinned; prevents provider drift breaking infra plans | Locked |
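For TECH-DEC-003, the unidirectional agent→client stream comes down to writing text frames in the Server-Sent Events wire format. The sketch below shows only that framing. The function name `encodeSseFrame` and the event name `token` are hypothetical; nothing here is taken from the Colony orchestrator code.

```typescript
// Formats one Server-Sent Events frame. The "event:" and "data:" fields and the
// blank-line terminator follow the SSE wire format; the event name is made up.
function encodeSseFrame(event: string, data: string): string {
  // Each line of a multi-line payload needs its own "data:" prefix.
  const dataLines = data.split("\n").map((line) => `data: ${line}`);
  // The two trailing empty strings produce the blank line that ends the frame.
  return [`event: ${event}`, ...dataLines, "", ""].join("\n");
}

const singleLineFrame = encodeSseFrame("token", "Hello");
const multiLineFrame = encodeSseFrame("token", "line one\nline two");
```

A route handler would write such frames to a response with `Content-Type: text/event-stream`; the client needs no upstream channel, which is why SSE suffices for this use.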
5. Recent Changes and Their Impact
2026-04-28 — Pipedrive Batch Runner Goes Live
Change: First full batch run of the Pipedrive LeadBooster scraper executed at 16:31 UTC, followed by four more runs through 18:48 UTC. Each run produced collect + import logs and a summary.json.
Impact:
- Confirms runners/ as a new production runtime directory, not just a scratch space.
- The progression from dry-run (no import logs) → batch (import logs present) in a single day indicates rapid iteration.
- Import artifacts (import-0.log, import-1.log, etc.) in runs 2 and 5 confirm data is actually landing in the database.
- Runs 3 and 4 (18:19, 18:22) have only batch.log + collect-0.log with no import logs; these may represent partial runs or runs that failed at the import stage. This is an open question (OQ-001).
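The distinction drawn above between runs with and without import logs can be checked mechanically from the file names in a run directory. The following is a minimal sketch under assumptions: the `classifyRun` function and its status labels are hypothetical, and only the file-name patterns (collect-N.log, import-N.log, summary.json) come from the artifacts described here.

```typescript
// Status labels are illustrative; they are not part of the runner's output.
type RunStatus = "complete" | "imported-no-summary" | "collect-only" | "empty";

// Classifies a run from the file names found in its artifact directory.
function classifyRun(files: string[]): RunStatus {
  const hasCollect = files.some((f) => /^collect-\d+\.log$/.test(f));
  const hasImport = files.some((f) => /^import-\d+\.log$/.test(f));
  const hasSummary = files.some((f) => f === "summary.json");

  if (hasImport && hasSummary) return "complete";
  if (hasImport) return "imported-no-summary";
  if (hasCollect) return "collect-only";
  return "empty";
}

// File sets matching the two shapes described for the 2026-04-28 runs.
const fullRun = classifyRun(["batch.log", "collect-0.log", "import-0.log", "summary.json"]);
const partialRun = classifyRun(["batch.log", "collect-0.log"]);
```

A check like this would flag runs 3 and 4 as collect-only, but it cannot say why the import stage did not run; that remains OQ-001.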
2026-04-28 — Playwright Dry-Run Artifacts Evolve
Change: Three Playwright dry-run sessions ran between 16:31 and 16:41 UTC. Session at 16:39 was the first to produce organizations.json and people.json alongside candidates.json and captures.json.
Impact:
- Validates that the Playwright runner can extract structured entity data (orgs + people), not just raw page captures.
- final.png screenshots provide a visual regression baseline for runner sessions.
2026-04-28 — Playwright MCP Console Log
Change: .playwright-mcp/console-2026-04-28T16-11-43-345Z.log and .playwright-mcp/page-2026-04-28T16-11-45-519Z.yml created.
Impact: Suggests Playwright MCP (Model Context Protocol) tooling is being used interactively, likely for developing/debugging runner scripts before committing them. This is a new development workflow pattern.
6. Open Questions and Unresolved Trade-offs
| OQ ID | Question | Context | Priority |
|---|---|---|---|
| OQ-001 | Why did batch runs at 18:19 and 18:22 produce no import logs? | Only batch.log + collect-0.log present. May be early termination, rate limiting, or intentional partial runs. | High |
| OQ-002 | Is terraform.tfstate intentionally committed to the repository? | infrastructure/gcp/terraform.tfstate and .tfstate.backup are present in the file tree. This is a security and collaboration risk — state contains resource IDs and potentially sensitive output values. | Critical |
| OQ-003 | What is the KMS key rotation cadence for Colony Vault? | No runbook exists for this. An unrotated KMS key that is compromised exposes all per-org credentials. | High |
| OQ-004 | Is .env committed to the repository? | .env appears in the file tree at root. If this contains real credentials, it is a critical security issue. | Critical |
| OQ-005 | How does the Playwright runner authenticate to Pipedrive LeadBooster in CI? | The .github/workflows/playwright-runner.yml exists but the mechanism for injecting Pipedrive credentials into the runner (from Colony Vault vs. direct CI secrets) is not visible in the provided context. | High |
| OQ-006 | What is the deduplication strategy for the candidate approval queue (AP-09) when the same prospect appears across Pipedrive LeadBooster, Google Places, SerpAPI, and Unipile discovery paths? | The matching algorithm (AP-08) spec exists but the exact deduplication key (email? LinkedIn URL? org+name fuzzy match?) is undocumented at this level. | High |
| OQ-007 | Is there a rate-limit or ToS compliance strategy for Playwright-based Pipedrive scraping? | Pipedrive LeadBooster is a paid SaaS product. Automated scraping may violate ToS or trigger account suspension. | High |
| OQ-008 | What happens to daily briefs if the Inngest function fails or is delayed? | No dead-letter or fallback delivery mechanism is visible in the action plan summaries. | Medium |
| OQ-009 | How are the 8 Deployment Kit assets (pitch deck, one-pager, etc.) stored — in GCS, Postgres, or Pipedrive? | The onboarding agent action plan references generation but the storage target and retrieval path are not confirmed. | Medium |
| OQ-010 | Is the duplicate 11_breakers_phase2.spec.ts / 11_e2e_breakers.spec.ts intentional or an artifact of test file naming? | Two spec files exist for circuit breakers Phase 2. This may indicate a split between unit-style and e2e-style tests, or it may be an accidental duplicate. | Low |
7. Technical Debt Inventory
| DEBT ID | Description | Location | Priority | Impact if Unaddressed |
|---|---|---|---|---|
| DEBT-001 | Terraform state committed to repo | infrastructure/gcp/terraform.tfstate, terraform.tfstate.backup | 🔴 Critical | State leaks resource IDs; should live in GCS backend with state locking |
| DEBT-002 | .env file in repository root | .env | 🔴 Critical | If real credentials are present, immediate secret rotation required |
| DEBT-003 | No KMS key rotation runbook | Missing from docs/phase1/runbooks/ and docs/phase2/runbooks/ | 🔴 High | Compromised KMS key exposes all org vaults |
| DEBT-004 | Runner artifacts committed to repo | runners/.artifacts/ — contains batch logs, JSON data, screenshots | 🟡 Medium | Repo bloat; potential data leakage (prospect data in candidates.json, organizations.json, people.json) |
| DEBT-005 | Playwright MCP artifacts committed | .playwright-mcp/console-*.log, .playwright-mcp/page-*.yml | 🟡 Medium | Session logs may contain sensitive page content |
| DEBT-006 | No UI components or API routes in codebase | File tree scan shows 0 API routes, 0 UI components | 🟡 Medium | Suggests the Next.js application code may not be in this repository or the scan missed subdirectories |
| DEBT-007 | Duplicate Phase 2 circuit breaker test files | docs/phase2/testing/11_breakers_phase2.spec.ts and 11_e2e_breakers.spec.ts | 🟢 Low | Test confusion; one should be removed or clearly differentiated |
| DEBT-008 | Test specs in docs/ not __tests__/ or tests/ | All Phase 1 and Phase 2 test files under docs/phase*/testing/ | 🟡 Medium | Tests not discoverable by standard test runners (Jest, Vitest); may not execute in CI |
| DEBT-009 | No API route count despite complex agent system | Repository scan reports 0 API routes | 🟡 Medium | Either routes are outside scan scope or the backend is not yet scaffolded |
| DEBT-010 | tfplan binary committed | infrastructure/gcp/tfplan | 🟢 Low | Plan output is not sensitive but adds unnecessary binary blobs to git history |
8. Key Learnings from Recent Development Sessions
LEARN-001 — Playwright as a first-class production tool, not a test-only dependency.
The decision to use Playwright for Pipedrive LeadBooster scraping (Phase 2, AP-07/02) confirms that in the agentic GTM context, browser automation is a data-collection primitive. The .github/workflows/playwright-runner.yml workflow formalizes this. Future agent capabilities that require authenticated SaaS data extraction should default to the Playwright runner pattern before assuming an API exists.
LEARN-002 — Batch runner maturation happens fast.
Five batch runs in under 2.5 hours (16:31–18:48 UTC) with evolving artifact structures indicate a rapid iteration loop. The team moved from dry-run validation to full import within the same working session. This suggests the Playwright + batch runner pattern has low iteration cost once the initial session structure is established.
LEARN-003 — Artifact directories need a retention and gitignore policy immediately.
Runner artifacts accumulating in runners/.artifacts/ and .playwright-mcp/ represent a real data governance gap. Prospect data (organizations.json, people.json, candidates.json) from real Pipedrive accounts should not live in git history. This is both a GDPR-adjacent concern and a repo hygiene issue.
LEARN-004 — Wave orchestration reduces integration failures.
The explicit 00a_WAVE_ORCHESTRATION.md document for Phase 2 reflects a learned pattern from Phase 1: running all 26 action plans in parallel creates dependency failures (e.g., agents trying to call Knowledge Core before pgvector is seeded). Sequenced waves with explicit dependency gates reduce rework.
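Sequenced waves with dependency gates can be derived from a dependency map by repeatedly taking every item whose prerequisites are already done. The sketch below illustrates that idea only; it is not the contents of 00a_WAVE_ORCHESTRATION.md. The `planWaves` function is hypothetical, and the example map uses just the dependencies stated in this document (AP-01 before AP-10; AP-12 before AP-13, AP-14, and AP-20).

```typescript
// Groups items into waves: an item lands in the first wave after all of its
// dependencies have run. Throws when a cycle or an unknown dependency blocks progress.
function planWaves(deps: { [item: string]: string[] }): string[][] {
  const done: { [item: string]: boolean | undefined } = {};
  const waves: string[][] = [];
  let pending = Object.keys(deps).sort();

  while (pending.length > 0) {
    // Readiness is computed before marking anything done, so items in the
    // same wave never depend on each other.
    const ready = pending.filter((item) =>
      (deps[item] ?? []).every((dep) => done[dep] === true)
    );
    if (ready.length === 0) {
      throw new Error(`Unresolvable dependencies among: ${pending.join(", ")}`);
    }
    ready.forEach((item) => {
      done[item] = true;
    });
    waves.push(ready);
    pending = pending.filter((item) => done[item] !== true);
  }
  return waves;
}

const phase2Waves = planWaves({
  "AP-01": [],
  "AP-10": ["AP-01"],
  "AP-12": [],
  "AP-13": ["AP-12"],
  "AP-14": ["AP-12"],
  "AP-20": ["AP-12"],
});
```

With this input the schema and OAuth items form the first wave and their dependents the second. The real wave plan may group items differently for reasons a dependency map does not capture, such as the manual blockers in §2.4.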
LEARN-005 — Codacy static analysis is configured but its CI integration health is unknown.
.codacy/ is fully configured (ESLint, Semgrep, Trivy, Lizard, Pylint, Revive) but the actual CI trigger is not visible in the provided GitHub Actions workflows (only playwright-runner.yml and rbac-check.yml are listed). Codacy may be running as a webhook-triggered check rather than a workflow step.
LEARN-006 — The two-secret-store architecture creates a bootstrapping dependency.
When a new org joins Colony, their per-org credentials must be added to the Colony Vault before any agent can act on their behalf. This onboarding step is manual today and is not represented in the Deployment Kit generation flow. It should be.
9. Team Conventions and Patterns
9.1 Documentation Conventions
- Action plans live in docs/phaseN/action_plans/, numbered 00 (overview/orchestration) through N (feature). Each plan is a standalone markdown document with implementation instructions.
- Test specs mirror action plans 1:1 in docs/phaseN/testing/ with matching numeric prefixes. Spec files use TypeScript (.spec.ts) for Phase 2; Phase 1 used mixed .spec.md (prose specs) and .spec.ts.
- Runbooks live in docs/phaseN/runbooks/: operational procedures for manual steps (e.g., DNS configuration, Google Workspace admin, Pipedrive field setup).
- ID prefix convention: Action plans use numeric prefixes (00–27). No BR/UJ/API/DBT-style IDs have been adopted in the codebase documentation yet; these appear only in external PRD tooling.
9.2 Infrastructure Conventions
- Terraform providers pinned via .terraform.lock.hcl; no terraform init -upgrade without explicit decision.
- All GCP resources managed via Terraform in infrastructure/gcp/. Direct console changes are treated as drift.
- Local Postgres for development via infrastructure/docker-compose.yml + infrastructure/postgres/init/.
- Terraform variables in infrastructure/gcp/terraform.tfvars (not terraform.tfvars.example; the actual values file is present, which reinforces OQ-002/DEBT-001 concerns).
9.3 Agent Runtime Conventions
- Every specialist agent has an action plan before implementation begins.
- Inngest durable functions are the execution primitive for all multi-turn agent loops.
- Circuit breakers are per-org — never global. This prevents one org's bad data or runaway agent from affecting another org's sending reputation or API quotas.
- Knowledge Core is consulted by every message/content/brief generation — exemplar retrieval via pgvector is a mandatory step, not optional.
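The per-org circuit breaker convention can be sketched as a failure counter kept separately for each org. This is an illustrative assumption, not the implementation from the circuit breaker action plan: the `OrgCircuitBreaker` class, its method names, and the threshold of 3 are all hypothetical.

```typescript
// Hypothetical per-org circuit breaker; the threshold and names are illustrative.
class OrgCircuitBreaker {
  private failures: { [orgId: string]: number | undefined } = {};
  private readonly maxFailures: number;

  constructor(maxFailures: number) {
    this.maxFailures = maxFailures;
  }

  recordFailure(orgId: string): void {
    this.failures[orgId] = (this.failures[orgId] ?? 0) + 1;
  }

  // A success resets only the org that succeeded.
  recordSuccess(orgId: string): void {
    this.failures[orgId] = 0;
  }

  // Open means "stop acting for this org"; other orgs are unaffected.
  isOpen(orgId: string): boolean {
    return (this.failures[orgId] ?? 0) >= this.maxFailures;
  }
}

const breaker = new OrgCircuitBreaker(3);
for (let i = 0; i < 3; i++) {
  breaker.recordFailure("org_a");
}
const orgATripped = breaker.isOpen("org_a");
const orgBTripped = breaker.isOpen("org_b");
```

Because state is indexed by org ID, there is no global counter that a single noisy org could trip for everyone, which is the property the convention asks for.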
9.4 Security Conventions
- KMS encryption at rest for all per-org API keys in Colony Vault.
- Signed URLs (not public reads) for all GCS asset access.
- CMEK for GCS bucket — customer-managed encryption keys via GCP KMS.
- RBAC enforced in CI via .github/workflows/rbac-check.yml, with no bypass.
9.5 Runner/Scraper Conventions (Emerging — Phase 2)
- Dry-run before batch: Playwright sessions run in dry-run mode first, producing candidates.json and captures.json for validation, before batch mode with import logs.
- Timestamped artifact directories: each run gets a UTC ISO timestamp directory under runners/.artifacts/[job-type]/.
- summary.json is the canonical output of a completed batch run; downstream processes should read this, not individual collect/import logs.
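The timestamped-directory convention could be produced by a helper like the one below. The exact directory name format is not documented here, so this sketch assumes the same filesystem-safe shape seen in the .playwright-mcp file names (colons and dots replaced by hyphens). The function names and the `pipedrive-batch` job type are hypothetical.

```typescript
// Builds a filesystem-safe UTC timestamp. The format is an assumption modeled
// on the .playwright-mcp file names, not a documented runner convention.
function artifactTimestamp(date: Date): string {
  // toISOString() is always UTC; ":" and "." are swapped for "-" to keep paths portable.
  return date.toISOString().replace(/[:.]/g, "-");
}

// jobType fills the [job-type] path segment; the value used below is made up.
function artifactDir(jobType: string, date: Date): string {
  return `runners/.artifacts/${jobType}/${artifactTimestamp(date)}`;
}

const exampleDir = artifactDir(
  "pipedrive-batch",
  new Date(Date.UTC(2026, 3, 28, 16, 31, 0, 0)) // month index 3 is April
);
```

Names of this shape sort lexicographically in chronological order, so the latest run of a job type is simply the last directory in a sorted listing.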
10. Integration Points and Current Health
| Integration | Direction | Auth Method | Health | Notes |
|---|---|---|---|---|
| Clerk | Inbound auth | Clerk JWT | ✅ Healthy | Svix webhook for real-time org/user sync |
| Pipedrive | Bi-directional | Per-org API key (Colony Vault) | ✅ Active (batch runner confirmed) | 9-stage sync; also scraped via Playwright LeadBooster |
| Unipile | Outbound | Per-org API key (Colony Vault) | ✅ Implemented | LinkedIn + email; Phase 2 extends search capability |
| Resend | Outbound (email) | Per-org API key (Colony Vault) | ✅ Implemented | Runbook at docs/phase1/runbooks/resend-setup.md |
| Google Drive | Inbound (recordings) | Google OAuth (per-org) | ✅ Implemented | Gemini Meet Notes ingestion |
| Google OAuth (expanded) | Inbound | Google OAuth scopes | 🟡 In progress | BR-001 — manual Workspace admin steps required |
| Gmail | Outbound/Inbound | Google OAuth | 🟡 Blocked (BR-001) | Depends on OAuth scope expansion (AP-12) |
| Google Calendar | Inbound | Google OAuth | 🟡 Blocked (BR-001) | Depends on OAuth scope expansion (AP-12) |
| Google Chat | Outbound (push) | Google OAuth | 🟡 Blocked (BR-001) | AP-20 |
| GCS (gs://colony-assets) | Read/Write | Service account |