Models and Routing

How EQUIRE routes AI workloads across model tiers, attributes usage to your organization, enforces zero-retention, and applies per-task timeouts.

EQUIRE runs every AI request through the Vercel AI Gateway when it is configured, and falls back to calling Anthropic directly when it is not. Each workload is bound to a model tier (a semantic alias) and a feature tag (the workflow it originated from). Tier and tag together determine the exact model used, the data-handling guarantees applied, and how the request is attributed to your organization.

Model Tiers

The platform exposes five semantic aliases. Code calls a tier by name; the resolver picks the concrete model based on whether the Gateway is in play.

Tier	Purpose	Gateway model	Direct fallback
`chat`	General reasoning, deal Q&A, drafting	`anthropic/claude-sonnet-4.6`	`claude-sonnet-4-6`
`fast`	Cheap classification, gap detection, simple summaries	`anthropic/claude-haiku-4-5`	`claude-haiku-4-5`
`critical`	Highest-stakes reasoning — IC memo critical sections, expert opinion	`anthropic/claude-opus-4.7`	`claude-opus-4-7`
`extraction`	Long-context document extraction (up to 64k output tokens)	`anthropic/claude-sonnet-4.6`	`claude-sonnet-4-6`
`verification`	Lightweight second-pass verification of extracted values	`anthropic/claude-haiku-4-5`	`claude-haiku-4-5`

Aliases let admins swap underlying models without touching feature code. Calling chat always returns the currently-blessed mid-tier model, even after a version bump.
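
The alias lookup can be pictured as a small table keyed by tier. The sketch below is illustrative only: the tier names and model IDs come from the table above, while `TIER_MODELS`, `resolveModel`, and the `gatewayConfigured` flag are hypothetical names, not EQUIRE's actual resolver.

```typescript
// Tier names and model IDs mirror the table above.
// TIER_MODELS and resolveModel are hypothetical names for illustration.
type Tier = "chat" | "fast" | "critical" | "extraction" | "verification";

interface TierModels {
  gateway: string; // model ID used when the Gateway is configured
  direct: string; // model ID used on the direct-Anthropic fallback
}

const TIER_MODELS: Record<Tier, TierModels> = {
  chat: { gateway: "anthropic/claude-sonnet-4.6", direct: "claude-sonnet-4-6" },
  fast: { gateway: "anthropic/claude-haiku-4-5", direct: "claude-haiku-4-5" },
  critical: { gateway: "anthropic/claude-opus-4.7", direct: "claude-opus-4-7" },
  extraction: { gateway: "anthropic/claude-sonnet-4.6", direct: "claude-sonnet-4-6" },
  verification: { gateway: "anthropic/claude-haiku-4-5", direct: "claude-haiku-4-5" },
};

// Feature code asks for a tier; only this function knows concrete model IDs.
function resolveModel(tier: Tier, gatewayConfigured: boolean): string {
  const models = TIER_MODELS[tier];
  return gatewayConfigured ? models.gateway : models.direct;
}
```

Under this shape, a version bump is a one-line change to the table and no feature code needs to move.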

Gateway versus Direct Fallback

Gateway (preferred) — used when AI_GATEWAY_API_KEY or a Vercel OIDC token is configured. Adds attribution, zero-retention, and unified routing across providers.
Direct Anthropic — used when only ANTHROPIC_API_KEY is set. Same models, but no Gateway-side attribution metadata; embeddings are not available on this fallback.

The platform does not silently mix providers. If you have only Anthropic configured, every request uses Anthropic; if the Gateway is configured, every request uses the Gateway.
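
A minimal sketch of that all-or-nothing selection follows. `AI_GATEWAY_API_KEY` and `ANTHROPIC_API_KEY` are named above; `VERCEL_OIDC_TOKEN` is an assumed variable name for the Vercel OIDC token, and `selectProvider` and the error thrown when nothing is configured are hypothetical.

```typescript
type Provider = "gateway" | "direct-anthropic";

// A plain record keeps the function testable without touching process.env.
type Env = Record<string, string | undefined>;

function selectProvider(env: Env): Provider {
  // VERCEL_OIDC_TOKEN is an assumed name; the source only says "a Vercel OIDC token".
  if (env["AI_GATEWAY_API_KEY"] || env["VERCEL_OIDC_TOKEN"]) {
    return "gateway"; // the Gateway wins whenever it is configured
  }
  if (env["ANTHROPIC_API_KEY"]) {
    return "direct-anthropic";
  }
  // Assumption: with no credentials at all, fail loudly rather than guess.
  throw new Error("No AI provider configured");
}
```

The decision is made once per environment rather than per request, which is what keeps providers from being mixed.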

How Features Map to Tiers

Every workload that calls the AI is tagged with a feature (from a closed enum). The tag drives both attribution and the default tier. Admin overrides (below) can change the model behind a feature without changing the tag.

Chat Surfaces

Feature tag	Default tier	Notes
`chat`	`chat` (Sonnet)	Deal-mode chat assistant
`portfolio-chat`	`chat` (Sonnet)	Portfolio-mode chat
`mandate-dashboard`	`chat` (Sonnet)	Mandate-dashboard advisor
`digest`	`chat` (Sonnet)	Prospecting digest narrative

Documents and Extraction

Feature tag	Default tier	Notes
`document-processing`	`extraction` then `verification`	Sonnet 64k for the extraction pass; Haiku 8k for the verification pass

document-processing is the umbrella tag for the entire ingestion pipeline. Extraction and verification share the same tag so usage reports show one line for the workflow rather than splitting across passes.

IC Memo

ic-memo mixes tiers section-by-section based on stakes:

Opus (critical) — Executive Summary, Investment Thesis, Market Analysis, Risk Factors
Sonnet (chat) — all other narrative sections
Haiku (fast) — section classification and gap detection

The mix is fixed in code rather than configurable per memo, so every memo gets the same provenance posture. Admin overrides can swap the model under any tier without changing the section-to-tier mapping.
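
The fixed mapping could look roughly like the sketch below. The four critical section titles and the tier names come from the list above; `MemoTask`, `tierForMemoTask`, and the example title "Sponsor Overview" are hypothetical.

```typescript
type MemoTier = "critical" | "chat" | "fast";

// Sections that always run on the critical (Opus) tier, per the list above.
const CRITICAL_SECTIONS = new Set<string>([
  "Executive Summary",
  "Investment Thesis",
  "Market Analysis",
  "Risk Factors",
]);

// Hypothetical task shape: either drafting a section or a housekeeping pass.
type MemoTask =
  | { kind: "section"; title: string }
  | { kind: "classification" }
  | { kind: "gap-detection" };

function tierForMemoTask(task: MemoTask): MemoTier {
  if (task.kind !== "section") return "fast"; // classification and gap detection
  return CRITICAL_SECTIONS.has(task.title) ? "critical" : "chat";
}
```

For example, `tierForMemoTask({ kind: "section", title: "Sponsor Overview" })` falls through to `chat`, because only the four named sections are pinned to `critical`.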

Other Workflows

Feature tag	Default tier	Notes
`expert-opinion`	`critical` (Opus)	Per-assumption expert commentary
`valuation`	`chat` (Sonnet)	Valuation analyst pipeline
`deal-health-coherence`	`chat` (Sonnet)	Tier-2 coherence analyzer
`deal-health-deep-scan`	`chat` (Sonnet)	Tier-3 deep scan
`research`	`chat` or `fast`	Sonnet for narrative, Haiku for filtering
`origination`	`chat` (Sonnet)	Origination AI advisor and prospect briefs
`organization-enrichment`	`chat` (Sonnet)	Org research cache
`corbis-research`	`chat` (Sonnet)	Corbis MCP research agent
`deliverable`	`chat` (Sonnet)	Deliverable drafting
`image-gen`	n/a	Image generation routes through a separate provider, not the LLM stack

Anthropic-Only Features

A small set of features is restricted to Anthropic models even when the Gateway has alternatives configured:

document-processing
extraction (legacy alias retained for older usage rows)
deal-health-deep-scan

These features rely on capabilities (native PDF vision, long-context caching, multi-step reasoning depth) that are currently best served by Anthropic. If an admin tries to override one of these to a non-Anthropic model, the override is silently rejected and the default tier is used. The admin UI shows the same restriction so the constraint is visible up front.
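
As a rough illustration of the silent rejection, the sketch below falls back to the default model when an Anthropic-only feature is given a non-Anthropic override. The feature list is taken from above; `effectiveModel` is a hypothetical name, and treating the `anthropic/` prefix as the test for "is an Anthropic model" is an assumption.

```typescript
const ANTHROPIC_ONLY = new Set<string>([
  "document-processing",
  "extraction", // legacy alias
  "deal-health-deep-scan",
]);

function effectiveModel(
  feature: string,
  override: string | undefined,
  defaultModel: string,
): string {
  if (!override) return defaultModel; // no override set
  // Assumption: Gateway model IDs for Anthropic start with "anthropic/".
  const isAnthropic = override.startsWith("anthropic/");
  if (ANTHROPIC_ONLY.has(feature) && !isAnthropic) {
    return defaultModel; // silently rejected: no error, default tier is used
  }
  return override;
}
```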

Admin Overrides

Platform admins can override the model behind any feature tag without changing application code. Overrides live in Supabase (credeals.platform_ai_config) and are read with a 60-second in-memory cache.

The admin UI exposes a closed allowlist of models that can be set as overrides:

anthropic/claude-opus-4.7
anthropic/claude-sonnet-4.6
anthropic/claude-haiku-4-5
openai/gpt-5.4
openai/gpt-5

OpenAI models are accepted only for features outside the Anthropic-only list. Setting a model that violates the Anthropic-only constraint is silently rejected, and clearing an override returns the feature to its default tier on the next cache refresh.

Override changes take effect within 60 seconds; no application restart is required.
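
The 60-second window follows from the cache lifetime. A minimal sketch of such a time-to-live cache is shown below; `TtlCache` is a hypothetical class, and the loader is synchronous only to keep the example short (a real Supabase read would be asynchronous).

```typescript
// Holds one value and reloads it once the entry is older than ttlMs.
class TtlCache<T> {
  private entry: { value: T; loadedAt: number } | null = null;
  private readonly load: () => T;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so the expiry logic can be exercised without waiting.
  constructor(load: () => T, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): T {
    const current = this.entry;
    if (current !== null && this.now() - current.loadedAt < this.ttlMs) {
      return current.value; // still fresh: no database read
    }
    const value = this.load(); // stale or empty: read the config again
    this.entry = { value, loadedAt: this.now() };
    return value;
  }
}
```

With `ttlMs` set to `60_000`, an override written just after a refresh is picked up on the first read at or after the 60-second mark.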

Gateway Attribution

Every AI call goes out with attribution metadata so usage reports, audit trails, and cost analytics line up with the workflow that triggered the call.

What Gets Sent

feature — the workflow tag (e.g. ic-memo, valuation, chat)
org — your organization ID, omitted only for system-level calls with no org context
user — the user who initiated the action, when available
zeroDataRetention: true — always set, on every call
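
A sketch of how that metadata might be assembled is shown below. The field names follow the list above; `Attribution` and `buildAttribution` are hypothetical, and how the object is attached to the outgoing Gateway request is not covered here.

```typescript
interface Attribution {
  feature: string;
  org?: string; // omitted for system-level calls with no org context
  user?: string; // present only when a user initiated the action
  zeroDataRetention: true; // the literal type makes "false" unrepresentable
}

function buildAttribution(feature: string, orgId?: string, userId?: string): Attribution {
  const attribution: Attribution = { feature, zeroDataRetention: true };
  if (orgId) attribution.org = orgId;
  if (userId) attribution.user = userId;
  return attribution;
}
```

Typing `zeroDataRetention` as the literal `true` rather than `boolean` is one way to make the "always set" rule a compile-time property instead of a convention.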

Telemetry Allowlist

EQUIRE writes telemetry events to its own observability layer alongside the Gateway. The allowed metadata keys are limited to a closed set including feature, orgId, surface, documentType, documentId, attachmentCount, pdfPageCount, tenantCount, confidence, verificationMode, readiness, mandateCount, steps, and hitStepLimit. Anything outside the allowlist is dropped before being recorded.

Crucially, prompts and completions are never persisted in telemetry. recordInputs and recordOutputs are hard-coded to false. The only place a prompt or completion lives is in transit — and the Gateway is configured for zero retention.
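
The allowlist filter can be sketched as follows. The key names are the ones listed above (the source describes the set as "including" these, so the real list may be longer); `TELEMETRY_KEYS`, `filterTelemetry`, and `TELEMETRY_FLAGS` are hypothetical names.

```typescript
// Keys named in the text above; the production allowlist may contain more.
const TELEMETRY_KEYS = new Set<string>([
  "feature", "orgId", "surface", "documentType", "documentId",
  "attachmentCount", "pdfPageCount", "tenantCount", "confidence",
  "verificationMode", "readiness", "mandateCount", "steps", "hitStepLimit",
]);

// Prompts and completions are never recorded.
const TELEMETRY_FLAGS = { recordInputs: false, recordOutputs: false } as const;

// Returns a copy of the metadata containing only allowlisted keys.
function filterTelemetry(metadata: Record<string, unknown>): Record<string, unknown> {
  const kept: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(metadata)) {
    if (TELEMETRY_KEYS.has(key)) kept[key] = value;
  }
  return kept;
}
```

An event carrying `{ feature: "chat", steps: 3, prompt: "..." }` would be recorded with only `feature` and `steps`; the `prompt` key is dropped because it is not on the allowlist.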

Zero Data Retention

The Gateway is configured to not store prompts or completions. Every request carries zeroDataRetention: true, which the Gateway honors by short-circuiting any request-body logging or training-set capture on the upstream provider side.

In practical terms:

Your prompts and the model's responses are not retained by the Gateway after the response is delivered.
They are not used to train any model.
They are not visible in cross-org analytics.

EQUIRE's own database stores the outputs of AI work (extracted fields, IC memo text, valuation assumptions, audit log entries) where they are needed for the product. Those outputs live under your org's RLS scope and are deleted when you delete the underlying deal or when a scheduled account-level deletion runs.

Timeouts

Every AI call carries an explicit timeout drawn from a small set of presets. Total timeout is the upper bound for the whole call; chunk timeout is the maximum gap between streamed tokens before the call is aborted.

Preset	Total	Chunk	Used for
`quick`	15s	5s	Lightweight classification, gap checks
`standard`	60s	15s	Most chat, narrative, and deal-tool calls
`verification`	60s	15s	Haiku verification pass on extracted documents
`pdfVerification`	180s	45s	PDF vision verification; time-to-first-token (TTFT) can be 20–45s for 30–100 page offering memoranda (OMs) before tokens flow
`extraction`	300s	60s	Long-context document extraction (up to 64k output tokens)

pdfVerification is deliberately three times longer than the standard verification preset. Native-PDF vision ingestion has a long time-to-first-token before any streaming starts, so the shorter preset would abort valid calls mid-think.
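
To make the two limits concrete, the sketch below encodes the presets from the table and checks both bounds for a call in flight. `TIMEOUT_PRESETS` and `timeoutState` are hypothetical names. The sketch also assumes the chunk timer starts when the request starts, so time-to-first-token counts against the chunk limit; the 45s chunk value for `pdfVerification` suggests this but the source does not state it.

```typescript
type PresetName = "quick" | "standard" | "verification" | "pdfVerification" | "extraction";

interface TimeoutPreset {
  totalMs: number; // upper bound for the whole call
  chunkMs: number; // maximum gap between streamed tokens
}

const TIMEOUT_PRESETS: Record<PresetName, TimeoutPreset> = {
  quick: { totalMs: 15_000, chunkMs: 5_000 },
  standard: { totalMs: 60_000, chunkMs: 15_000 },
  verification: { totalMs: 60_000, chunkMs: 15_000 },
  pdfVerification: { totalMs: 180_000, chunkMs: 45_000 },
  extraction: { totalMs: 300_000, chunkMs: 60_000 },
};

type TimeoutState = "ok" | "total-timeout" | "chunk-timeout";

// All arguments are millisecond timestamps on the same clock.
// Before the first token arrives, pass lastChunkAt equal to startedAt.
function timeoutState(
  preset: PresetName,
  startedAt: number,
  lastChunkAt: number,
  now: number,
): TimeoutState {
  const { totalMs, chunkMs } = TIMEOUT_PRESETS[preset];
  if (now - startedAt > totalMs) return "total-timeout";
  if (now - lastChunkAt > chunkMs) return "chunk-timeout";
  return "ok";
}
```

Under these assumptions, a PDF call that has produced no tokens after 30 seconds is still healthy on `pdfVerification` but would already have been aborted on `verification`.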

Embeddings

When the Gateway is configured, embeddings use openai/text-embedding-3-small (1536 dimensions). They carry the same feature, orgId, and userId attribution as LLM calls.

Direct Anthropic does not provide an embeddings endpoint. When only the Anthropic fallback is in place, embedding calls fail soft — the helper returns an empty array and any features that require embeddings degrade gracefully (text search falls back to keyword matching, research narratives skip the semantic-similarity stage). Production deployments should always configure the Gateway so embeddings are available.
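
The fail-soft contract can be illustrated from the caller's side: an empty embedding means "no semantic search available", and the caller switches to keyword matching. Everything in the sketch below (`Doc`, `cosine`, `searchDocs`, the two-dimensional vectors used in place of 1536-dimensional ones) is hypothetical and only shows the shape of the fallback.

```typescript
interface Doc {
  id: string;
  text: string;
  vector: number[]; // precomputed embedding for the document
}

// Cosine similarity; returns 0 when either vector has zero length or norm.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function searchDocs(
  query: string,
  queryEmbedding: number[],
  docs: Doc[],
): { mode: "semantic" | "keyword"; ids: string[] } {
  if (queryEmbedding.length === 0) {
    // Embeddings unavailable: degrade to case-insensitive substring matching.
    const needle = query.toLowerCase();
    const ids = docs.filter((d) => d.text.toLowerCase().includes(needle)).map((d) => d.id);
    return { mode: "keyword", ids };
  }
  // Embeddings available: rank every document by similarity, best first.
  const ranked = [...docs].sort(
    (x, y) => cosine(queryEmbedding, y.vector) - cosine(queryEmbedding, x.vector),
  );
  return { mode: "semantic", ids: ranked.map((d) => d.id) };
}
```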

Where to Go Next

For the data-handling and human-in-the-loop posture of every AI surface, see Trust and safety.
For which tools each chat mode exposes, see Tool reference.
For specialist agents that drive the bulk of document-processing and corbis-research traffic, see Specialist agents.
