Answers, not hallucinations
Citation-gated answers mean a desk operator can verify before acting. The trace link is the audit answer.
A conversational AI grounded on your lock, hedge, pricing, and exception data. Questions from the desk get cited answers with trace links, exceptions get suggested resolutions, and every reply lands in an audit-chained log. Tenant-scoped, rate-limited, and never used to train models across customers.
Overview
The AI Desk Assistant is a conversational interface to your platform data: locks, exceptions, hedge view, pricing decisions, and the audit chain. Questions get grounded answers with citations into the source events. Suggested resolutions on exceptions land as drafts you can accept, edit, or reject. Every prompt and reply writes to an audit-chained AI log tagged with the operator's identity and the correlationIds of the cited records. The assistant is rate-limited to 30 requests per minute per tenant by default, and prompts and replies are never used to train models across customers.
Tenant-scoped retrieval over locks, exceptions, ratesheets, hedge view, and audit chain. The assistant cannot answer about another tenant's data.
Every claim cites correlationIds. Click to land on the source event in the audit chain — the assistant cannot hide its sources.
On the Exception Inbox, the assistant proposes a resolution as a draft. Resolve, dismiss, embed for context, or recompute — every action is audit-chained.
Token usage and call rate are capped per tenant to keep cost predictable and prevent runaway prompts.
Operational tuning happens within your tenant data only. We never aggregate prompts or replies across customers.
Every prompt and reply lands in an audit-chained AI log with operator id, scope, citations, and tokens consumed.
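To make "audit-chained" concrete, here is a minimal Python sketch of a hash-chained log. It is an illustration under stated assumptions, not the platform's implementation: the function names, the entry layout, the all-zero genesis value, and the choice of SHA-256 over canonical JSON are all hypothetical. The point is only that each entry commits to its predecessor, so editing an old entry is detectable.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder predecessor for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a payload to the chain, linking it to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Illustrative entries; the field names are assumptions, not the real schema.
log = []
append_entry(log, {"operator_id": "op_1", "scope": "desk:primary",
                   "citations": ["corr-001"], "tokens": 412})
append_entry(log, {"operator_id": "op_1", "scope": "desk:primary",
                   "citations": ["corr-002", "corr-003"], "tokens": 655})
chain_ok_before = verify_chain(log)
log[0]["payload"]["tokens"] = 1  # simulate tampering with an old entry
chain_ok_after = verify_chain(log)
```

In this sketch `chain_ok_before` is true and `chain_ok_after` is false, because the tampered payload no longer matches the hash stored with it.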
How it works
Numbered steps from input to output. Each step maps to a specific subsystem you can inspect via OpenTelemetry.
1. The operator types a question in the assistant pane, or the Exception Inbox calls /v1/ai/assistant/suggest with the exception payload.
2. Retrieval queries the lock store, ratesheet versions, hedge view, and audit chain for relevant records, only within the operator's capability scope.
3. The model is prompted with the retrieved facts and instructed to cite every claim by correlationId. A hallucination guard rejects answers without citations.
4. The UI renders each citation as a link to the audit-chain trace. Clicking the link lands on the source event.
5. Prompt, retrieved record ids, model output, citations, and tokens consumed all land in the AI log, which is hash-chained into common_audit_log.
6. On an exception: accept the suggested resolution, edit it, dismiss, or recompute. On a free-form question: drill into a citation or ask a follow-up in the same thread.
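The citation gate described in the steps above could be sketched in Python roughly as follows. This is a hedged illustration, not the product's guard: `Claim`, `UncitedAnswerError`, and `citation_gate` are hypothetical names, and the extra check that every cited id was among the retrieved records is an assumption added here, since the text only states that answers without citations are rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    correlation_ids: tuple  # ids the model cited for this claim


class UncitedAnswerError(ValueError):
    """Raised when a draft answer fails the citation gate."""


def citation_gate(claims, retrieved_ids):
    """Return the claims if every one is grounded, otherwise raise.

    A claim passes only if it cites at least one id and every cited id
    was among the records retrieved for this prompt (assumed rule).
    """
    if not claims:
        raise UncitedAnswerError("answer contains no claims")
    for claim in claims:
        if not claim.correlation_ids:
            raise UncitedAnswerError(f"uncited claim: {claim.text!r}")
        unknown = [c for c in claim.correlation_ids if c not in retrieved_ids]
        if unknown:
            raise UncitedAnswerError(f"citations not in retrieved set: {unknown}")
    return list(claims)


# Usage with made-up correlation ids.
retrieved = {"corr-001", "corr-002"}
accepted = citation_gate(
    [Claim("Lock lck_8a7c was repriced.", ("corr-001",))], retrieved
)

try:
    citation_gate([Claim("Rates will fall tomorrow.", ())], retrieved)
    rejected_uncited = False
except UncitedAnswerError:
    rejected_uncited = True
```

Raising an exception instead of silently dropping the uncited claim keeps the failure visible, so a caller can log it or retry.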
Hands on
Live cURL sample — copy, paste, ship.
# Ask the assistant a question over your desk's data
curl -X POST https://api.ratestack.com/v1/ai/assistant/ask \
  -H "X-API-Key: $RATESTACK_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "scope": "desk:primary",
    "thread": null,
    "message": "Why did pullthrough drop on the 7yr cohort this morning?"
  }'
# Returns a cited answer:
# {
#   "answer": "Pullthrough confidence on the 7yr cohort dropped 0.18 ...",
#   "citations": [
#     { "type": "lock", "id": "lck_8a7c", "correlationId": "c0c0…" },
#     { "type": "ratesheet_version", "id": "rs_9b1a", "correlationId": "c0c0…" }
#   ],
#   "auditId": "ai_log_4f2c"
# }
Why this matters
AI-suggested resolutions cut triage time on the Inbox. The operator stays in control — accept, edit, or override.
Every prompt and reply is logged with citations. Auditors get a complete picture of what AI was asked and what it claimed.
Frequently asked
No cross-tenant training, ever. Operational tuning (feature embeddings, learned-template patterns) stays inside your tenant. We do not aggregate prompts or replies for shared model improvement.
The assistant runs on a frontier large language model accessed via a vendor-neutral abstraction. The model can be pinned per environment; Enterprise customers can request a specific model floor for change-management reasons.
Today the assistant only suggests; operators act. Auto-apply is on the roadmap for narrow, audit-chained workflows, and it will be opt-in and capability-gated.
Every citation must resolve to a real record in your tenant. Citations are verified server-side before the UI renders them; an unresolved citation degrades the reply and writes an audit warning.
Every tier has a per-tenant rate cap (30 rpm by default), a per-call token cap, and a monthly cost ceiling. Exceeding the ceiling routes new prompts to a degraded mode (template-based responses) rather than producing a surprise bill.
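As a rough illustration of how a per-tenant rate cap and a cost ceiling might interact, the Python sketch below uses a sliding 60-second window and a running spend total. Everything here is an assumption made for illustration: the `TenantBudget` class, the routing labels, the dollar figure, and in particular the `rate_limited` outcome, since the text above does not say what happens when the rate cap itself is exceeded.

```python
from collections import deque


class TenantBudget:
    """Sliding-window request cap plus a monthly cost ceiling for one tenant."""

    def __init__(self, rpm_limit=30, monthly_ceiling_usd=100.0):
        self.rpm_limit = rpm_limit
        self.monthly_ceiling_usd = monthly_ceiling_usd  # hypothetical figure
        self.spent_usd = 0.0
        self._recent = deque()  # timestamps of requests in the last 60 s

    def route(self, now: float) -> str:
        """Return 'model', 'degraded', or 'rate_limited' for a prompt at `now`."""
        # Drop timestamps that have aged out of the 60-second window.
        while self._recent and now - self._recent[0] >= 60.0:
            self._recent.popleft()
        if len(self._recent) >= self.rpm_limit:
            return "rate_limited"  # assumed behavior, not stated in the source
        self._recent.append(now)
        if self.spent_usd >= self.monthly_ceiling_usd:
            return "degraded"  # template-based response, no model call
        return "model"

    def record_cost(self, usd: float) -> None:
        """Add the cost of a completed model call to the monthly total."""
        self.spent_usd += usd


# Usage: 31 prompts within 31 seconds, then one prompt after the ceiling is hit.
budget = TenantBudget(rpm_limit=30, monthly_ceiling_usd=1.0)
first_burst = [budget.route(now=float(i)) for i in range(31)]
budget.record_cost(1.0)
after_ceiling = budget.route(now=120.0)
```

Passing `now` in explicitly rather than reading the system clock keeps the sketch deterministic and easy to test.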
Ready to see it on your data?
We'll spin you a sandbox, load your actual ratesheets, and walk you through this capability against your top scenarios.