RateStack
Capability · Observability

OpenTelemetry, correlationId, and a tamper-evident audit log.

Every request carries a correlationId from the moment it lands until the last webhook delivery. OpenTelemetry traces flow to your collector. The audit log is hash-chained — any post-write mutation is detectable.

Overview

What it is, in one paragraph

Every inbound API request mints a correlationId. It flows through Slf4j MDC, every NATS event payload, every webhook delivery header, every audit row. Micrometer Tracing exports OTLP to your collector. The platform's audit log is append-only with SHA-256 hash chaining — `previous_hash` and `entry_hash` per row — so any mutation after the fact is detectable by recomputing the chain.

  • correlationId everywhere

    Minted at API ingress, then propagated via headers, MDC, NATS payloads, webhook deliveries, and audit rows. Use it to join logs, traces, events, and webhooks.

  • OpenTelemetry export

    OTLP-format spans flow to your collector. The sample rate is configurable; the default is 10%, and errors are always force-sampled.

  • Audit hash chain

    common_audit_log is append-only. Each row carries previous_hash; entry_hash = SHA-256(previous_hash || canonical(row)). Verify with /v1/admin/audit/verify.

  • RFC 7807 errors carry it

    Every error response includes the correlationId field so support tickets translate directly into log queries.

  • PII redaction

    PiiRedactor strips emails, phones, SSN-shaped numbers, and PANs before logs and audit rows persist. Trace correlation works without leaking borrower identity.

  • Per-event metrics

    Micrometer counters and timers on every event class. Shipped through the OTLP exporter; visible in any Prometheus-compatible backend.
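
The "correlationId everywhere" idea can be sketched in a few lines. This is an illustration only: Python's contextvars and logging stand in for SLF4J MDC, and the names begin_request, CorrelationFilter, and the ID syntax rule are hypothetical, not the platform's actual code or validation rule.

```python
import contextvars
import io
import logging
import re
import uuid

# Holds the current request's correlationId, similar in spirit to an MDC entry.
_correlation_id = contextvars.ContextVar("correlation_id", default="-")

# Hypothetical sanity check; the platform's real syntax rule is not documented here.
_VALID_ID = re.compile(r"[A-Za-z0-9_-]{8,64}")


def begin_request(headers):
    """Honor a well-formed X-Correlation-Id, otherwise mint a fresh one."""
    supplied = headers.get("X-Correlation-Id", "")
    cid = supplied if _VALID_ID.fullmatch(supplied) else uuid.uuid4().hex
    _correlation_id.set(cid)
    return cid


class CorrelationFilter(logging.Filter):
    """Copies the context value onto every log record."""

    def filter(self, record):
        record.correlation_id = _correlation_id.get()
        return True


def build_logger(stream):
    """Return a logger whose lines are prefixed with the correlationId."""
    logger = logging.getLogger("correlation-demo")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.addFilter(CorrelationFilter())
    handler.setFormatter(logging.Formatter("%(correlation_id)s %(message)s"))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
log = build_logger(stream)
cid = begin_request({"X-Correlation-Id": "01J2K8RJYP4Z6N5GA2T9DH8C0H"})
log.info("pricing.computed emitted")
```

Because the ID lives in request-scoped context rather than in function arguments, every log line written while handling the request picks it up without the calling code passing it around.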

How it works

The pipeline, end to end.

Numbered steps from input to output. Each step maps to a specific subsystem you can inspect via OpenTelemetry.

  1. Correlation at ingress

    api-service mints a correlationId on each request (or honors X-Correlation-Id if you supply one). It enters MDC immediately and flows through every span.

  2. Propagate through events

    When the engine emits pricing.computed on NATS, the payload header carries the correlationId. Downstream consumers (webhook-service, audit) preserve it on their own emissions.

  3. Tracing flows to your collector

    The OTLP exporter ships spans at the configured sample rate. Errors are force-sampled. Any OTLP-compatible backend works (Tempo, Honeycomb, Datadog APM).

  4. Audit writes are hashed

    Every state change writes an audit row with previous_hash → entry_hash. The hash is computed over the canonical-JSON form so reordering keys does not change the hash.

  5. Verify the chain

    GET /v1/admin/audit/verify recomputes the chain and returns {checked, firstBreakAt, ok}. Run it on a schedule from your monitoring.

  6. Tickets become log queries

    When a customer reports an issue, ask for the correlationId from the error response. One query, full timeline.
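
Steps 4 and 5 can be sketched locally. This is a minimal model under stated assumptions, not the platform's implementation: the canonical form is taken to be JSON with sorted keys and no extra whitespace, the first row's previous_hash is assumed to be 64 zeros, and firstBreakAt is assumed to be a zero-based row index. The helper names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed previous_hash for the very first row


def canonical(row):
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    return json.dumps(row, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def entry_hash(previous_hash, row):
    # entry_hash = SHA-256(previous_hash || canonical(row))
    data = (previous_hash + canonical(row)).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append(chain, row):
    """Append a row, linking it to the hash of the row before it."""
    previous = chain[-1]["entry_hash"] if chain else GENESIS
    chain.append({"row": row, "previous_hash": previous,
                  "entry_hash": entry_hash(previous, row)})


def verify(chain):
    """Recompute every link and report the first broken row, if any."""
    expected_previous = GENESIS
    for index, entry in enumerate(chain):
        link_ok = entry["previous_hash"] == expected_previous
        hash_ok = entry["entry_hash"] == entry_hash(entry["previous_hash"], entry["row"])
        if not (link_ok and hash_ok):
            return {"checked": index + 1, "firstBreakAt": index, "ok": False}
        expected_previous = entry["entry_hash"]
    return {"checked": len(chain), "firstBreakAt": None, "ok": True}


chain = []
for status in ("PRICED", "LOCKED", "DELIVERED"):
    append(chain, {"loanId": "L-1", "status": status})

clean = verify(chain)                  # ok is True, nothing broken

chain[1]["row"]["status"] = "EXPIRED"  # mutate a row after the fact
tampered = verify(chain)               # breaks at row 1: stored hash no longer matches

# Recomputing only that row's hash moves the break to the next row's previous_hash.
chain[1]["entry_hash"] = entry_hash(chain[1]["previous_hash"], chain[1]["row"])
patched = verify(chain)                # breaks at row 2
```

Sorting keys before hashing is what makes key order irrelevant: two rows with the same fields in a different order serialize to the same bytes and therefore the same hash.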

Hands on

An RFC 7807 error with correlationId

Live HTTP sample — copy, paste, ship.

# Every problem+json error includes the correlationId
HTTP/1.1 422 Unprocessable Entity
content-type: application/problem+json

{
  "type": "https://api.ratestack.com/errors/loan-validation",
  "title": "Loan failed validation",
  "status": 422,
  "detail": "Borrower FICO is required for conforming loans.",
  "instance": "/v1/pricing",
  "correlationId": "01J2K8RJYP4Z6N5GA2T9DH8C0H",
  "violations": [
    { "field": "borrowers[0].fico", "code": "REQUIRED", "message": "must be present" }
  ]
}
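
On the client side, pulling the correlationId out of an error like the one above might look as follows. This is a sketch: summarize_problem is a hypothetical helper, and it assumes only the fields shown in the sample response.

```python
import json

# The problem+json body from the sample above.
RAW_BODY = """
{
  "type": "https://api.ratestack.com/errors/loan-validation",
  "title": "Loan failed validation",
  "status": 422,
  "detail": "Borrower FICO is required for conforming loans.",
  "instance": "/v1/pricing",
  "correlationId": "01J2K8RJYP4Z6N5GA2T9DH8C0H",
  "violations": [
    { "field": "borrowers[0].fico", "code": "REQUIRED", "message": "must be present" }
  ]
}
"""


def summarize_problem(raw):
    """Extract what a support ticket needs from a problem+json body."""
    problem = json.loads(raw)
    return {
        "status": problem["status"],
        "title": problem["title"],
        "correlationId": problem["correlationId"],
        # violations is treated as optional; other error types may omit it.
        "fields": [v["field"] for v in problem.get("violations", [])],
    }


summary = summarize_problem(RAW_BODY)
```

Attaching summary["correlationId"] to the ticket gives support the one value they need to search logs and traces.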

Why this matters

The pain it removes.

Joinable timelines

Pricing → lock → webhook → subscriber. One correlationId joins them all in your logs and your traces.

Defensible audit

The hash chain means a regulator's question, "has this row been mutated?", has a binary answer, not a guess.

On-call sanity

Sampled OTLP traces + correlationId + the audit chain means most production incidents resolve from one console search.

Frequently asked

Direct answers, no marketing spin.

What backend do you recommend for traces?

Anything that speaks OTLP. We have happy customers on Tempo, Honeycomb, Datadog, and self-hosted Jaeger. The exporter is configurable per environment.

Does the audit chain prevent a malicious DBA from tampering?

It makes tampering detectable. A DBA who can update the table can also recompute that row's entry_hash; what they cannot do is change a row silently, because the next row's previous_hash would then no longer match. We treat the verify endpoint as a hard check.

Can I supply my own correlationId?

Yes. Pass X-Correlation-Id on inbound requests. We honor it (after a syntax sanity check) and propagate. Useful when your LOS already mints one.
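
As a rough illustration of supplying your own ID, the request below is built but not sent. The base URL is a placeholder, and the POST method, JSON payload shape, and ID value are assumptions made for the example; only the X-Correlation-Id header and the /v1/pricing path come from this page.

```python
import json
import urllib.request

BASE_URL = "https://sandbox.example.invalid"  # placeholder host, not a real endpoint


def build_pricing_request(payload, correlation_id):
    """Build (but do not send) a request that carries a caller-supplied correlationId."""
    return urllib.request.Request(
        BASE_URL + "/v1/pricing",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Correlation-Id": correlation_id,
        },
        method="POST",  # assumed; the page does not state the method
    )


# Hypothetical payload and an ID minted by the caller's own LOS.
request = build_pricing_request({"loanAmount": 400000}, "los-7f3c2a9e41")
```

Passing the request to urllib.request.urlopen would send it; reusing the ID your LOS already logs means one value joins records on both sides.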

What's logged vs. what's audited?

Logs are operational and PII-redacted. Audits are state-change records with hash linking. Logs go to your log backend; audits live in MySQL indefinitely, subject to retention policy.

Ready to see it on your data?

Wire OpenTelemetry, correlationId, and a tamper-evident audit log up to your real workflow.

We'll spin you a sandbox, load your actual ratesheets, and walk you through this capability against your top scenarios.
