Setting up webhooks safely
Webhooks are the primary integration surface for downstream systems. Getting them wrong is the single most common production bug. Here's the playbook.
Webhook integrations fail in three classic ways: signature verification is skipped or wrong, idempotency is not enforced, and retries are not handled. Each of these is a one-line fix once you know the pattern. This guide covers all three, plus the operational practices that keep your subscriber from becoming a noisy neighbor.
1. Verify the signature
Every RateStack delivery carries an X-RateStack-Signature header (sha256 hex) and an X-RateStack-Timestamp header (Unix epoch seconds). The signature is HMAC-SHA256 over ${timestamp}.${body} using the subscription secret. Verify it. Constant-time compare. Reject if it does not match or if the timestamp is older than 5 minutes.
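    import crypto from "node:crypto";

    export function verifyWebhook(
      body: string,
      headers: Record<string, string>,
      secret: string,
    ): boolean {
      // Assumes header names are already lowercased (Node's http module does this).
      const sig = headers["x-ratestack-signature"]?.replace(/^sha256=/, "") ?? "";
      const ts = Number(headers["x-ratestack-timestamp"] ?? "0");
      if (!Number.isFinite(ts)) return false;
      // Reject anything outside the 5-minute window.
      if (Math.abs(Date.now() / 1000 - ts) > 300) return false;
      const expected = crypto
        .createHmac("sha256", secret)
        .update(`${ts}.${body}`)
        .digest("hex");
      const expectedBuf = Buffer.from(expected);
      const sigBuf = Buffer.from(sig);
      // timingSafeEqual throws if the lengths differ, so check length first.
      if (expectedBuf.length !== sigBuf.length) return false;
      return crypto.timingSafeEqual(expectedBuf, sigBuf);
    }

2. Idempotency on every consumer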
Every delivery carries an X-RateStack-Event-Id. Use it as your idempotency key. Store the eventId in your database under a uniqueness constraint. If a delivery is retried, your insert fails on that constraint and you ack 200 anyway: the ack acknowledges the eventId, not the side effect.
Critically: ack 200 even on duplicate deliveries. If you 5xx on duplicates, RateStack will keep retrying. If you 200 on duplicates, RateStack moves on.
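The dedupe-then-ack rule can be sketched as follows. This is a minimal illustration, not RateStack's reference code: the EventLedger class and handleOnce function are hypothetical names, and an in-memory Set stands in for a database table with a unique constraint on the event ID.

```typescript
// Sketch only: a Set stands in for a table with a UNIQUE constraint on event_id.
// EventLedger and handleOnce are hypothetical names used for illustration.
type Ack = { status: number; duplicate: boolean };

class EventLedger {
  private readonly seen = new Set<string>();

  // Returns true if the eventId was newly recorded, false if it already existed
  // (the analogue of an INSERT failing on the unique constraint).
  record(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }
}

function handleOnce(
  ledger: EventLedger,
  eventId: string,
  sideEffect: () => void,
): Ack {
  const isNew = ledger.record(eventId);
  if (isNew) sideEffect(); // run the side effect only for first-time deliveries
  return { status: 200, duplicate: !isNew }; // ack 200 either way
}
```

In a real subscriber the Set would be replaced by the insert itself, so that two concurrent deliveries of the same event cannot both pass the check.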
3. Handle retries
Retries back off on a roughly exponential schedule: 30s, 1m, 2m, 5m, 15m, 30m, 1h, 4h. Up to 8 attempts. After that, the delivery moves to the DLQ. RateStack guarantees in-order delivery per subscription on the happy path, but a retried delivery can arrive after events that were emitted later, so design your subscriber to tolerate out-of-order arrivals.
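To get a feel for how long a failing subscriber has before deliveries reach the DLQ, the schedule can be summed. The sketch below assumes each listed delay is measured from the previous attempt and that all eight delays are used; the text does not say whether "8 attempts" includes the initial delivery, so treat the total as an estimate. The function names are illustrative.

```typescript
// Retry delays in seconds, copied from the schedule above.
const retryDelaysSec: readonly number[] = [30, 60, 120, 300, 900, 1800, 3600, 14400];

// Cumulative offset (seconds after the first failed attempt) at which each retry fires,
// assuming each delay starts when the previous attempt fails.
function retryOffsets(delays: readonly number[]): number[] {
  let elapsed = 0;
  return delays.map((d) => {
    elapsed += d;
    return elapsed;
  });
}

function formatDuration(totalSec: number): string {
  const h = Math.floor(totalSec / 3600);
  const m = Math.floor((totalSec % 3600) / 60);
  const s = totalSec % 60;
  return `${h}h ${m}m ${s}s`;
}

const totalRetryWindowSec = retryDelaysSec.reduce((sum, d) => sum + d, 0);
console.log(formatDuration(totalRetryWindowSec)); // 5h 53m 30s
```

Under those assumptions, an outage shorter than roughly six hours should be absorbed by retries, and anything longer will start landing in the DLQ.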
4. Don't do work in the request handler
Ack the webhook, enqueue the work. Synchronous heavy work in the webhook handler causes timeouts (RateStack times out at 30 seconds), which causes retries, which causes back-pressure. The pattern is: receive → verify → dedupe → enqueue → ack 200. The actual processing happens off the critical path.
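The receive → verify → dedupe → enqueue → ack sequence might look like the sketch below. Everything in it is hypothetical scaffolding: the Delivery shape, the injected verify function, the in-memory Set and array standing in for a database and a durable queue, and the 401 status for failed verification (the guide only says to reject).

```typescript
// Sketch of receive → verify → dedupe → enqueue → ack.
// All names are illustrative; swap the Set and array for a real store and queue.
interface Delivery {
  eventId: string; // value of the X-RateStack-Event-Id header
  body: string; // raw request body, unparsed
}

interface PipelineDeps {
  verify: (d: Delivery) => boolean; // e.g. a wrapper around signature verification
  seen: Set<string>; // stand-in for a unique-constrained table
  queue: Delivery[]; // stand-in for a durable job queue
}

function receiveDelivery(d: Delivery, deps: PipelineDeps): number {
  // Reject bad signatures or stale timestamps. 401 is an assumption.
  if (!deps.verify(d)) return 401;
  // Duplicate delivery: ack without enqueueing again.
  if (deps.seen.has(d.eventId)) return 200;
  deps.seen.add(d.eventId);
  // Hand off to a worker; no heavy processing happens in the handler.
  deps.queue.push(d);
  return 200;
}
```

With a real database and queue, recording the event ID and enqueueing the job should succeed or fail together. Otherwise a crash between the two steps would cause the retry to be treated as a duplicate and the work would be lost.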
5. Operational hygiene
Health-check your webhook endpoint and alert on the RateStack burndown (your delivery success rate is queryable via the API). When deploying subscribers, deploy the new version first, drain the old, then switch DNS; never the other way around. The 5-minute timestamp window tolerates clock skew between RateStack and your subscriber, but only if your clocks are in fact roughly in sync.
6. The DLQ is your friend
When something goes wrong, the DLQ is where the payloads end up. Replay works fine: the DLQ replay endpoint is rate-limited, so it does not stampede your subscriber when you're recovering from a bad deploy. Make a habit of inspecting the DLQ weekly; it is the single best indicator of subscriber health.
Common mistakes
- Comparing signatures with === (timing attack).
- Skipping the timestamp window check.
- Idempotency by request body hash instead of eventId (false negatives on retries).
- Returning 5xx on duplicates.
- Synchronous external calls in the webhook handler.
Get these right and webhooks are boring. They'll be the most reliable part of your integration. Get them wrong and you'll spend a lot of time debugging things that the platform is not actually doing wrong.