Loan-level pricing, end to end

What a modern mortgage pricing engine actually does — from ratesheet ingestion through eligibility, ladder execution, margin layering, and the per-rule trace that keeps the answer defensible.

By RateStack Team · Published 2026-05-03 · Reviewed 2026-05-03 · 18 min read

What loan-level pricing is

Loan-level pricing is the act of converting a specific loan profile — borrower attributes, loan attributes, property attributes, transaction attributes, and execution attributes — into a set of investor-specific quotes. Each quote is a tuple of (note rate, base price, total adjustments, final price) plus the ordered rule chain that produced it. The engine does this against every eligible investor in parallel and returns the result ranked by execution criteria.

The phrase "loan-level" matters. Servicing-released bulk pricing and bid-tape pricing happen at the loan-pool level — a different problem with different inputs. Loan-level pricing is what an originator runs at the moment of borrower interaction.

Inputs

A canonical loan profile in a modern engine consists of five facets:

Borrower: FICO, citizenship/residency, self-employment, income for affordability programs.
Loan: amount, type (conforming, jumbo, FHA, VA, USDA, non-QM), amortization (fixed, ARM index/margin), term in months.
Property: type (SFR, condo, 2-4 unit, manufactured), occupancy (primary, second, investment), state, county, appraised value.
Transaction: purchase, rate-and-term refinance, cash-out refinance, with corresponding sub-attributes.
Execution: lock period in days, lender-paid versus borrower-paid mortgage insurance, buydown structure.

These five facets together are sufficient input for the eligibility predicates and adjustment rules of nearly every program in production. Where a program needs something more (cash reserves for jumbos, time on current job for non-QM, etc.), it's a custom field on the borrower facet — a row of configuration, not engine code.
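One way to picture the five facets is as a nested record with an open-ended custom-field map on the borrower. The sketch below is illustrative only: the class names, field names, and the `custom` map are assumptions made for this example, not a published schema, and LTV is derived from appraised value alone as a simplification.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Borrower:
    fico: int
    self_employed: bool = False
    # Program-specific extras (cash reserves, time on job, ...) are data,
    # so a new one is a configuration row rather than an engine change.
    custom: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Loan:
    amount: int
    loan_type: str      # e.g. "conforming", "jumbo", "non_qm"
    term_months: int


@dataclass(frozen=True)
class SubjectProperty:
    property_type: str  # e.g. "SFR", "condo"
    occupancy: str      # "primary", "second", "investment"
    state: str
    appraised_value: int


@dataclass(frozen=True)
class LoanProfile:
    borrower: Borrower
    loan: Loan
    subject_property: SubjectProperty
    purpose: str        # transaction facet, e.g. "purchase"
    lock_days: int      # execution facet

    def ltv_percent(self) -> float:
        """Loan-to-value as a percentage (simplified: appraised value only)."""
        return 100 * self.loan.amount / self.subject_property.appraised_value


profile = LoanProfile(
    borrower=Borrower(fico=742, custom={"cash_reserves_months": 9}),
    loan=Loan(amount=1_200_000, loan_type="jumbo", term_months=360),
    subject_property=SubjectProperty("SFR", "primary", "CA", 1_500_000),
    purpose="purchase",
    lock_days=30,
)
```

A jumbo program that needs reserves reads `profile.borrower.custom["cash_reserves_months"]`; programs that don't care never see the field.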

Stage 1: eligibility

Running full pricing against programs the borrower can't take is wasted work. Modern engines run a cheap first pass against every program's eligibility predicates: occupancy match, loan-type match, FICO band, LTV range, DTI ceiling, document type, state allowlist, AMI (area median income) cap, and any custom field the program declares. Each failure is recorded with the specific predicate that failed, its threshold, and the loan's actual value. Survivors progress to Stage 2.

The downstream value of Stage 1 isn't latency; it's visibility. When a loan officer (LO) can see in seconds that a loan misses an investor's program by a 1% LTV gap, the conversation shifts from "why didn't this investor quote?" to "here's the adjustment we need to make." Lock-day surprises drop.
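A first-pass eligibility check of this kind might look like the sketch below. The predicate helpers (`at_most`, `at_least`, `one_of`), the `Failure` record, and the flat dictionary of loan values are hypothetical names chosen for illustration. The point is that each failure carries the predicate, the threshold, and the loan's actual value.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Failure:
    predicate: str
    threshold: Any
    actual: Any


# A predicate returns None on pass, or a Failure describing the miss.
Predicate = Callable[[dict], Optional[Failure]]


def at_most(name: str, key: str, ceiling: float) -> Predicate:
    def check(loan: dict) -> Optional[Failure]:
        actual = loan.get(key)
        # A missing value fails the predicate rather than passing silently.
        if actual is None or actual > ceiling:
            return Failure(name, ceiling, actual)
        return None
    return check


def at_least(name: str, key: str, floor: float) -> Predicate:
    def check(loan: dict) -> Optional[Failure]:
        actual = loan.get(key)
        if actual is None or actual < floor:
            return Failure(name, floor, actual)
        return None
    return check


def one_of(name: str, key: str, allowed: frozenset) -> Predicate:
    def check(loan: dict) -> Optional[Failure]:
        actual = loan.get(key)
        if actual not in allowed:
            return Failure(name, sorted(allowed), actual)
        return None
    return check


def check_eligibility(loan: dict, predicates: list[Predicate]) -> list[Failure]:
    """Run every predicate; an empty list means the program survives."""
    results = (predicate(loan) for predicate in predicates)
    return [failure for failure in results if failure is not None]


program = [
    at_least("fico_min", "fico", 680),
    at_most("ltv_max", "ltv", 80.0),
    one_of("state_allowlist", "state", frozenset({"CA", "TX", "WA"})),
]
failures = check_eligibility({"fico": 720, "ltv": 81.0, "state": "CA"}, program)
```

Here `failures` holds a single entry for `ltv_max` with threshold 80.0 and actual 81.0, which is the 1% gap an LO can act on.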

Stage 2: ladder execution

For each surviving program, the engine loads the active ratesheet (cached aggressively; see the caching section) and either looks up the base price at the requested note rate or scans the rate ladder around a target. The base price is the starting point.

From there, the engine walks the program's adjustment rules in their declared order, checking each rule's condition against the loan facets and applying the matched rule's value with its declared combine strategy. Adjustments may be in basis points (bp) or in price points; the engine normalizes both into a price contribution that's additive to the running cumulative.
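As a worked example of the lookup and normalization steps, the sketch below prices one rung of a made-up ladder. The ladder values, the `(value, unit)` adjustment shape, and the function names are assumptions for illustration. It also assumes a basis-point adjustment means basis points of price, so 100 bp equals 1 price point, and it only handles additive adjustments.

```python
from decimal import Decimal

# Hypothetical ladder for one program: note rate -> base price.
LADDER = {
    Decimal("6.500"): Decimal("99.250"),
    Decimal("6.625"): Decimal("99.875"),
    Decimal("6.750"): Decimal("100.500"),
}


def to_price_points(value: Decimal, unit: str) -> Decimal:
    """Normalize an adjustment to price points (assumes 100 bp = 1 point)."""
    if unit == "points":
        return value
    if unit == "bp":
        return value / Decimal(100)
    raise ValueError(f"unknown unit: {unit}")


def price_at_rate(ladder, note_rate, adjustments):
    """Base price at the requested rate plus additive adjustments, in order."""
    running = ladder[note_rate]  # KeyError if the rate is not on the ladder
    for value, unit in adjustments:
        running += to_price_points(value, unit)
    return running


final = price_at_rate(
    LADDER,
    Decimal("6.625"),
    [(Decimal("-25"), "bp"), (Decimal("-0.375"), "points")],
)
```

With these numbers the base price of 99.875 takes a 0.250 hit and a 0.375 hit, so `final` is 99.250. `Decimal` is used so the arithmetic stays exact.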

Combine strategies

Rules don't always sum. Different mortgage products use different combine strategies, and a serious engine supports them all:

SUM: the rule's value adds to the running cumulative. The most common case.
MAX: among rules in this group, take the largest value. Used for capped overlays where only one rule in the group applies instead of the sum of every match.
MIN: among rules in this group, take the smallest value (the most negative, so the least beneficial to price). Mirrors MAX semantically.
OVERRIDE: this rule replaces the running cumulative with its value, ignoring prior rule contributions. Used for special programs that supersede normal pricing.
REPLACE_DIMENSION: replaces a specific dimension of the price (e.g., rate-only or price-only) without affecting the other. Useful for product-specific overlays.
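The first four strategies can be sketched as a single fold over the matched rules. This is a minimal illustration under stated assumptions: values are signed price points, MAX and MIN rules carry a group key, each group contributes once after the pass, and OVERRIDE resets the adjustment total. `Combine` and `total_adjustment` are hypothetical names, and REPLACE_DIMENSION is left out because it needs a two-dimensional (rate, price) state.

```python
from decimal import Decimal
from enum import Enum


class Combine(Enum):
    SUM = "SUM"
    MAX = "MAX"
    MIN = "MIN"
    OVERRIDE = "OVERRIDE"


def total_adjustment(matched):
    """Fold matched rules, given as (strategy, group, value), in declared order."""
    total = Decimal(0)
    group_pick = {}  # (strategy, group) -> value currently chosen for that group
    for strategy, group, value in matched:
        if strategy is Combine.SUM:
            total += value
        elif strategy is Combine.OVERRIDE:
            # Everything contributed so far is discarded.
            total = value
            group_pick.clear()
        else:
            choose = max if strategy is Combine.MAX else min
            key = (strategy, group)
            group_pick[key] = choose(group_pick.get(key, value), value)
    return total + sum(group_pick.values(), Decimal(0))


D = Decimal
overlay = [
    (Combine.SUM, "", D("-0.250")),
    (Combine.MIN, "occupancy", D("-0.375")),
    (Combine.MIN, "occupancy", D("-1.000")),
    (Combine.SUM, "", D("-0.500")),
]
special = [
    (Combine.SUM, "", D("-0.250")),
    (Combine.OVERRIDE, "", D("0.500")),
    (Combine.SUM, "", D("-0.125")),
]
```

In `overlay`, the two SUM rules contribute -0.750 and the MIN group contributes only its most negative member, -1.000. In `special`, the OVERRIDE discards the earlier -0.250 and the later SUM still applies.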

Rules use simple operators against fields: EQ, NEQ, LT, LTE, GT, GTE, BETWEEN, IN, NOT_IN, EXISTS, IS_NULL, and CONTAINS — composed with AND/OR. The combination is sufficient to express the full GSE LLPA grids (Fannie Mae and Freddie Mac publish their LLPAs as conditional grids on FICO × LTV × occupancy × purpose × product), as well as investor overlays.
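A condition tree built from these operators can be evaluated with a small recursive function. The sketch below assumes conditions are stored as nested dictionaries with `field`, `op`, and `value` keys, that BETWEEN is inclusive at both ends, and that a comparison against a missing field never matches. Those are illustrative choices, not a documented format, and the grid cell shown is made up.

```python
import operator

_MISSING = object()

COMPARATORS = {
    "EQ": operator.eq,
    "NEQ": operator.ne,
    "LT": operator.lt,
    "LTE": operator.le,
    "GT": operator.gt,
    "GTE": operator.ge,
    "BETWEEN": lambda value, bounds: bounds[0] <= value <= bounds[1],
    "IN": lambda value, options: value in options,
    "NOT_IN": lambda value, options: value not in options,
    "CONTAINS": lambda value, item: item in value,
}


def evaluate(cond: dict, facts: dict) -> bool:
    """Evaluate a condition tree against a flat dictionary of loan facts."""
    if "AND" in cond:
        return all(evaluate(child, facts) for child in cond["AND"])
    if "OR" in cond:
        return any(evaluate(child, facts) for child in cond["OR"])
    value = facts.get(cond["field"], _MISSING)
    op = cond["op"]
    if op == "EXISTS":
        return value is not _MISSING
    if op == "IS_NULL":
        return value is _MISSING or value is None
    if value is _MISSING or value is None:
        return False  # assumption: comparisons on absent data do not match
    return COMPARATORS[op](value, cond["value"])


# One hypothetical grid cell: a FICO band crossed with an LTV band and purpose.
cell = {
    "AND": [
        {"field": "fico", "op": "BETWEEN", "value": (700, 719)},
        {"field": "ltv", "op": "GT", "value": 75},
        {"field": "ltv", "op": "LTE", "value": 80},
        {"field": "purpose", "op": "IN", "value": {"purchase", "rate_term_refi"}},
    ]
}
matched = evaluate(cell, {"fico": 710, "ltv": 78.5, "purpose": "purchase"})
```

A full grid is then many such cells, each paired with an adjustment value, stored as rows of configuration.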

Margin layering

Investor pricing produces a wholesale view of the price. Retail pricing adds the lender's margin. Margin in a serious engine is itself data-driven: rules attach at the platform, org, entity, and loan-officer levels with effective-dated inheritance. (Entity covers BRANCH, REGION, TEAM, DIVISION, and CORPORATE so multi-state and multi-layer orgs can model their structure without contortions.) At pricing time the engine resolves the applicable margin stack for the loan's officer, entity, and org as of the relevant timestamp, then layers each rule with its declared combine strategy. The trace shows which level contributed how much and in what order, which matters for compensation accounting and for answering "why does this entity see a different price than that one?"
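Resolving an effective-dated margin stack might be sketched as follows. The `MarginRule` shape, the level ordering, the half-open effective range, and the treatment of margin as a signed price contribution are all assumptions for this example. Only SUM layering is shown.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional

LEVEL_ORDER = ("PLATFORM", "ORG", "ENTITY", "LOAN_OFFICER")


@dataclass(frozen=True)
class MarginRule:
    level: str
    owner_id: str
    points: Decimal                      # signed price contribution
    effective_from: date
    effective_to: Optional[date] = None  # None means open-ended

    def active_on(self, as_of: date) -> bool:
        if as_of < self.effective_from:
            return False
        return self.effective_to is None or as_of < self.effective_to


def layer_margin(wholesale, rules, owners, as_of):
    """Return (retail price, per-level trace) using SUM layering only."""
    stack = sorted(
        (r for r in rules if owners.get(r.level) == r.owner_id and r.active_on(as_of)),
        key=lambda r: LEVEL_ORDER.index(r.level),
    )
    running, trace = wholesale, []
    for rule in stack:
        running += rule.points
        trace.append((rule.level, rule.points, running))
    return running, trace


rules = [
    MarginRule("PLATFORM", "platform", Decimal("-0.125"), date(2025, 1, 1)),
    MarginRule("ORG", "org-1", Decimal("-1.000"), date(2025, 1, 1)),
    MarginRule("ENTITY", "branch-7", Decimal("-0.250"), date(2025, 1, 1), date(2026, 1, 1)),
    MarginRule("ENTITY", "branch-7", Decimal("-0.500"), date(2026, 1, 1)),
    MarginRule("LOAN_OFFICER", "lo-42", Decimal("-0.250"), date(2025, 6, 1)),
]
owners = {"PLATFORM": "platform", "ORG": "org-1", "ENTITY": "branch-7", "LOAN_OFFICER": "lo-42"}
retail, trace = layer_margin(Decimal("101.000"), rules, owners, date(2026, 3, 1))
```

Pricing the same loan as of mid-2025 picks up the earlier branch rule instead, which is what makes the stack replayable.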

Pricing modes: BEST_EX, BY_RATE, BY_PRICE

One engine, three views:

BEST_EX (best execution): for each eligible investor, the engine runs the full ladder and picks the (rate, price) pair that maximizes the final price after all adjustments and margin layers. The result ranks investors top-down. This is the LO's default.
BY_RATE: pin a rate. For each eligible investor, the engine returns the price at that rate (or the rate ladder around it). Used when the borrower has already fixed the rate.
BY_PRICE: pin a target price (par or a specific premium). For each eligible investor, the engine returns the rate ladder around that price. Used in net-cost-driven shopping conversations.

Brokers tend to live in BY_PRICE; correspondents in BEST_EX; lock desks in BY_RATE. The mode is just a filter on top of the same ladder.
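To make the "same ladder, three views" idea concrete, here is a rough sketch over made-up final prices. The data, the function names, and the BY_PRICE reading (lowest rate whose price meets the target) are assumptions. `best_ex` follows the description above literally: highest final price per investor, then ranked.

```python
from decimal import Decimal

# Hypothetical final prices after adjustments and margin: investor -> {rate: price}.
LADDERS = {
    "investor_a": {Decimal("6.500"): Decimal("99.500"), Decimal("6.625"): Decimal("100.125")},
    "investor_b": {Decimal("6.500"): Decimal("99.750"), Decimal("6.625"): Decimal("100.000")},
}


def by_rate(ladders, rate):
    """Price each investor offers at a pinned rate; investors without it are skipped."""
    return {name: ladder[rate] for name, ladder in ladders.items() if rate in ladder}


def by_price(ladders, target):
    """Lowest rate per investor whose price meets the target, if any."""
    result = {}
    for name, ladder in ladders.items():
        rates = [rate for rate, price in ladder.items() if price >= target]
        if rates:
            result[name] = min(rates)
    return result


def best_ex(ladders):
    """Per investor, the rung with the highest final price; best investor first."""
    picks = [
        (name, *max(ladder.items(), key=lambda rung: rung[1]))
        for name, ladder in ladders.items()
    ]
    return sorted(picks, key=lambda pick: pick[2], reverse=True)


pinned = by_rate(LADDERS, Decimal("6.500"))
targeted = by_price(LADDERS, Decimal("99.625"))
ranked = best_ex(LADDERS)
```

All three functions read the same `LADDERS` structure; none of them reprices anything.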

The trace

The single most consequential design decision in a pricing engine is whether the engine emits a per-rule trace as a side-effect of doing the math, or whether the trace is a separate "explain" pass that can drift from the regular pricing pass.

A trace, done correctly, is an ordered list of trace lines, one per rule that fired. Each line carries the rule's id, its human-readable description, the condition that matched, the combine strategy used, the value contributed, and the running cumulative price after the rule fires. The line is built in lockstep with price computation; there is no separate code path for "explain mode."
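The lockstep idea can be shown in a few lines: the same loop that changes the price appends the trace line, so the two cannot disagree. This sketch assumes SUM-only rules and uses hypothetical `Rule` and `TraceLine` shapes with made-up adjustment values.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    condition: str                   # human-readable form of the predicate
    matches: Callable[[dict], bool]
    value: Decimal                   # price points; SUM semantics in this sketch


@dataclass(frozen=True)
class TraceLine:
    rule_id: str
    description: str
    condition: str
    strategy: str
    value: Decimal
    cumulative: Decimal


def price_with_trace(base, rules, loan):
    running = base
    trace = []
    for rule in rules:
        if not rule.matches(loan):
            continue
        running += rule.value
        # The trace line is appended at the moment the price changes.
        trace.append(
            TraceLine(rule.rule_id, rule.description, rule.condition, "SUM", rule.value, running)
        )
    return running, trace


rules = [
    Rule("R1", "FICO below 720", "fico LT 720", lambda loan: loan["fico"] < 720, Decimal("-0.500")),
    Rule("R2", "Investment property", "occupancy EQ investment",
         lambda loan: loan["occupancy"] == "investment", Decimal("-1.750")),
    Rule("R3", "Cash-out refinance", "purpose EQ cash_out",
         lambda loan: loan["purpose"] == "cash_out", Decimal("-0.375")),
]
loan = {"fico": 700, "occupancy": "primary", "purpose": "cash_out"}
price, trace = price_with_trace(Decimal("100.250"), rules, loan)
```

The last trace line's `cumulative` always equals the returned price, which is a cheap invariant to assert in tests.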

The downstream payoff is enormous. Compliance reads the trace verbatim. LOs use it to debug surprising results. Auditors stop asking "why" because the answer is in the response payload. The engine performance cost is a few percent — well worth it.

Historical replay

Every quote should be reproducible. If a regulator asks how a loan priced on a specific date 18 months ago, the engine should return the same numbers it returned then — using the ratesheet that was active on that date, the comp/margin rules effective on that date, and the program rules applicable on that date.

This works only if the underlying data is versioned. Ratesheets must be immutable per version; comp and margin rules must carry effective ranges; program eligibility must be timestamped. None of this is exotic — it's the same posture you'd apply to any financial-system-of-record. But most pricing engines were not built this way and cannot replay.
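The versioning requirement reduces to an "as of" lookup over an append-only history. A minimal in-memory sketch, assuming versions are activated in time order and never edited afterward (the `VersionedStore` class is hypothetical, and a real system would back this with the database):

```python
import bisect
from datetime import datetime


class VersionedStore:
    """Append-only history of versions, keyed by activation time."""

    def __init__(self):
        self._times = []
        self._values = []

    def activate(self, at, value):
        if self._times and at <= self._times[-1]:
            raise ValueError("versions must be appended in time order")
        self._times.append(at)
        self._values.append(value)

    def as_of(self, when):
        """Return the version that was active at `when`."""
        index = bisect.bisect_right(self._times, when)
        if index == 0:
            raise LookupError("nothing was active at that time")
        return self._values[index - 1]


ratesheets = VersionedStore()
ratesheets.activate(datetime(2024, 11, 4, 9, 0), "ratesheet-v1")
ratesheets.activate(datetime(2024, 11, 5, 9, 0), "ratesheet-v2")
replayed = ratesheets.as_of(datetime(2024, 11, 4, 15, 30))
```

Replaying a quote then means running the normal pricing path with every lookup routed through `as_of(quote_time)` for ratesheets, margin rules, and program rules alike.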

Caching

Latency in a pricing engine is dominated by ratesheet I/O. A serious architecture uses a three-tier cache:

A local in-process cache (Caffeine or equivalent) with a short TTL (~60 seconds) for the hot path.
A shared Redis cluster cache for the warm path across replicas.
The MySQL source of truth for the cold path.

Critically, cache invalidation must be event-driven, not TTL-driven. When a ratesheet is activated, an event fires; every replica drops the relevant cache entry; subsequent reads see the new ratesheet within milliseconds. TTL-driven invalidation means a stale ratesheet can serve for the cache duration after a rollback — unacceptable.
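The interaction between a short TTL and event-driven invalidation can be sketched with the local tier alone. In this illustration the `loader` callable stands in for the Redis and MySQL tiers, and `on_ratesheet_activated` is a hypothetical handler that each replica would wire to the activation event. None of these names come from a real library.

```python
import time
from typing import Any, Callable


class LocalRatesheetCache:
    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, program_id: str) -> Any:
        now = self._clock()
        entry = self._entries.get(program_id)
        if entry is not None and now < entry[0]:
            return entry[1]                      # hot path: unexpired local hit
        value = self._loader(program_id)         # warm or cold path
        self._entries[program_id] = (now + self._ttl, value)
        return value

    def on_ratesheet_activated(self, program_id: str) -> None:
        # Event-driven invalidation: drop the entry now instead of waiting for the TTL.
        self._entries.pop(program_id, None)


source = {"conv30": "ratesheet-v1"}
cache = LocalRatesheetCache(loader=source.__getitem__)
first = cache.get("conv30")
source["conv30"] = "ratesheet-v2"       # a new ratesheet is activated upstream
stale = cache.get("conv30")             # TTL alone would keep serving v1
cache.on_ratesheet_activated("conv30")  # the activation event reaches this replica
fresh = cache.get("conv30")
```

Without the event, `stale` shows what every read would return until the TTL ran out. The TTL stays in place only as a backstop for missed events.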

Failure modes

Real engines fail in predictable ways. Watch for these:

Predicate mismatch on custom fields. A rule that checks borrowers[0].someCustomField when the field is missing should fail closed (drop the program) and log a warning, not silently pass.
Cache invalidation drift. A bug that fails to invalidate all replicas means some priced quotes use stale data. Detect by cross-replica price comparison on a sample loan, scheduled.
Margin order dependency. SUM rules are commutative; OVERRIDE rules are not. Subtle reordering of the margin stack can shift prices. Lock the order in configuration; assert on it in tests.
Ratesheet activation race. Two ratesheet versions are briefly active at once during a transition. Solve with optimistic locking and a single-active invariant in the database.
Trace divergence. The trace says one thing; the returned price reflects something else. Almost always caused by having a separate "explain" code path. Don't.
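The first item, failing closed on a missing custom field, is small enough to show directly. The helper name, the logger name, and the flat `custom` mapping below are assumptions for illustration:

```python
import logging
from typing import Any, Mapping

log = logging.getLogger("pricing.eligibility")


def custom_field_at_least(custom: Mapping[str, Any], field: str, minimum: float) -> bool:
    """Return True only when the field is present and meets the minimum."""
    value = custom.get(field)
    if value is None:
        # Fail closed: a missing input drops the program and leaves a warning.
        log.warning("custom field %r missing; dropping program (fail closed)", field)
        return False
    return value >= minimum


with_reserves = custom_field_at_least({"cash_reserves_months": 12}, "cash_reserves_months", 6)
without_field = custom_field_at_least({}, "cash_reserves_months", 6)
```

The failure to avoid is a predicate that treats a missing value as zero or as "no restriction"; either one quietly changes which programs a borrower sees.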

What to do with this

If you're evaluating a pricing engine, this pillar is your checklist. Ask the vendor to demonstrate each of the items above on their own data, then on yours. The good ones welcome it.

Cluster

Questions readers send us.

How does loan-level pricing differ from bid-tape pricing?

Loan-level pricing is per-loan, executed at borrower interaction. Bid-tape pricing is per-pool, executed in the secondary market when an originator commits a batch of closed loans to an investor. They use overlapping rule sets but operate at different times with different inputs. RateStack supports both; sell-side pricing is the bid-tape-equivalent path.

Are GSE LLPAs hardcoded?

No, and they shouldn't be. LLPAs are configuration — rows in MySQL with effective dates. When Fannie or Freddie updates an LLPA grid, an operator updates the configuration; the engine starts using the new grid on the effective date. No deploy required.

What's the realistic latency target?

Sub-200ms p95 on the warm path is achievable with a three-tier cache and a stateless engine. Cold-cache cold-ratesheet runs land around 800ms p99. Warm-path consistency depends on how aggressively your invalidation is event-driven; TTL-only invalidation will cost you here.

Do I need a different engine for non-QM products?

No. Non-QM is a configuration of programs, predicates, and overlays — the same engine. Some programs need additional fields (asset depletion, bank-statement income), but those are custom borrower-facet fields, not engine code.

How do I audit the engine itself?

Three layers: (1) the trace on every quote answers 'how was this priced'; (2) the audit hash chain answers 'has any pricing record been mutated post-write'; (3) historical replay answers 'reproduce this quote as of date X'. If the engine you're evaluating can't do all three, the audit story is incomplete.
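The hash-chain layer is not described in detail here, so the following is only a generic sketch of the technique. It assumes each record stores the SHA-256 of the previous record's hash plus its own canonical JSON payload. The record layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64


def record_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": record_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited payload or broken link fails."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev or record["hash"] != record_hash(prev, record["payload"]):
            return False
        prev = record["hash"]
    return True


chain: list = []
append(chain, {"quote_id": "Q-1", "final_price": "99.375"})
append(chain, {"quote_id": "Q-2", "final_price": "100.125"})
intact = verify(chain)

chain[0]["payload"]["final_price"] = "99.875"  # simulate a post-write mutation
tampered = verify(chain)
```

Because each hash covers the previous one, changing an old record invalidates it and every record after it unless the whole tail is rewritten.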
