Fusion blending model

Learned fusion that understands dense + sparse trade-offs

Choosing between semantic and lexical results is a classic search challenge. Instead of relying on a fixed heuristic like RRF, our Fusion model learns the decision-making process. By training on the historical outcomes of your dense and sparse retrievers, it knows when to trust semantic breadth or lexical precision for any given query—handing the reranker a superior candidate set.

Training

Outcome-aware fusion learns your domain

Dense and sparse candidates are merged with rich features—scores, rank deltas, overlap, metadata—and trained against historical satisfaction signals.

  • Captures when to trust semantic breadth versus lexical grounding.
  • Learns directly from dense and sparse success + failure cases.

Shortlist quality

Calibrated top-k for the reranker

The fusion model compresses the candidate set without sacrificing recall, keeping latency and downstream token usage lean.

  • Adaptive blending balances breadth and precision per query.
  • Top-k volumes stay compact so reranker passes remain fast.

Operations

Guardrail-ready scores and telemetry

Fusion outputs arrive calibrated, with score bands that plug straight into guardrails, dashboards, and feedback loops.

  • Confidence thresholds surface when to expand or defer.
  • Integrates with observability to spot drift before it ships.
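As a minimal illustration of how calibrated score bands can drive a guardrail, the sketch below routes a candidate set based on its top fused score. The thresholds, band names, and `route` function are assumptions for illustration, not product defaults.

```python
# Hypothetical guardrail consuming calibrated fusion scores.
# Threshold values are illustrative assumptions, not product settings.

def route(fused_scores, accept=0.7, expand_floor=0.4):
    """Decide what to do with a candidate set based on its best fused score."""
    top = max(fused_scores)
    if top >= accept:
        return "rerank"   # shortlist looks strong; pass straight to the reranker
    if top >= expand_floor:
        return "expand"   # borderline confidence; widen retrieval before reranking
    return "defer"        # low confidence; fall back or ask for clarification

print(route([0.82, 0.55, 0.31]))
```

Because the scores are calibrated, the same thresholds stay meaningful across queries, which is what makes them usable in dashboards and feedback loops.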

Why learned fusion compounds every retrieval signal

Learned fusion chooses the right balance of dense and sparse signals for every query, handing the reranker a calibrated shortlist while keeping latency and token spend low.

Fusion-aware ordering

Intelligently blends dense and sparse candidates instead of relying on fixed heuristics.

Fine-tuned on your data

Learns from historical dense and sparse outcomes so every blend reflects your domain.

Evidence-based scoring

Ranks by learned relevance, not raw similarity, giving the reranker a stronger shortlist.

Calibrated confidence

Produces scores with clear thresholds so downstream systems know when to trust the set.

How it works

We replace Reciprocal Rank Fusion (RRF)—a fixed, query-agnostic heuristic—with a learned model that produces a calibrated fused score before reranking. Concretely, for a query q and candidate d, we learn:

s_fuse(q, d) = g([s_dense(q, d), s_sparse(q, d), Δrank, lex_overlap, meta, φ(q)])

Inputs to g include:

  • Dense score `s_dense(q, d)`
  • Sparse score `s_sparse(q, d)`
  • Rank deltas `Δrank` and reciprocal-rank features
  • Lexical overlap and metadata features
  • Query representation `φ(q)` capturing intent and difficulty
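To make the shape of g concrete, here is a minimal sketch that assembles the feature vector above and scores it with a calibrated linear model. The specific features, weights, and the choice of a sigmoid-linear g are illustrative assumptions; in practice g is learned from historical outcomes and may be a more expressive model.

```python
import math

def fusion_features(s_dense, s_sparse, rank_dense, rank_sparse, lex_overlap):
    """Assemble the inputs to g; mirrors the feature list above (metadata and
    the query representation φ(q) are omitted to keep the sketch small)."""
    return [
        s_dense,                      # dense retriever score
        s_sparse,                     # sparse retriever score
        rank_dense - rank_sparse,     # Δrank between the two lists
        1.0 / (1 + rank_dense),       # reciprocal-rank features
        1.0 / (1 + rank_sparse),
        lex_overlap,                  # fraction of query terms matched lexically
    ]

def s_fuse(features, weights, bias=0.0):
    """g as a linear scorer with a sigmoid, so outputs land in [0, 1]."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 / (1 + math.exp(-z))

# Illustrative weights only; a real model fits these to satisfaction signals.
w = [1.2, 0.8, -0.05, 0.6, 0.4, 0.9]
x = fusion_features(s_dense=0.74, s_sparse=0.31,
                    rank_dense=2, rank_sparse=9, lex_overlap=0.5)
print(round(s_fuse(x, w), 3))
```

A candidate that ranks well in either list, with supporting lexical overlap, ends up with a high fused score even when one retriever misses it, which is the behavior a fixed heuristic like RRF cannot tune per domain.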

Training optimizes pairwise or listwise objectives that target ranking metrics such as NDCG and MRR, so the fusion score mirrors real customer satisfaction outcomes while keeping downstream latency and token budgets in check.
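For intuition, a pairwise objective of this kind can be sketched with a RankNet-style logistic loss: given a pair where one document satisfied the user and the other did not, the loss shrinks as the fused score of the satisfying document pulls ahead. This is a simplified sketch, not the production loss.

```python
import math

def pairwise_loss(s_pos, s_neg):
    """RankNet-style logistic loss on a (satisfied, unsatisfied) pair.
    Large when the unsatisfied document outscores the satisfied one;
    approaches zero as the margin s_pos - s_neg grows."""
    return math.log(1 + math.exp(-(s_pos - s_neg)))

# A wide margin (0.9 vs 0.2) is penalized less than a narrow one (0.5 vs 0.4).
print(pairwise_loss(0.9, 0.2) < pairwise_loss(0.5, 0.4))
```

Summing this loss over pairs drawn from historical outcomes pushes g toward orderings that agree with observed satisfaction, which is how the fused ranking comes to reflect your domain rather than a fixed heuristic.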