---
title: What is a Reranker?
description: A reranker is a model that takes a query and a set of candidate results from an initial retrieval step, and re-scores them for relevance. Unlike embedding models that encode query and documents independently, rerankers compare them jointly, producing more accurate relevance scores at the cost of higher latency.
canonical_url: https://superlinked.com/glossary/what-is-a-reranker
last_updated: 2026-06-11
---

# What is a Reranker?

A reranker is a model that takes a query and a set of candidate results from an initial retrieval step, and re-scores them for relevance. Unlike embedding models that encode query and documents independently, rerankers compare them jointly, producing more accurate relevance scores at the cost of higher latency.

---

## Why does reranking matter?

First-stage retrieval (semantic or keyword search) is optimised for speed and scale. It retrieves the top-k results from millions of documents in milliseconds. But speed comes at a cost: embedding-based retrieval encodes the query and each document separately, usually into a single vector each, so it can miss relevance signals that depend on how specific query terms relate to specific passages.

A reranker addresses this. It sees the query and each candidate document together, allowing it to reason about their relationship directly. This two-stage approach (retrieve broadly, then rerank precisely) is a widely used pattern in production search and RAG systems.

---

## How does a reranker work?

Rerankers are typically cross-encoder models. Given a (query, document) pair, the model outputs a single relevance score. In a search pipeline, that score is used in three steps:

1. **Retrieve**: use a fast vector search to get top-100 candidates
2. **Rerank**: pass each (query, candidate) pair through the cross-encoder
3. **Return**: serve the top-k reranked results

```python
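from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

query = "indemnification clause"

# First-stage: fast embedding retrieval.
# `vector_db` and `query_vector` (the embedding of `query`) are placeholders
# for your own vector store client and query embedding.
candidates = vector_db.search(query_vector, top_k=100)
candidate_texts = [c.text for c in candidates]

# Second-stage: rerank for precision.
# Each candidate uses its list position as an id so scores can be mapped back.
score_result = client.score(
    "BAAI/bge-reranker-v2-m3",
    Item(text=query),
    [Item(id=str(i), text=t) for i, t in enumerate(candidate_texts)],
)

# Map scored ids back to the original candidates and keep the first 10.
# Slicing here relies on `score_result["scores"]` being ordered best-first.
id_to_candidate = {str(i): c for i, c in enumerate(candidates)}
results = [id_to_candidate[e["item_id"]] for e in score_result["scores"][:10]]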
```
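
The SDK call above needs a running SIE server and a vector store. To see the retrieve-then-rerank flow in isolation, here is a minimal, dependency-free sketch. The `embed`, `cosine`, and `joint_score` functions are toy stand-ins invented for illustration (a bag-of-words vector and an exact-phrase bonus), not real models. The point is only that the second stage sees the query and document together and can reorder the first-stage candidates.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bi-encoder stand-in: a bag-of-words vector built from one text alone."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def joint_score(query: str, doc: str) -> float:
    """Toy cross-encoder stand-in: looks at both texts at once and adds a
    bonus when the query appears as an exact phrase in the document."""
    base = cosine(embed(query), embed(doc))
    return base + (1.0 if query.lower() in doc.lower() else 0.0)


def retrieve(query: str, docs: list[str], top_k: int) -> list[str]:
    """Stage 1: rank by similarity of independently built vectors."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]


def rerank(query: str, candidates: list[str], top_k: int) -> list[str]:
    """Stage 2: re-score each (query, candidate) pair jointly."""
    return sorted(candidates, key=lambda d: joint_score(query, d), reverse=True)[:top_k]


docs = [
    "the clause on indemnification",
    "this indemnification clause limits liability for both parties",
    "termination clause and notice period",
    "payment terms and invoicing",
]
query = "indemnification clause"

first_stage = retrieve(query, docs, top_k=3)
reranked = rerank(query, first_stage, top_k=2)
print(first_stage[0])
print(reranked[0])
```

In this toy data the first stage ranks the shorter document first because it only compares word counts, while the joint scorer promotes the document that contains the exact phrase.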

---

## When should you use a reranker?

A reranker is worth adding when:

- **Precision matters more than raw speed**: e.g. legal search, medical RAG, customer support
- **Your retrieval recall is good but top results are noisy**: reranking cleans up the final ordering
- **You're building a RAG pipeline**: the documents fed to an LLM must be highly relevant; irrelevant context degrades answer quality
- **Query complexity is high**: long, nuanced queries benefit most from joint query-document scoring

You generally don't need a reranker for simple lookup tasks or when latency budgets are very tight.

---

## Reranker vs embedding model: key differences

| | Embedding model | Reranker |
|---|---|---|
| Architecture | Bi-encoder | Cross-encoder |
| Encodes | Query and docs independently | Query + doc jointly |
| Speed | Fast (pre-compute doc vectors) | Slower (one model pass per query-document pair at query time) |
| Accuracy | Good | Higher |
| Use in pipeline | First-stage retrieval | Second-stage reranking |
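
The speed row can be made concrete by counting model forward passes. The sketch below is a simplified cost model with hypothetical names (`ModelPasses`, `bi_encoder_passes`, `cross_encoder_passes`) and an illustrative corpus size; it ignores the cost of the vector search itself and any batching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPasses:
    offline: int    # forward passes done ahead of time, at indexing
    per_query: int  # forward passes done while the user waits


def bi_encoder_passes(corpus_size: int) -> ModelPasses:
    # Every document is embedded once at index time; each query then
    # needs a single pass to embed the query itself.
    return ModelPasses(offline=corpus_size, per_query=1)


def cross_encoder_passes(num_scored: int) -> ModelPasses:
    # Nothing can be precomputed because the score depends on the
    # (query, document) pair; each scored document costs one pass.
    return ModelPasses(offline=0, per_query=num_scored)


corpus_size = 1_000_000  # illustrative, not a benchmark
top_k = 100

print("bi-encoder:", bi_encoder_passes(corpus_size))
print("cross-encoder over whole corpus:", cross_encoder_passes(corpus_size))
print("cross-encoder over top-k only:", cross_encoder_passes(top_k))
```

Under these assumptions, scoring the whole corpus with a cross-encoder would need a million passes per query, which is why it is applied only to the short candidate list from the first stage.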

---

## Which reranker models does SIE support?

SIE supports leading open-source rerankers including:

- **BGE-Reranker-v2-M3**: multilingual, strong general-purpose performance
- **BGE-Reranker-v2-gemma**: higher accuracy for complex queries
- **Jina Reranker v2**: lightweight, fast

All models run in your own AWS or GCP environment, with no data sent to external APIs. You can hot-swap models without downtime.

---

## Frequently asked questions

**Does using a reranker significantly increase latency?**
Reranking 100 candidates typically adds 50-200 ms depending on model size and hardware. For most search and RAG applications this is acceptable given the accuracy gains. SIE's GPU batching minimises this overhead.
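
As a rough illustration of why batching helps, the sketch below groups (query, document) pairs so the model is called once per batch rather than once per pair. It is a generic pattern, not a description of how SIE batches internally; `score_in_batches`, `fake_score_batch`, and the batch size of 32 are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def score_in_batches(
    query: str,
    docs: Sequence[str],
    score_batch: Callable[[Sequence[Pair]], List[float]],
    batch_size: int = 32,
) -> List[float]:
    """Score every (query, doc) pair, calling the model once per batch."""
    pairs = [(query, doc) for doc in docs]
    scores: List[float] = []
    for start in range(0, len(pairs), batch_size):
        scores.extend(score_batch(pairs[start:start + batch_size]))
    return scores


call_log: List[int] = []  # records the size of each batch sent to the "model"


def fake_score_batch(batch: Sequence[Pair]) -> List[float]:
    # Hypothetical stand-in for a real reranker: returns a dummy score per pair.
    call_log.append(len(batch))
    return [float(len(doc)) for _, doc in batch]


docs = [f"document {i}" for i in range(100)]
scores = score_in_batches("indemnification clause", docs, fake_score_batch)
print(len(call_log), len(scores))  # 4 model calls produce 100 scores
```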

**Can I use a reranker without a vector database?**
Yes. You can rerank any list of documents, including keyword search results from Elasticsearch or BM25.

**Do I need to fine-tune a reranker for my domain?**
Out-of-the-box rerankers perform well for general queries. For specialised domains (legal, medical, code), fine-tuned or LoRA-adapted rerankers can improve relevance significantly. SIE supports LoRA hot-loading for this purpose.

---

## Related resources

- [Browse reranker models on SIE](/models)
- [Regulatory Intelligence RAG example](/docs/examples/regulatory-intelligence-rag)
- [What is semantic search?](/glossary/what-is-semantic-search)
- [What is RAG?](/glossary/what-is-rag)
- [What is a LoRA adapter?](/glossary/what-is-a-lora-adapter)
