What is a Recommendation System?
A recommendation system is a machine learning system that predicts which items a user is most likely to find relevant or useful, and surfaces them proactively. It learns from user behaviour (clicks, purchases, ratings) and item characteristics to personalise suggestions. The three main approaches are collaborative filtering, content-based filtering, and hybrid systems that combine both.
Why do recommendation systems matter?
Recommendation systems drive a significant share of engagement and revenue in consumer products. Netflix has reported that roughly 80% of hours streamed come from recommendations, and a widely cited McKinsey estimate attributes about 35% of Amazon purchases to them. They are also increasingly used in enterprise contexts: surfacing relevant documents, prioritising support tickets, and recommending knowledge base articles.
The core technical challenge, finding semantically similar items and personalising to user history, overlaps heavily with semantic search and RAG infrastructure.
What are the main types of recommendation systems?
Collaborative filtering
Recommends items based on what similar users liked, without using item content:
- User-based CF: find users with similar history to the target user, recommend what they liked
- Item-based CF: find items with similar interaction patterns, recommend similar ones
- Matrix factorisation: decompose the user-item interaction matrix into latent factor vectors (SVD, ALS, BPR)
Strengths: captures taste patterns beyond content. Weaknesses: cold start problem (new users/items), sparse interaction data.
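As a rough illustration of item-based CF, the sketch below scores unseen items by their cosine similarity to the items a user has already interacted with. The toy matrix and the function names (`item_similarity`, `recommend_item_based`) are assumptions made for this example, not part of any particular library.

```python
import numpy as np

def item_similarity(interactions: np.ndarray) -> np.ndarray:
    """Cosine similarity between the item columns of a user x item matrix."""
    norms = np.linalg.norm(interactions, axis=0, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for items nobody touched
    normalised = interactions / norms
    return normalised.T @ normalised

def recommend_item_based(interactions: np.ndarray, user: int, top_k: int = 2) -> list[int]:
    sim = item_similarity(interactions)
    np.fill_diagonal(sim, 0.0)  # an item should not vote for itself
    scores = interactions[user] @ sim  # sum of similarities to the user's items
    scores[interactions[user] > 0] = -np.inf  # hide items already seen
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked[:top_k] if np.isfinite(scores[i])]

# Toy data (hypothetical): rows are users, columns are items, 1 = interacted
toy_interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
], dtype=float)

print(recommend_item_based(toy_interactions, user=3))
```

Note that no item content is used anywhere: similarity comes purely from which users interacted with which items, which is also why an item with no interactions can never be recommended this way.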
Content-based filtering
Recommends items similar to ones the user has previously engaged with, based on item features:
- Compare item embeddings (text descriptions, images, metadata)
- Return items whose vectors are closest to those of the items in the user’s interaction history
Strengths: works for new items (no interaction data needed). Weaknesses: limited to “more of the same”; doesn’t discover new tastes.
Hybrid systems
Most production recommendation systems combine both:
```
final_score = α × collaborative_score + (1-α) × content_score
```

Neural approaches (Two-Tower models, DLRM) learn to combine signals end-to-end.
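The weighted blend can be sketched in a few lines. Collaborative and content scores usually live on different scales, so this example min-max normalises each before mixing; that normalisation step, the function names, and the toy values are assumptions for illustration rather than something the formula prescribes.

```python
import numpy as np

def min_max(scores: np.ndarray) -> np.ndarray:
    """Rescale scores to [0, 1]; all-equal input maps to zeros."""
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores, dtype=float)
    return (scores - scores.min()) / span

def hybrid_scores(collaborative: np.ndarray, content: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """final_score = alpha * collaborative + (1 - alpha) * content, after rescaling."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * min_max(collaborative) + (1 - alpha) * min_max(content)

# Hypothetical scores for three candidate items
collab = np.array([12.0, 3.0, 0.0])   # e.g. co-occurrence counts
content = np.array([0.2, 0.9, 0.6])   # e.g. cosine similarities
blended = hybrid_scores(collab, content, alpha=0.5)
print(blended)
```

In practice α is typically tuned on held-out interaction data, and it can be lowered for users or items with little interaction history.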
How do embeddings power modern recommendation?
The modern approach represents both users and items as dense vectors in a shared embedding space. Recommendation becomes a nearest-neighbour search:
- Encode item content (text, images, metadata) into item embeddings
- Build user embeddings from their interaction history (average of interacted item embeddings, or a learned aggregation)
- Retrieve nearest item vectors to the user’s vector via ANN search
```python
import numpy as np
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Encode item descriptions
# (item_descriptions: a list of strings, assumed to be defined elsewhere)
item_embeddings = np.stack(
    [r["dense"] for r in client.encode("BAAI/bge-m3", [Item(text=d) for d in item_descriptions])]
)

# User embedding = average of their interaction history
# (user_interactions: indices into item_embeddings, assumed to be defined elsewhere)
user_embedding = item_embeddings[user_interactions].mean(axis=0)

# ANN search for nearest items
# (vector_db: placeholder for your vector database client)
recommendations = vector_db.search(user_embedding, top_k=20)
```

SIE provides the encoding step; a vector database (Qdrant, Weaviate) handles the retrieval.
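To make the retrieval step concrete without a vector database, here is a minimal sketch that uses exact (brute-force) cosine search in NumPy as a stand-in for ANN. The function name `top_k_cosine` and the two-dimensional toy embeddings are assumptions for illustration; a real index would approximate this result over far more vectors.

```python
import numpy as np

def top_k_cosine(query: np.ndarray, items: np.ndarray, top_k: int = 3, exclude=()) -> list[int]:
    """Return indices of the top_k items most similar to the query by cosine similarity."""
    items_norm = items / np.linalg.norm(items, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = items_norm @ query_norm
    scores[np.asarray(list(exclude), dtype=int)] = -np.inf  # skip already-seen items
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked[:top_k] if np.isfinite(scores[i])]

# Toy item embeddings (hypothetical): items 0 and 1 are alike, as are items 2 and 3
toy_items = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])

history = [0]                                  # the user interacted with item 0
user_vector = toy_items[history].mean(axis=0)  # average of interacted item embeddings
print(top_k_cosine(user_vector, toy_items, top_k=2, exclude=history))
```

Excluding the user's own history matters here: without it, the items used to build the user vector would almost always come back as the top results.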
The cold start problem
Cold start occurs when there’s insufficient interaction data to make good recommendations:
| Scenario | Type | Solution |
|---|---|---|
| New user, no history | User cold start | Content-based from onboarding, popular items |
| New item, no interactions | Item cold start | Content embeddings, metadata-based |
| New system | System cold start | Content-based until data accumulates |
Content embeddings (from SIE) are the standard solution to item cold start. You can recommend new items based on their semantic similarity to items the user has engaged with.
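One way to combine the fallbacks in the table is sketched below: users with no history get popular items, while users with history get embedding-based recommendations that can include brand-new items. The function name, the popularity counts, and the toy embeddings are all hypothetical.

```python
import numpy as np

def recommend_with_fallback(history: list[int], item_embeddings: np.ndarray,
                            popularity: np.ndarray, top_k: int = 2) -> list[int]:
    if len(history) == 0:
        # User cold start: no history, so fall back to the most popular items
        return [int(i) for i in np.argsort(-popularity)[:top_k]]
    # Otherwise rank by cosine similarity to the average of the user's items
    profile = item_embeddings[history].mean(axis=0)
    normed = item_embeddings / np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    scores = normed @ (profile / np.linalg.norm(profile))
    scores[history] = -np.inf  # do not recommend what the user already has
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked[:top_k] if np.isfinite(scores[i])]

# Toy data (hypothetical): item 3 is new, with zero interactions so far
toy_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.95, 0.05],
])
toy_popularity = np.array([50, 10, 30, 0])

print(recommend_with_fallback([], toy_embeddings, toy_popularity))   # new user
print(recommend_with_fallback([0], toy_embeddings, toy_popularity))  # user who liked item 0
```

The new item never appears in the popularity fallback, but it ranks first for the user who liked item 0 because only its content embedding is needed.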
Evaluation metrics for recommendation systems
| Metric | What it measures |
|---|---|
| Precision@K | Of top-K recommendations, fraction that are relevant |
| Recall@K | Of all relevant items, fraction retrieved in top-K |
| NDCG@K | Ranking quality: relevant items ranked higher score better |
| Hit Rate@K | Whether at least one relevant item appears in the top-K at all |
| MRR | Mean Reciprocal Rank: average over users of 1/rank of the first relevant result |
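The metrics in the table can be computed for a single user as sketched below, assuming binary relevance (an item is either relevant or not). The function names and toy data are illustrative; in an evaluation you would average each value over all test users, which for reciprocal rank gives MRR.

```python
import math

def precision_at_k(ranked: list, relevant: set, k: int) -> float:
    return sum(1 for item in ranked[:k] if item in relevant) / k

def recall_at_k(ranked: list, relevant: set, k: int) -> float:
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)

def hit_rate_at_k(ranked: list, relevant: set, k: int) -> float:
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def reciprocal_rank(ranked: list, relevant: set) -> float:
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def ndcg_at_k(ranked: list, relevant: set, k: int) -> float:
    # DCG: each relevant item contributes 1 / log2(position + 1)
    dcg = sum(1.0 / math.log2(pos + 1)
              for pos, item in enumerate(ranked[:k], start=1) if item in relevant)
    # Ideal DCG: all relevant items packed at the top of the list
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example (hypothetical): one user's ranked recommendations and held-out relevant items
ranked_items = ["a", "b", "c", "d"]
relevant_items = {"b", "d", "e"}

print(precision_at_k(ranked_items, relevant_items, k=3))
print(recall_at_k(ranked_items, relevant_items, k=3))
print(reciprocal_rank(ranked_items, relevant_items))
print(ndcg_at_k(ranked_items, relevant_items, k=3))
```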
Recommendation systems vs semantic search
| | Semantic search | Recommendation |
|---|---|---|
| Input | Explicit query | Implicit behaviour / history |
| Personalisation | Usually none | Core goal |
| Cold start | Less of a problem | Major challenge |
| Infrastructure | Embedding model + vector DB | Same + user model |
The infrastructure overlaps significantly. Systems built on SIE for semantic search can be extended to support recommendation by adding a user history aggregation layer.
Frequently asked questions
What is a Two-Tower model? A Two-Tower (or dual-encoder) model uses separate encoders for users and items, training them end-to-end with contrastive loss so that relevant user-item pairs are close in embedding space. This is essentially the same architecture as bi-encoder text embedding models.
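The training objective can be illustrated with an in-batch softmax (InfoNCE-style) loss, which is one common choice of contrastive loss rather than the only one. In this sketch the two towers themselves are omitted: the input arrays stand in for their outputs, row i of each array is assumed to be a matching user-item pair, and the temperature value is arbitrary.

```python
import numpy as np

def in_batch_softmax_loss(user_vecs: np.ndarray, item_vecs: np.ndarray,
                          temperature: float = 0.1) -> float:
    """Contrastive loss where the other items in the batch act as negatives."""
    u = user_vecs / np.linalg.norm(user_vecs, axis=1, keepdims=True)
    v = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    logits = (u @ v.T) / temperature                   # similarity of every user to every item
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))         # diagonal = the true pairs

# Stand-ins for tower outputs (hypothetical)
users = np.eye(3)
matching_items = np.eye(3)                       # each user aligned with its own item
mismatched_items = np.roll(np.eye(3), 1, axis=0)  # each user paired with the wrong item

aligned_loss = in_batch_softmax_loss(users, matching_items)
shuffled_loss = in_batch_softmax_loss(users, mismatched_items)
print(aligned_loss, shuffled_loss)
```

Minimising this loss pulls each user vector toward its paired item and away from the other items in the batch, which is what makes nearest-neighbour retrieval meaningful afterwards.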
How is recommendation different from personalised search? Personalised search takes an explicit query and re-ranks results based on user history. Recommendation has no explicit query; it surfaces items proactively based on predicted interest.
What role does a reranker play in recommendation? After fast ANN retrieval of candidates, a reranker (cross-encoder or gradient boosted model) can score each candidate more precisely using richer features, the same pattern as in search pipelines.
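A minimal sketch of the second stage, assuming candidates have already been retrieved: each candidate is rescored with extra features and the list is re-sorted. The feature names, weights, and the hand-written linear scorer are hypothetical stand-ins for a trained cross-encoder or gradient boosted model.

```python
from typing import Callable, Sequence

def rerank(candidates: Sequence[str], score: Callable[[str], float], top_n: int) -> list[str]:
    """Re-sort retrieved candidates by a more expensive scoring function."""
    return sorted(candidates, key=score, reverse=True)[:top_n]

# Hypothetical per-item features available at rerank time
toy_features = {
    "item_a": {"similarity": 0.91, "in_stock": 0.0},
    "item_b": {"similarity": 0.88, "in_stock": 1.0},
    "item_c": {"similarity": 0.80, "in_stock": 1.0},
}

def toy_score(item_id: str) -> float:
    # Stand-in for a trained model: a fixed linear combination of features
    f = toy_features[item_id]
    return 0.6 * f["similarity"] + 0.4 * f["in_stock"]

retrieved = ["item_a", "item_b", "item_c"]  # order returned by similarity search
print(rerank(retrieved, toy_score, top_n=2))
```

The split keeps the expensive scorer off the full catalogue: retrieval narrows millions of items to a few hundred, and the reranker only sees those.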