What is Classification in Machine Learning?

Classification is a supervised learning task where a model learns to assign inputs to one of a fixed set of categories. Binary classification predicts one of two outcomes (e.g. spam or not spam); multi-class classification predicts one of three or more categories (e.g. document type). The model learns from labelled examples and generalises to unseen inputs.


Why does classification matter?

Classification is one of the most common tasks in applied machine learning. It underlies document routing, intent detection, content moderation, medical diagnosis, fraud detection, and many RAG pre-processing pipelines where documents must be labelled or filtered before indexing.


How does binary classification work?

A binary classifier produces a raw score x (the logit), and a sigmoid activation on the output maps it to a probability between 0 and 1:

σ(x) = 1 / (1 + e^(-x))

A threshold (typically 0.5) converts the score to a class label. The model is trained by minimising binary cross-entropy loss:

Loss = -[y·log(p) + (1-y)·log(1-p)]

Where y is the true label (0 or 1) and p is the predicted probability.
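
As a rough illustration of the two formulas above, the sketch below computes the sigmoid, applies a threshold, and evaluates the binary cross-entropy loss for one example. The function names, the `eps` clamp, and the sample logit are assumptions made for this example and do not come from any particular library.

```python
import math

def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def predict_label(x, threshold=0.5):
    """Convert a logit to a class label (0 or 1) using a threshold."""
    return 1 if sigmoid(x) >= threshold else 0

def binary_cross_entropy(y, p, eps=1e-12):
    """Loss = -[y*log(p) + (1-y)*log(1-p)] for a single example."""
    # Clamp p away from 0 and 1 so log() is always defined (an assumption
    # made for numerical safety, not part of the formula itself).
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

logit = 2.0  # hypothetical model output
p = sigmoid(logit)
print(round(p, 4))                           # 0.8808
print(predict_label(logit))                  # 1
print(round(binary_cross_entropy(1, p), 4))  # 0.1269 (confident and correct)
print(round(binary_cross_entropy(0, p), 4))  # 2.1269 (confident and wrong)
```

The loss is small when the predicted probability agrees with the true label and grows quickly when the model is confidently wrong.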


How does multi-class classification work?

Multi-class classification outputs a probability distribution over all classes using a softmax activation:

softmax(xᵢ) = e^(xᵢ) / Σⱼ e^(xⱼ)

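The outputs are non-negative and sum to 1, so they can be read as class probabilities. The model is trained with categorical cross-entropy loss, which for a single true class reduces to -log(p) of that class.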
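
A minimal sketch of softmax and the cross-entropy loss for one example, assuming the true class is given as an index. The function names and the sample logits are illustrative assumptions. Subtracting the maximum logit before exponentiating is a common stability trick and does not change the result.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the max to avoid overflow in exp()
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(true_index, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -math.log(max(probs[true_index], eps))

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three classes
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax

print([round(p, 3) for p in probs])                   # [0.659, 0.242, 0.099]
print(predicted)                                      # 0
print(round(categorical_cross_entropy(0, probs), 3))  # 0.417
```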


What metrics should you use to evaluate a classifier?

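Raw accuracy is misleading on imbalanced datasets: a model that always predicts the majority class can score high accuracy while finding none of the minority class. Use these metrics together: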

| Metric | What it measures | When to prioritise |
|---|---|---|
| Accuracy | Overall correctness | Balanced classes |
| Precision | Of predicted positives, how many are correct | When false positives are costly |
| Recall | Of actual positives, how many were found | When false negatives are costly |
| F1 Score | Harmonic mean of precision and recall | Imbalanced datasets |
| AUC-ROC | Discrimination ability across thresholds | Comparing classifiers |

For document classification in search pipelines, F1 and precision are usually the most important metrics.
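
To make the accuracy caveat concrete, here is a small sketch that computes accuracy, precision, recall and F1 for the positive class from lists of 0/1 labels. The helper names and the toy labels are assumptions for illustration. Returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy imbalanced data: 2 positives out of 10 examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
some_positives = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(accuracy(y_true, always_negative))             # 0.8
print(precision_recall_f1(y_true, always_negative))  # (0.0, 0.0, 0.0)
print(accuracy(y_true, some_positives))              # 0.8
print(precision_recall_f1(y_true, some_positives))   # (0.5, 0.5, 0.5)
```

Both sets of predictions reach the same accuracy, but only precision, recall and F1 show that the first one never finds a positive.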


How does classification relate to semantic search and RAG?

Classification often appears as a pre- or post-processing step in inference pipelines:

  • Query intent classification: route queries to different retrieval strategies based on detected intent
  • Document type classification: tag documents before indexing so they can be filtered at retrieval time
  • Answer relevance classification: judge whether a retrieved chunk is relevant before passing it to an LLM
  • Reranker output: cross-encoder rerankers can be framed as binary classifiers predicting (query, document) relevance

Embedding models hosted on SIE can be fine-tuned for classification tasks using a classification head on top of the encoder.


Binary vs multi-class vs multi-label classification

| Type | Output | Example |
|---|---|---|
| Binary | One of two classes | Spam / not spam |
| Multi-class | One of N classes | Document category |
| Multi-label | Multiple classes simultaneously | Document topics |

Multi-label classification is common for document tagging in knowledge bases and RAG pipelines.
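
One common way to implement multi-label classification, sketched below, is to apply an independent sigmoid to each label's score and keep every label that clears a threshold, instead of using a softmax across labels. The label names, scores and the 0.5 threshold are hypothetical values chosen for this example.

```python
import math

def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def predict_labels(logits_by_label, threshold=0.5):
    """Return every label whose independent probability meets the threshold."""
    return [label for label, x in logits_by_label.items()
            if sigmoid(x) >= threshold]

# Hypothetical per-topic scores for one document.
doc_logits = {"finance": 1.2, "legal": 0.3, "hr": -2.0}

print(predict_labels(doc_logits))                 # ['finance', 'legal']
print(predict_labels(doc_logits, threshold=0.7))  # ['finance']
```

Because each label is scored independently, a document can receive several tags or none at all.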


Frequently asked questions

What’s the difference between classification and regression? Classification predicts a discrete category. Regression predicts a continuous value. The model architectures are similar, but the output layer and loss function differ.

Can I use an embedding model for classification? Yes. A common pattern is to encode text with an embedding model (e.g. BGE-M3 via SIE) and then train a lightweight classification head on top of the frozen embeddings. This is far more efficient than fine-tuning the full model.
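
The frozen-embedding pattern can be sketched as a logistic regression head trained with gradient descent. The three-dimensional vectors below are made-up stand-ins for real embeddings, since the actual encoding call is outside the scope of this sketch, and `train_head` and `predict` are hypothetical helper names. In practice an existing library implementation of logistic regression would usually be used instead.

```python
import math

def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def train_head(embeddings, labels, lr=0.5, epochs=200):
    """Fit weights and bias of a linear head; the embeddings stay fixed."""
    dim = len(embeddings[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(embeddings)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for vec, y in zip(embeddings, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, vec)) + bias)
            err = p - y  # gradient of binary cross-entropy w.r.t. the logit
            for i, v in enumerate(vec):
                grad_w[i] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, vec):
    """Classify one embedding with the trained head."""
    logit = sum(w * v for w, v in zip(weights, vec)) + bias
    return 1 if sigmoid(logit) >= 0.5 else 0

# Made-up "embeddings" standing in for encoder output (assumption).
train_vecs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
              [0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
train_labels = [1, 1, 0, 0]

weights, bias = train_head(train_vecs, train_labels)
print(predict(weights, bias, [0.85, 0.15, 0.05]))  # 1
print(predict(weights, bias, [0.05, 0.85, 0.25]))  # 0
```

Only the head's few parameters are updated, which is why this approach is cheap compared with updating the encoder itself.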

What is class imbalance and how do you handle it? Class imbalance occurs when one class has far more examples than others. Techniques include oversampling the minority class (randomly or with synthetic examples, e.g. SMOTE), undersampling the majority class, and using class-weighted loss functions.
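
As one illustration of a class-weighted loss, the sketch below derives per-class weights inversely proportional to class frequency and scales the binary cross-entropy by the weight of the true class. The weighting formula n / (k × count) is a widely used heuristic and is assumed here, as are the function names and the toy labels.

```python
import math
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def weighted_bce(y, p, weights, eps=1e-12):
    """Binary cross-entropy scaled by the weight of the true class."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() defined
    loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return weights[y] * loss

# Toy imbalanced labels: nine negatives, one positive.
labels = [0] * 9 + [1]
weights = balanced_class_weights(labels)

print(round(weights[0], 3), weights[1])          # 0.556 5.0
print(round(weighted_bce(1, 0.5, weights), 3))   # 3.466
print(round(weighted_bce(0, 0.5, weights), 3))   # 0.385
```

With these weights, an error on the rare positive class costs about nine times as much as the same error on a negative example.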

