mixedbread-ai/mxbai-edge-colbert-v0-32m
The crispy, lightweight ColBERT family from Mixedbread.
Benchmarks
CQADupstackPhysicsRetrieval
Duplicate question retrieval from StackExchange Physics
Corpus: 38,314 Queries: 1,039
Performance L4 b1 c16
Corpus 28.3K tok/s
Corpus p50 60.5ms
Query 3.0K tok/s
Query p50 55.7ms
CosQA
Code search with natural language queries
Corpus: 6,267 Queries: 500
Performance L4 b1 c16
Corpus 15.0K tok/s
Corpus p50 51.8ms
Query 1.9K tok/s
Query p50 48.5ms
FiQA2018
Financial opinion mining and question answering
Corpus: 57,599 Queries: 648
Performance L4 b1 c16
Corpus 38.9K tok/s
Corpus p50 57.5ms
Query 3.7K tok/s
Query p50 48.0ms
LegalBenchConsumerContractsQA
Question answering on consumer contracts
Corpus: 153 Queries: 396
Performance L4 b1 c16
Corpus 87.3K tok/s
Corpus p50 76.5ms
Query 5.4K tok/s
Query p50 48.3ms
NFCorpus
Biomedical literature search from NutritionFacts.org
Corpus: 3,593 Queries: 323
Quality
nDCG@10 0.3376
MAP@10 0.1285
MRR@10 0.5432
Performance L4 b1 c16
Corpus 67.7K tok/s
Corpus p50 58.6ms
Query 1.4K tok/s
Query p50 53.0ms
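For readers unfamiliar with the Quality rows above, the following is a minimal sketch of how the three ranking metrics at cutoff 10 are commonly computed for a single query. It is not the evaluation code used for these numbers. The function names, the `ranked_ids` list and the `relevance` dictionary are hypothetical, and the MAP normalization (dividing by the total number of relevant documents) is one common convention assumed here. Reported scores are normally the mean of these per-query values over all queries.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k gains (rank 1 has discount log2(2))."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevance, k=10):
    """nDCG@k: DCG of the returned ranking divided by the DCG of an ideal ranking."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked_ids, relevance, k=10):
    """Reciprocal rank of the first relevant document within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if relevance.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked_ids, relevance, k=10):
    """Average precision within the top k.

    Assumption: normalized by the total number of relevant documents;
    other tools may normalize differently.
    """
    n_relevant = sum(1 for g in relevance.values() if g > 0)
    if n_relevant == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if relevance.get(doc_id, 0) > 0:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_relevant


# Hypothetical single-query example: two relevant documents, one retrieved at rank 2.
ranked = ["d3", "d1", "d7"]
judged = {"d1": 1, "d2": 1}
print(ndcg_at_k(ranked, judged), ap_at_k(ranked, judged), mrr_at_k(ranked, judged))
```

In this toy example the first relevant document appears at rank 2, so the reciprocal rank is 0.5 and the average precision is 0.25.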
SCIDOCS
Citation prediction, document classification, and recommendation for scientific papers
Corpus: 25,656 Queries: 1,000
Performance L4 b1 c16
Corpus 36.4K tok/s
Corpus p50 70.2ms
Query 2.7K tok/s
Query p50 61.8ms
SciFact
Scientific claim verification using research literature
Corpus: 5,183 Queries: 300
Performance L4 b1 c16
Corpus 58.8K tok/s
Corpus p50 61.2ms
Query 5.4K tok/s
Query p50 49.0ms
StackOverflowQA
Programming question answering from Stack Overflow
Corpus: 19,931 Queries: 1,994
Performance L4 b1 c16
Corpus 52.8K tok/s
Corpus p50 58.9ms
Query 68.3K tok/s
Query p50 66.6ms
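The Performance rows report a throughput in tokens per second and a p50 latency. As a rough sketch of how such figures can be derived from raw timings: the code below assumes that p50 is the median per-request latency and that throughput is total tokens divided by wall-clock time for the whole run. The page does not state how its numbers were measured. The function name `summarize_run` and its parameters are hypothetical.

```python
import statistics


def summarize_run(latencies_ms, token_counts, wall_seconds):
    """Return (p50 latency in ms, throughput in tokens/s) for one benchmark run.

    latencies_ms : per-request latencies in milliseconds
    token_counts : tokens processed by each request
    wall_seconds : wall-clock duration of the whole run (assumed > 0)
    """
    p50_ms = statistics.median(latencies_ms)
    # With concurrent requests, throughput is taken over wall-clock time,
    # not over the sum of individual latencies.
    tokens_per_second = sum(token_counts) / wall_seconds
    return p50_ms, tokens_per_second


# Hypothetical measurements, for illustration only.
p50, tps = summarize_run(
    latencies_ms=[50.0, 60.0, 70.0, 55.0],
    token_counts=[100, 200, 300, 400],
    wall_seconds=0.5,
)
print(f"p50 {p50:.1f}ms, {tps / 1000:.1f}K tok/s")
```

Because requests can overlap under concurrency, a run can show high throughput even when each individual request takes tens of milliseconds, which is why both figures are reported side by side.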