answerdotai/answerai-colbert-small-v1 (Score)

answerai-colbert-small-v1 is a new proof-of-concept model by Answer.AI. It shows how strong multi-vector models can be when trained with the JaColBERTv2.5 recipe plus a few extra tweaks, even with just 33 million parameters.

Architecture: BERT
Parameters: 33M
Tasks: Encode
Outputs: Multi-Vec
Dimensions: 96 (Multi-Vec)
Max Sequence Length: 512 tokens
License: apache-2.0
Languages: en
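
To make the "Multi-Vec" output concrete: a multi-vector model emits one 96-dimensional vector per token instead of a single vector per text. A minimal sketch of how such outputs are commonly compared follows, assuming ColBERT-style MaxSim late interaction (this page does not spell out the scoring function). The helper names are hypothetical, and random unit vectors stand in for real model output.

```python
import math
import random

DIM = 96  # per-token embedding size listed on this page


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Assumed late-interaction score: for each query token, take its
    best-matching document token, then sum those maxima over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def random_token_vectors(n_tokens, rng, dim=DIM):
    """Stand-in for model output: random unit vectors, not real embeddings."""
    return [
        normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(n_tokens)
    ]


rng = random.Random(0)
query = random_token_vectors(4, rng)          # 4 query tokens
doc_a = random_token_vectors(12, rng)         # unrelated document
doc_b = query + random_token_vectors(8, rng)  # contains every query vector

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(f"unrelated doc: {score_a:.3f}, matching doc: {score_b:.3f}")
```

Because every query token finds an exact match in `doc_b`, its score approaches the number of query tokens, while the unrelated document scores much lower. Note that the score grows with query length, so scores are only comparable across documents for the same query.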

Benchmarks

CQADupstackPhysicsRetrieval

scientific retrieval en

Duplicate question retrieval from StackExchange Physics

Corpus: 38,314 Queries: 1,039
Performance L4-SPOT b1 c16
Corpus 3.9K tok/s
Corpus p50 203.2ms
Query 186 tok/s
Query p50 300.7ms
Performance L4 b1 c16
Corpus 37.3K tok/s
Corpus p50 54.7ms
Query 3.6K tok/s
Query p50 47.1ms

CosQA

technology retrieval en

Code search with natural language queries

Corpus: 6,267 Queries: 500
Performance L4-SPOT b1 c16
Corpus 1.1K tok/s
Corpus p50 345.4ms
Query 102 tok/s
Query p50 466.2ms
Performance L4 b1 c16
Corpus 15.7K tok/s
Corpus p50 53.9ms
Query 2.0K tok/s
Query p50 47.4ms

FiQA2018

finance retrieval en

Financial opinion mining and question answering

Corpus: 57,599 Queries: 648
Performance L4-SPOT b1 c16
Corpus 3.4K tok/s
Corpus p50 384.8ms
Query 174 tok/s
Query p50 547.3ms
Performance L4 b1 c16
Corpus 43.1K tok/s
Corpus p50 59.1ms
Query 3.7K tok/s
Query p50 50.0ms

LegalBenchConsumerContractsQA

legal retrieval en

Question answering on consumer contracts

Corpus: 153 Queries: 396
Performance L4-SPOT b1 c16
Corpus 11.2K tok/s
Corpus p50 286.1ms
Query 254 tok/s
Query p50 500.2ms
Performance L4 b1 c16
Corpus 83.1K tok/s
Corpus p50 83.9ms
Query 4.8K tok/s
Query p50 52.2ms

NFCorpus

medical retrieval en

Biomedical literature search from NutritionFacts.org

Corpus: 3,593 Queries: 323
Performance L4-SPOT b1 c16
Corpus 5.7K tok/s
Corpus p50 300.1ms
Query 210 tok/s
Query p50 178.7ms
Performance L4 b1 c16
Corpus 60.7K tok/s
Corpus p50 69.4ms
Query 1.4K tok/s
Query p50 52.3ms

SCIDOCS

scientific retrieval en

Citation prediction, document classification, and recommendation for scientific papers

Corpus: 25,656 Queries: 1,000
Performance L4-SPOT b1 c16
Corpus 2.9K tok/s
Corpus p50 501.5ms
Query 222 tok/s
Query p50 353.0ms
Performance L4 b1 c16
Corpus 48.4K tok/s
Corpus p50 57.9ms
Query 3.8K tok/s
Query p50 46.7ms

SciFact

scientific retrieval en

Scientific claim verification using research literature

Corpus: 5,183 Queries: 300
Performance L4-SPOT b1 c16
Corpus 4.9K tok/s
Corpus p50 443.1ms
Query 332 tok/s
Query p50 430.3ms
Performance L4 b1 c16
Corpus 61.5K tok/s
Corpus p50 63.3ms
Query 5.2K tok/s
Query p50 50.5ms

StackOverflowQA

technology retrieval en

Programming question answering from Stack Overflow

Corpus: 19,931 Queries: 1,994
Performance L4-SPOT b1 c16
Corpus 4.4K tok/s
Corpus p50 379.7ms
Query 5.2K tok/s
Query p50 432.3ms
Performance L4 b1 c16
Corpus 55.3K tok/s
Corpus p50 60.2ms
Query 73.2K tok/s
Query p50 64.7ms
