
# vidore/colpali-v1.3-hf

> [!IMPORTANT]
> This version of ColPali should be loaded with the 🤗 `transformers` release, not with `colpali-engine`.
> It was converted from the `vidore/colpali-v1.3-merged` checkpoint using the `convert_colpali_weights_to_hf.py` script.

| Property | Value |
|---|---|
| Architecture | PaliGemma |
| Parameters | 3.0B |
| Tasks | Encode |
| Outputs | Multi-Vec |
| Dimensions | Multi-Vec: 128 |
| Max Sequence Length | 2,048 tokens |
| License | gemma |
| Languages | en |
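The Multi-Vec output means each document page and each query is encoded as a set of 128-dimensional vectors rather than a single embedding, and pairs are scored with late interaction (MaxSim). A minimal NumPy sketch of that scoring, with illustrative shapes:

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score between one query and one document.

    query_emb: (n_query_tokens, dim) multi-vector query embedding
    doc_emb:   (n_doc_patches, dim) multi-vector document embedding
    """
    sim = query_emb @ doc_emb.T          # pairwise dot products, shape (n_q, n_d)
    return float(sim.max(axis=1).sum())  # best doc vector per query token, summed

# Illustrative usage: ColPali produces dim=128 vectors per token/patch
rng = np.random.default_rng(0)
query = rng.standard_normal((16, 128))
corpus = [rng.standard_normal((1024, 128)) for _ in range(3)]
ranking = sorted(range(len(corpus)),
                 key=lambda i: maxsim_score(query, corpus[i]),
                 reverse=True)
```

The per-query-token max keeps fine-grained matches (a single word hitting a single patch) from being averaged away, which is what distinguishes multi-vector retrieval from single-vector cosine similarity.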

## Benchmarks

### Vidore3ComputerScienceRetrieval

*technology · retrieval · en* — Visual document retrieval on computer science papers and slides.

| Metric (L4, b1, c16) | Value |
|---|---|
| Corpus throughput | 23.2 mpix/s |
| Corpus p50 latency | 579.6 ms |
| Query throughput | 484 tok/s |
| Query p50 latency | 266.9 ms |

### Vidore3FinanceEnRetrieval

*finance · retrieval · en* — Visual document retrieval on financial reports.

| Metric (L4, b1, c16) | Value |
|---|---|
| Corpus throughput | 22.8 mpix/s |
| Corpus p50 latency | 583.7 ms |
| Query throughput | 469 tok/s |
| Query p50 latency | 252.6 ms |

### Vidore3HrRetrieval

*general · retrieval · en* — Visual document retrieval on HR-related documents.

| Metric (L4, b1, c16) | Value |
|---|---|
| Corpus throughput | 23.5 mpix/s |
| Corpus p50 latency | 585.1 ms |
| Query throughput | 562 tok/s |
| Query p50 latency | 261.5 ms |

### Vidore3PharmaceuticalsRetrieval

*medical · retrieval · en* — Visual document retrieval on pharmaceutical documents.

| Metric (L4, b1, c16) | Value |
|---|---|
| Corpus throughput | 16.3 mpix/s |
| Corpus p50 latency | 575.6 ms |
| Query throughput | 538 tok/s |
| Query p50 latency | 250.7 ms |
