CrossEncoder based on BAAI/bge-reranker-v2-m3

This is a Cross Encoder model finetuned from BAAI/bge-reranker-v2-m3 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-reranker-v2-m3
  • Maximum Sequence Length: 1024 tokens
  • Number of Output Labels: 1 label

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ["Who is the mother of Cyril Holland's father?", 'Cecilia of Normandy. Cecilia of Normandy (or Cecily; c. 1056 – 30 July 1126) is thought to be the eldest daughter of William the Conqueror and Matilda of Flanders. Her brothers were kings William II and Henry I of England. She was very close to her other brother, Robert Curthose, and was educated by the abbess Matilda.'],
    ['Did the the agency David Rossi works for keep files on Elvis Presley?', '13 Hours: The Secret Soldiers of Benghazi. In 2012, Benghazi, Libya is named one of the most dangerous places in the world, and countries have pulled their diplomatic offices out of the country in fear of an attack by militants. The United States, however, still has a diplomatic compound (not an official consulate) open in the city. Less than a mile away is a CIA outpost called "The Annex", which is protected by a team of private military contractors from Global Response Staff (GRS). New to the detail is Jack Silva, who arrives in Benghazi and is picked up by Tyrone "Rone" Woods, commander of the GRS team and a personal friend of Silva. Arriving at the Annex, Silva is introduced to the rest of the GRS team and the CIA Chief of Station, who constantly gives the team strict reminders to never engage the citizens.'],
    ['How many murders were there in 2015 in the city that is the capital of the state where Wellesley College is in Mona Lisa Smile?', 'Boston. In addition to city government, numerous commissions and state authorities—including the Massachusetts Department of Conservation and Recreation, the Boston Public Health Commission, the Massachusetts Water Resources Authority (MWRA), and the Massachusetts Port Authority (Massport)—play a role in the life of Bostonians. As the capital of Massachusetts, Boston plays a major role in state politics.'],
    ['In which year did the company that made SS.11 end?', 'NUMMI. New United Motor Manufacturing, Inc. (NUMMI) was an automobile manufacturing company in Fremont, California, jointly owned by General Motors and Toyota that opened in 1984 and closed in 2010. On October 27, 2010, its former plant reopened as a 100% Tesla Motors-owned production facility, known as the Tesla Factory. The plant is located in the East Industrial area of Fremont between Interstate 880 and Interstate 680.'],
    ["What did Goa's country launch to send the mangalyaan to the planet where Arsia Chasmata was found?", 'Arsia Chasmata. Arsia Chasmata is a steep-sided depression located northeast of Arsia Mons in the Phoenicis Lacus quadrangle on Mars, located at 7.6° S and 119.3° W. It is 97\xa0km long and was named after an albedo name.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    "Who is the mother of Cyril Holland's father?",
    [
        'Cecilia of Normandy. Cecilia of Normandy (or Cecily; c. 1056 – 30 July 1126) is thought to be the eldest daughter of William the Conqueror and Matilda of Flanders. Her brothers were kings William II and Henry I of England. She was very close to her other brother, Robert Curthose, and was educated by the abbess Matilda.',
        '13 Hours: The Secret Soldiers of Benghazi. In 2012, Benghazi, Libya is named one of the most dangerous places in the world, and countries have pulled their diplomatic offices out of the country in fear of an attack by militants. The United States, however, still has a diplomatic compound (not an official consulate) open in the city. Less than a mile away is a CIA outpost called "The Annex", which is protected by a team of private military contractors from Global Response Staff (GRS). New to the detail is Jack Silva, who arrives in Benghazi and is picked up by Tyrone "Rone" Woods, commander of the GRS team and a personal friend of Silva. Arriving at the Annex, Silva is introduced to the rest of the GRS team and the CIA Chief of Station, who constantly gives the team strict reminders to never engage the citizens.',
        'Boston. In addition to city government, numerous commissions and state authorities—including the Massachusetts Department of Conservation and Recreation, the Boston Public Health Commission, the Massachusetts Water Resources Authority (MWRA), and the Massachusetts Port Authority (Massport)—play a role in the life of Bostonians. As the capital of Massachusetts, Boston plays a major role in state politics.',
        'NUMMI. New United Motor Manufacturing, Inc. (NUMMI) was an automobile manufacturing company in Fremont, California, jointly owned by General Motors and Toyota that opened in 1984 and closed in 2010. On October 27, 2010, its former plant reopened as a 100% Tesla Motors-owned production facility, known as the Tesla Factory. The plant is located in the East Industrial area of Fremont between Interstate 880 and Interstate 680.',
        'Arsia Chasmata. Arsia Chasmata is a steep-sided depression located northeast of Arsia Mons in the Phoenicis Lacus quadrangle on Mars, located at 7.6° S and 119.3° W. It is 97\xa0km long and was named after an albedo name.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
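The results returned by `model.rank` identify each passage by its position in the input list. A minimal sketch of mapping those results back to the passage text, assuming the `corpus_id` / `score` dictionary format shown in the comment above. The helper name `attach_passages`, the placeholder passages, and the scores are hypothetical and are not real model output.

```python
from typing import Dict, List, Optional, Tuple, Union

RankResult = Dict[str, Union[int, float]]


def attach_passages(
    ranks: List[RankResult],
    passages: List[str],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Pair each ranked result with the passage it refers to, best score first."""
    # Sort defensively by score, highest first.
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    if top_k is not None:
        ordered = ordered[:top_k]
    # corpus_id is the index of the passage in the list that was ranked.
    return [(passages[int(r["corpus_id"])], float(r["score"])) for r in ordered]


# Hypothetical values for illustration only.
example_passages = ["passage A", "passage B", "passage C"]
example_ranks = [
    {"corpus_id": 0, "score": 0.07},
    {"corpus_id": 2, "score": 0.91},
    {"corpus_id": 1, "score": 0.02},
]
top = attach_passages(example_ranks, example_passages, top_k=2)
```

Keeping the lookup separate from the model call makes it easy to reuse the same passage list for several queries.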

Evaluation

Metrics

Cross Encoder Binary Classification

Metric              validation  train_subset
accuracy            0.9244      0.798
accuracy_threshold  0.0406      0.2206
f1                  0.9207      0.7837
f1_threshold        0.0406      0.0406
precision           0.9679      0.7186
recall              0.8779      0.8618
average_precision   0.9717      0.8546
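As a quick sanity check, F1 is the harmonic mean of precision and recall, so the reported f1 values can be recomputed from the precision and recall figures above. The sketch below does this in plain Python; the function and variable names are illustrative, and small differences are expected because the inputs are rounded to four decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the metrics reported above.
reported = {
    "validation": {"precision": 0.9679, "recall": 0.8779, "f1": 0.9207},
    "train_subset": {"precision": 0.7186, "recall": 0.8618, "f1": 0.7837},
}

computed = {
    split: f1_score(m["precision"], m["recall"]) for split, m in reported.items()
}

for split, value in computed.items():
    print(f"{split}: computed F1 = {value:.4f}, reported = {reported[split]['f1']:.4f}")
```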

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,366 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    Column      Type    Min             Mean               Max
    sentence_0  string  31 characters   97.76 characters   242 characters
    sentence_1  string  125 characters  591.39 characters  1906 characters
    label       float   0.0             0.53               1.0
  • Samples:
    sentence_0: Who is the mother of Cyril Holland's father?
    sentence_1: Cecilia of Normandy. Cecilia of Normandy (or Cecily; c. 1056 – 30 July 1126) is thought to be the eldest daughter of William the Conqueror and Matilda of Flanders. Her brothers were kings William II and Henry I of England. She was very close to her other brother, Robert Curthose, and was educated by the abbess Matilda.
    label: 0.0

    sentence_0: Did the the agency David Rossi works for keep files on Elvis Presley?
    sentence_1: 13 Hours: The Secret Soldiers of Benghazi. In 2012, Benghazi, Libya is named one of the most dangerous places in the world, and countries have pulled their diplomatic offices out of the country in fear of an attack by militants. The United States, however, still has a diplomatic compound (not an official consulate) open in the city. Less than a mile away is a CIA outpost called "The Annex", which is protected by a team of private military contractors from Global Response Staff (GRS). New to the detail is Jack Silva, who arrives in Benghazi and is picked up by Tyrone "Rone" Woods, commander of the GRS team and a personal friend of Silva. Arriving at the Annex, Silva is introduced to the rest of the GRS team and the CIA Chief of Station, who constantly gives the team strict reminders to never engage the citizens.
    label: 0.0

    sentence_0: How many murders were there in 2015 in the city that is the capital of the state where Wellesley College is in Mona Lisa Smile?
    sentence_1: Boston. In addition to city government, numerous commissions and state authorities—including the Massachusetts Department of Conservation and Recreation, the Boston Public Health Commission, the Massachusetts Water Resources Authority (MWRA), and the Massachusetts Port Authority (Massport)—play a role in the life of Bostonians. As the capital of Massachusetts, Boston plays a major role in state politics.
    label: 1.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
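The `Identity` activation listed above suggests that the loss is applied to raw logits rather than to probabilities. The sketch below shows, in plain Python, the numerically stable form of binary cross-entropy on a single logit. It illustrates the formula under that assumption and is not the library's implementation; the function names are hypothetical, and `pos_weight` is omitted because it is null in the configuration above.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit.

    Equivalent to -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))],
    rearranged as max(x, 0) - x*y + log(1 + exp(-|x|)) for stability.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A confident, correct prediction gives a small loss; a confident, wrong one a large loss.
loss_correct = bce_with_logits(4.0, 1.0)
loss_wrong = bce_with_logits(4.0, 0.0)
```

Working on logits and applying the sigmoid only at inference time avoids taking the logarithm of values that have been rounded to 0 or 1.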
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch   Step  validation_average_precision  train_subset_average_precision
0.0214  100   0.9844                        0.8233
0.0427  200   0.9819                        0.8052
0.0641  300   0.9728                        0.8689
0.0854  400   0.9717                        0.8546
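The Epoch column can be related to the Step column through the dataset size and batch size reported earlier. A minimal sketch, assuming training on a single device (the device count is not stated in this card) with `per_device_train_batch_size` 2 and `gradient_accumulation_steps` 1; the variable names are illustrative.

```python
import math

train_samples = 9366          # from the training dataset section
per_device_batch_size = 2     # per_device_train_batch_size
grad_accum_steps = 1          # gradient_accumulation_steps
num_devices = 1               # assumption: not stated in the card
num_train_epochs = 3

effective_batch = per_device_batch_size * grad_accum_steps * num_devices
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs


def epoch_at(step: int) -> float:
    """Fraction of an epoch completed after the given optimizer step."""
    return step / steps_per_epoch


logged = {step: round(epoch_at(step), 4) for step in (100, 200, 300, 400)}
print(steps_per_epoch, total_steps, logged)
```

Under these assumptions the computed fractions match the logged Epoch values, which indicates that the logs shown cover roughly the first 8.5% of one epoch.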

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.2.0
  • Transformers: 4.44.2
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}