SentenceTransformer based on reasonir/ReasonIR-8B

This is a sentence-transformers model fine-tuned from reasonir/ReasonIR-8B. It maps sentences and paragraphs to a 4096-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: reasonir/ReasonIR-8B
Maximum Sequence Length: 131072 tokens
Output Dimensionality: 4096 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 131072, 'do_lower_case': False, 'architecture': 'ReasonIRModel'})
  (1): Pooling({'word_embedding_dimension': 4096, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': False})
  (2): Normalize()
)
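
The Pooling and Normalize modules listed above can be illustrated with a small NumPy sketch. It is not the library's implementation: the function name `mean_pool_and_normalize` and the toy 4-dimensional inputs are assumptions made for illustration (the real model produces 4096-dimensional vectors). With `include_prompt` set to False, prompt tokens would presumably be masked out of the average in the same way as padding.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mean-pool token vectors over unmasked positions, then L2-normalize.

    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len); 1 = keep, 0 = ignore
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input: 2 sequences, 3 token positions, 4 dimensions.
tokens = np.array([
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]],
    [[0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 4.0, 0.0], [0.0, 0.0, 6.0, 0.0]],
])
mask = np.array([[1, 1, 0], [1, 1, 1]])  # third token of the first row is padding
embeddings = mean_pool_and_normalize(tokens, mask)
print(embeddings.shape)
```

Because the output vectors have unit length, the cosine similarity between two embeddings reduces to their dot product.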

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("shahafvl/reasonir-8b-scientific-parent-prompt")
# Run inference
sentences = [
    'Running document analytics pipelines can be highly time-consuming, particularly as the underlying corpora expand at a rapid pace. In addition, these workloads typically demand substantial storage capacity and main memory. A widely adopted strategy to alleviate the storage pressure is to apply data compression.',
    'Data-intensive analytics workloads are often both computationally expensive and demanding in terms of storage and memory resources, with costs escalating as datasets grow. One prevalent method for alleviating storage and memory pressure is to store data in compressed form.',
    'Endmember variability in spectral unmixing encompasses both extrinsic factors, such as sensor geometry and illumination, and intrinsic factors related to the physical and chemical properties of the materials. While the former can often be approximated with analytical radiative transfer models, the latter is generally too complex to describe explicitly, motivating the adoption of statistical distributions, mixture models, or stochastic processes to model endmember spectra in high-dimensional spaces.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 4096)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8138, 0.0302],
#         [0.8138, 1.0000, 0.0370],
#         [0.0302, 0.0370, 1.0000]])
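
The training hyperparameters in this card list a prompt for the `anchor` column only, so it may help to prepend the same prompt to query-side texts at inference time. The sketch below assumes a loaded `SentenceTransformer` is passed in as `model`; the helper names `encode_anchors` and `encode_candidates` are hypothetical, and whether the prompt improves results for a given use case is not confirmed by the card.

```python
# Prompt string copied from the card's training hyperparameters.
ANCHOR_PROMPT = (
    "Retrieve the broader scientific generalization or context "
    "for the given specific text.\nQuery: "
)

def encode_anchors(model, texts):
    """Encode query-side texts with the training-time anchor prompt prepended.

    `model` is expected to be a loaded SentenceTransformer; its `encode`
    method accepts a `prompt` string that is prepended to every input.
    """
    return model.encode(texts, prompt=ANCHOR_PROMPT)

def encode_candidates(model, texts):
    """Encode candidate passages without a prompt (none was set for `positive`)."""
    return model.encode(texts)
```

A possible usage, with `queries` and `passages` as placeholder lists of strings, is `model.similarity(encode_anchors(model, queries), encode_candidates(model, passages))`.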

Evaluation

Metrics

Information Retrieval

Dataset: ir_parent_grandparent_combined
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	1.0
cosine_accuracy@3	1.0
cosine_accuracy@5	1.0
cosine_accuracy@10	1.0
cosine_precision@1	1.0
cosine_precision@3	1.0
cosine_precision@5	0.987
cosine_precision@10	0.5949
cosine_recall@1	0.1667
cosine_recall@3	0.5001
cosine_recall@5	0.8225
cosine_recall@10	0.9916
cosine_ndcg@10	0.9924
cosine_mrr@10	1.0
cosine_map@100	0.9886
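
To help read the table, here is a minimal sketch of how precision@k and recall@k are commonly computed for a single query. The document IDs and the choice of six relevant documents are hypothetical. The card does not state how many relevant documents each query has, but six per query would be consistent with recall@1 = 0.1667 and precision@10 close to 0.6.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

# Hypothetical query: six relevant documents, all ranked ahead of the rest.
relevant = {f"d{i}" for i in range(6)}
ranking = [f"d{i}" for i in range(20)]

print(precision_at_k(ranking, relevant, 10))        # 0.6
print(round(recall_at_k(ranking, relevant, 1), 4))  # 0.1667
print(recall_at_k(ranking, relevant, 3))            # 0.5
```

Under this assumption, precision@10 cannot exceed 0.6 even with a perfect ranking, so the drop from precision@5 to precision@10 reflects the number of relevant documents rather than ranking errors.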

Triplet

Datasets: orig_vs_parent, orig_vs_grandparent, sib_vs_parent and sib_vs_grandparent
Evaluated with TripletEvaluator

Metric	orig_vs_parent	orig_vs_grandparent	sib_vs_parent	sib_vs_grandparent
cosine_accuracy	0.99	0.01	0.9896	0.0104
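
Triplet accuracy is usually the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The NumPy sketch below illustrates that definition on toy 2-dimensional vectors; the function names are hypothetical, and the card does not say which text plays the positive or negative role in each of the four datasets.

```python
import numpy as np

def rowwise_cosine(a, b):
    """Cosine similarity between matching rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def triplet_accuracy(anchors, positives, negatives):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    pos_sim = rowwise_cosine(anchors, positives)
    neg_sim = rowwise_cosine(anchors, negatives)
    return float(np.mean(pos_sim > neg_sim))

# Toy data: the first triplet is ranked correctly, the second is not.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[1.0, 0.1], [1.0, 0.0]])
negatives = np.array([[0.0, 1.0], [0.0, 1.0]])
print(triplet_accuracy(anchors, positives, negatives))  # 0.5
```

Under this definition, an accuracy near 0 means the text labeled as the negative is almost always closer to the anchor than the text labeled as the positive.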

Training Details

Training Dataset

Unnamed Dataset

Size: 161,986 training samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:
	anchor	positive
type	string	string
details	min: 41 tokens mean: 77.92 tokens max: 120 tokens	min: 43 tokens mean: 70.07 tokens max: 108 tokens

Samples:

anchor	positive
`LLZO is stable against Li metal and against a high voltage cathode. However, oxides generally require high sintering temperatures to remove grain boundaries to achieve the reported conductivity values. They also tend to be brittle, which makes it harder (relative to the sulfides) to maintain solid-solid interfacial contact and also to process.`	`Oxide-based solid electrolytes often show excellent electrochemical stability with both lithium metal anodes and high-voltage cathodes, but exploiting their full ionic conductivity generally requires aggressive high-temperature sintering to suppress grain boundary resistance. The resulting brittle ceramic bodies can be challenging to process and to integrate mechanically, particularly when maintaining intimate solid–solid interfacial contact with electrodes is critical.`
`LLZO is stable against Li metal and against a high voltage cathode. However, oxides generally require high sintering temperatures to remove grain boundaries to achieve the reported conductivity values. They also tend to be brittle, which makes it harder (relative to the sulfides) to maintain solid-solid interfacial contact and also to process.`	`Oxide-based solid electrolytes often show excellent electrochemical stability with both lithium metal anodes and high-voltage cathodes, but exploiting their full ionic conductivity generally requires aggressive high-temperature sintering to suppress grain boundary resistance. The resulting brittle ceramic bodies can be challenging to process and to integrate mechanically, particularly when maintaining intimate solid–solid interfacial contact with electrodes is critical.`
`LLZO is stable against Li metal and against a high voltage cathode. However, oxides generally require high sintering temperatures to remove grain boundaries to achieve the reported conductivity values. They also tend to be brittle, which makes it harder (relative to the sulfides) to maintain solid-solid interfacial contact and also to process.`	`Oxide-based solid electrolytes often show excellent electrochemical stability with both lithium metal anodes and high-voltage cathodes, but exploiting their full ionic conductivity generally requires aggressive high-temperature sintering to suppress grain boundary resistance. The resulting brittle ceramic bodies can be challenging to process and to integrate mechanically, particularly when maintaining intimate solid–solid interfacial contact with electrodes is critical.`

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
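
As a rough illustration of what these parameters mean, the following NumPy sketch computes an in-batch ranking loss of this kind: every other positive in the batch serves as a negative for a given anchor, cosine similarities are multiplied by `scale`, and cross-entropy is taken with the matching pair as the target. It is a simplified sketch, not the library's code, and the function name is hypothetical.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; targets lie on the diagonal."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: perfectly aligned pairs versus pairs shifted by one row.
aligned = np.eye(4)
shifted = np.roll(aligned, 1, axis=0)
loss_aligned = multiple_negatives_ranking_loss(aligned, aligned)
loss_shifted = multiple_negatives_ranking_loss(aligned, shifted)
print(loss_aligned < loss_shifted)  # True
```

With a per-device batch size of 8, as listed in the hyperparameters, each anchor would be contrasted against 7 in-batch negatives.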

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
learning_rate: 1e-05
num_train_epochs: 1
warmup_ratio: 0.05
bf16: True
disable_tqdm: True
load_best_model_at_end: True
push_to_hub: True
hub_model_id: shahafvl/reasonir-8b-scientific-parent-prompt
hub_private_repo: False
auto_find_batch_size: True
prompts: {'anchor': 'Retrieve the broader scientific generalization or context for the given specific text.\nQuery: '}
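
The schedule implied by these settings can be worked out with a little arithmetic. The sketch below assumes a single device, the per-device batch size of 8 listed under All Hyperparameters, and the usual linear warmup followed by linear decay. Those assumptions appear consistent with the training log, where step 100 corresponds to epoch 0.0049, but the card does not state the device count explicitly.

```python
import math

NUM_SAMPLES = 161_986   # training set size from the card
BATCH_SIZE = 8          # per_device_train_batch_size; single device assumed
EPOCHS = 1
WARMUP_RATIO = 0.05
PEAK_LR = 1e-5

total_steps = math.ceil(NUM_SAMPLES / BATCH_SIZE) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def learning_rate(step):
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)

print(total_steps)    # 20249
print(warmup_steps)   # 1013
print(learning_rate(warmup_steps))
```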

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 1e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.05
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: True
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: True
resume_from_checkpoint: None
hub_model_id: shahafvl/reasonir-8b-scientific-parent-prompt
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: True
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: {'anchor': 'Retrieve the broader scientific generalization or context for the given specific text.\nQuery: '}
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

Training Logs

Epoch	Step	Training Loss	ir_parent_grandparent_combined_cosine_ndcg@10	orig_vs_parent_cosine_accuracy	orig_vs_grandparent_cosine_accuracy	sib_vs_parent_cosine_accuracy	sib_vs_grandparent_cosine_accuracy
0.0049	100	0.0293	-	-	-	-	-
0.0099	200	0.003	-	-	-	-	-
0.0148	300	0.0036	-	-	-	-	-
0.0198	400	0.0006	-	-	-	-	-
0.0247	500	0.0004	-	-	-	-	-
0.0296	600	0.0004	-	-	-	-	-
0.0346	700	0.0022	-	-	-	-	-
0.0395	800	0.0002	-	-	-	-	-
0.0444	900	0.0001	-	-	-	-	-
0.0494	1000	0.0003	-	-	-	-	-
0.0543	1100	0.0028	-	-	-	-	-
0.0593	1200	0.002	-	-	-	-	-
0.0642	1300	0.0027	-	-	-	-	-
0.0691	1400	0.0003	-	-	-	-	-
0.0741	1500	0.0021	-	-	-	-	-
0.0790	1600	0.0001	-	-	-	-	-
0.0840	1700	0.0001	-	-	-	-	-
0.0889	1800	0.0018	-	-	-	-	-
0.0938	1900	0.0001	-	-	-	-	-
0.0988	2000	0.0057	-	-	-	-	-
0.1037	2100	0.0028	-	-	-	-	-
0.1086	2200	0.0027	-	-	-	-	-
0.1136	2300	0.0	-	-	-	-	-
0.1185	2400	0.0	-	-	-	-	-
0.1235	2500	0.0001	-	-	-	-	-
0.1284	2600	0.0045	-	-	-	-	-
0.1333	2700	0.0018	-	-	-	-	-
0.1383	2800	0.0037	-	-	-	-	-
0.1432	2900	0.0088	-	-	-	-	-
0.1482	3000	0.0069	-	-	-	-	-
0.1531	3100	0.0027	-	-	-	-	-
0.1580	3200	0.0001	-	-	-	-	-
0.1630	3300	0.0002	-	-	-	-	-
0.1679	3400	0.002	-	-	-	-	-
0.1728	3500	0.0001	-	-	-	-	-
0.1778	3600	0.0023	-	-	-	-	-
0.1827	3700	0.0001	-	-	-	-	-
0.1877	3800	0.0001	-	-	-	-	-
0.1926	3900	0.0001	-	-	-	-	-
0.1975	4000	0.0027	-	-	-	-	-
0.2025	4100	0.0	-	-	-	-	-
0.2074	4200	0.0027	-	-	-	-	-
0.2124	4300	0.0027	-	-	-	-	-
0.2173	4400	0.0027	-	-	-	-	-
0.2222	4500	0.0	-	-	-	-	-
0.2272	4600	0.0001	-	-	-	-	-
0.2321	4700	0.0	-	-	-	-	-
0.2370	4800	0.002	-	-	-	-	-
0.2420	4900	0.0069	-	-	-	-	-
0.2469	5000	0.0	-	-	-	-	-
0.2519	5100	0.0016	-	-	-	-	-
0.2568	5200	0.0001	-	-	-	-	-
0.2617	5300	0.0	-	-	-	-	-
0.2667	5400	0.0022	-	-	-	-	-
0.2716	5500	0.0018	-	-	-	-	-
0.2766	5600	0.0019	-	-	-	-	-
0.2815	5700	0.0	-	-	-	-	-
0.2864	5800	0.0	-	-	-	-	-
0.2914	5900	0.0018	-	-	-	-	-
0.2963	6000	0.0	-	-	-	-	-
0.3012	6100	0.0001	-	-	-	-	-
0.3062	6200	0.0	-	-	-	-	-
0.3111	6300	0.002	-	-	-	-	-
0.3161	6400	0.0021	-	-	-	-	-
0.3210	6500	0.0	-	-	-	-	-
0.3259	6600	0.0032	-	-	-	-	-
0.3309	6700	0.002	-	-	-	-	-
0.3358	6800	0.0018	-	-	-	-	-
0.3408	6900	0.0001	-	-	-	-	-
0.3457	7000	0.0018	-	-	-	-	-
0.3506	7100	0.0	-	-	-	-	-
0.3556	7200	0.0	-	-	-	-	-
0.3605	7300	0.0018	-	-	-	-	-
0.3655	7400	0.0	-	-	-	-	-
0.3704	7500	0.0052	0.9905	0.9900	0.0100	0.9892	0.0108
0.3753	7600	0.0036	-	-	-	-	-
0.3803	7700	0.0022	-	-	-	-	-
0.3852	7800	0.0	-	-	-	-	-
0.3901	7900	0.0	-	-	-	-	-
0.3951	8000	0.0025	-	-	-	-	-
0.4000	8100	0.0021	-	-	-	-	-
0.4050	8200	0.0	-	-	-	-	-
0.4099	8300	0.0	-	-	-	-	-
0.4148	8400	0.0	-	-	-	-	-
0.4198	8500	0.0001	-	-	-	-	-
0.4247	8600	0.0	-	-	-	-	-
0.4297	8700	0.0018	-	-	-	-	-
0.4346	8800	0.0	-	-	-	-	-
0.4395	8900	0.0001	-	-	-	-	-
0.4445	9000	0.0	-	-	-	-	-
0.4494	9100	0.0039	-	-	-	-	-
0.4543	9200	0.0042	-	-	-	-	-
0.4593	9300	0.0019	-	-	-	-	-
0.4642	9400	0.0023	-	-	-	-	-
0.4692	9500	0.0	-	-	-	-	-
0.4741	9600	0.0	-	-	-	-	-
0.4790	9700	0.0019	-	-	-	-	-
0.4840	9800	0.0	-	-	-	-	-
0.4889	9900	0.0019	-	-	-	-	-
0.4939	10000	0.0	-	-	-	-	-
0.4988	10100	0.0048	-	-	-	-	-
0.5037	10200	0.0	-	-	-	-	-
0.5087	10300	0.0018	-	-	-	-	-
0.5136	10400	0.0013	-	-	-	-	-
0.5185	10500	0.0043	-	-	-	-	-
0.5235	10600	0.0001	-	-	-	-	-
0.5284	10700	0.0	-	-	-	-	-
0.5334	10800	0.0	-	-	-	-	-
0.5383	10900	0.0	-	-	-	-	-
0.5432	11000	0.0027	-	-	-	-	-
0.5482	11100	0.0062	-	-	-	-	-
0.5531	11200	0.0001	-	-	-	-	-
0.5581	11300	0.0001	-	-	-	-	-
0.5630	11400	0.0001	-	-	-	-	-
0.5679	11500	0.0048	-	-	-	-	-
0.5729	11600	0.0	-	-	-	-	-
0.5778	11700	0.0012	-	-	-	-	-
0.5827	11800	0.0026	-	-	-	-	-
0.5877	11900	0.0037	-	-	-	-	-
0.5926	12000	0.0	-	-	-	-	-
0.5976	12100	0.0	-	-	-	-	-
0.6025	12200	0.0059	-	-	-	-	-
0.6074	12300	0.0	-	-	-	-	-
0.6124	12400	0.0039	-	-	-	-	-
0.6173	12500	0.003	-	-	-	-	-
0.6223	12600	0.0	-	-	-	-	-
0.6272	12700	0.0	-	-	-	-	-
0.6321	12800	0.0	-	-	-	-	-
0.6371	12900	0.0036	-	-	-	-	-
0.6420	13000	0.0071	-	-	-	-	-
0.6469	13100	0.0	-	-	-	-	-
0.6519	13200	0.0	-	-	-	-	-
0.6568	13300	0.0	-	-	-	-	-
0.6618	13400	0.0	-	-	-	-	-
0.6667	13500	0.0033	-	-	-	-	-
0.6716	13600	0.0	-	-	-	-	-
0.6766	13700	0.0	-	-	-	-	-
0.6815	13800	0.0001	-	-	-	-	-
0.6865	13900	0.0018	-	-	-	-	-
0.6914	14000	0.0001	-	-	-	-	-
0.6963	14100	0.0036	-	-	-	-	-
0.7013	14200	0.0015	-	-	-	-	-
0.7062	14300	0.0001	-	-	-	-	-
0.7111	14400	0.0074	-	-	-	-	-
0.7161	14500	0.0	-	-	-	-	-
0.7210	14600	0.0035	-	-	-	-	-
0.7260	14700	0.0016	-	-	-	-	-
0.7309	14800	0.0018	-	-	-	-	-
0.7358	14900	0.0022	-	-	-	-	-
0.7408	15000	0.0	0.9924	0.99	0.01	0.9896	0.0104
0.7457	15100	0.0023	-	-	-	-	-
0.7507	15200	0.0018	-	-	-	-	-
0.7556	15300	0.0	-	-	-	-	-
0.7605	15400	0.0	-	-	-	-	-
0.7655	15500	0.0036	-	-	-	-	-
0.7704	15600	0.0	-	-	-	-	-
0.7753	15700	0.0	-	-	-	-	-
0.7803	15800	0.0	-	-	-	-	-
0.7852	15900	0.0036	-	-	-	-	-
0.7902	16000	0.0018	-	-	-	-	-
0.7951	16100	0.0019	-	-	-	-	-
0.8000	16200	0.003	-	-	-	-	-
0.8050	16300	0.0034	-	-	-	-	-
0.8099	16400	0.0	-	-	-	-	-
0.8149	16500	0.0	-	-	-	-	-
0.8198	16600	0.006	-	-	-	-	-
0.8247	16700	0.0018	-	-	-	-	-
0.8297	16800	0.0029	-	-	-	-	-
0.8346	16900	0.0	-	-	-	-	-
0.8395	17000	0.0018	-	-	-	-	-
0.8445	17100	0.0014	-	-	-	-	-
0.8494	17200	0.0032	-	-	-	-	-
0.8544	17300	0.0	-	-	-	-	-
0.8593	17400	0.0063	-	-	-	-	-
0.8642	17500	0.0	-	-	-	-	-
0.8692	17600	0.0002	-	-	-	-	-
0.8741	17700	0.0026	-	-	-	-	-
0.8791	17800	0.0069	-	-	-	-	-
0.8840	17900	0.0026	-	-	-	-	-
0.8889	18000	0.0061	-	-	-	-	-
0.8939	18100	0.0	-	-	-	-	-
0.8988	18200	0.0026	-	-	-	-	-
0.9037	18300	0.0	-	-	-	-	-
0.9087	18400	0.0	-	-	-	-	-
0.9136	18500	0.0	-	-	-	-	-
0.9186	18600	0.0022	-	-	-	-	-
0.9235	18700	0.0	-	-	-	-	-
0.9284	18800	0.0	-	-	-	-	-
0.9334	18900	0.0016	-	-	-	-	-
0.9383	19000	0.0076	-	-	-	-	-
0.9433	19100	0.0	-	-	-	-	-
0.9482	19200	0.0026	-	-	-	-	-
0.9531	19300	0.0	-	-	-	-	-
0.9581	19400	0.0	-	-	-	-	-
0.9630	19500	0.0018	-	-	-	-	-
0.9679	19600	0.002	-	-	-	-	-
0.9729	19700	0.0042	-	-	-	-	-
0.9778	19800	0.0044	-	-	-	-	-
0.9828	19900	0.0	-	-	-	-	-
0.9877	20000	0.0	-	-	-	-	-
0.9926	20100	0.0	-	-	-	-	-
0.9976	20200	0.0018	-	-	-	-	-

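The bold row denotes the saved checkpoint. Bold formatting is not preserved in this rendering; the step 15000 row is the one whose values match the evaluation metrics reported above.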

Framework Versions

Python: 3.14.2
Sentence Transformers: 5.1.2
Transformers: 4.57.3
PyTorch: 2.9.1+cu128
Accelerate: 1.12.0
Datasets: 4.4.1
Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}