Paper: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
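The Pooling and Normalize stages listed above (mean over tokens, then unit-length scaling) can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: `mean_pool_and_normalize` and the toy inputs are hypothetical, and in practice the token vectors would come from the Transformer module.

```python
import math

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average the vectors of non-padding tokens, then scale to unit length.

    token_embeddings: list of token vectors (each a list of floats)
    attention_mask:   list of 1 (real token) / 0 (padding), same length
    """
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    # Mean pooling: average each dimension over the kept tokens only.
    pooled = [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
    # Normalize: divide by the Euclidean norm so the result has length 1.
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled] if norm > 0 else pooled

# Toy example with 2-dimensional vectors (the real model uses 384 dimensions).
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]  # last position is padding
mask = [1, 1, 0]
sentence_vector = mean_pool_and_normalize(tokens, mask)
print(sentence_vector)  # [0.6, 0.8]
```

Because the output has unit length, the cosine similarity between two such vectors equals their dot product.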
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("krishmajumdar/arxiv-finetuned-v2")
# Run inference
sentences = [
'<S> the effect of a random phase diffuser on fluctuations of laser light ( scintillations ) is studied . </S> <S> not only spatial but also temporal phase variations introduced by the phase diffuser are analyzed . </S> <S> the explicit dependence of the scintillation index on finite - time phase variations is obtained for long propagation paths . </S> <S> it is shown that for large amplitudes of phase fluctuations , a finite - time effect decreases the ability of phase diffuser to suppress the scintillations . </S>',
'operators @xmath67 ( their dependence on time is as in vacuum ) . the term for @xmath68 can be obtained from eq . [ twelve ] by putting @xmath69 . substituting both distribution functions into eq . [ eight ] , we obtain @xmath70 @xmath71 @xmath72:\\big>,\\ ] ] where @xmath73 and @xmath74 are solutions of eqs . [ twelve ] with the initial conditions @xmath63 and @xmath75 , respectively . the operators on the right side of eq . [ thirteen ] are related through matching conditions with the amplitudes of the exiting laser radiation ( see ref . @xcite ) by the relation @xmath76 where @xmath77 is the operator of the laser field which is assumed to be a single - mode field and the subscript ( @xmath78 ) means perpendicular to the @xmath28-axis component . the function @xmath79 describes the profile of the laser mode , which is assumed to be gaussian - type function [ @xmath80 . @xmath1 desribes the initial radius of the beam . to account for the effect of the phase diffuser , a factor @xmath81 or @xmath82 should be inserted into the integrand of eq . [ fourteen ] . the quantity @xmath83 is the random phase introduced by the phase diffuser . a similar consideration is applicable to each of four photon operators entering both terms in square brackets of eq . [ thirteen ] . it can be easily seen that the factor @xmath84},\\ ] ] describing the effect of phase screen on the beam , enters implicitly the integrand of eq . [ thirteen ] ( the indices @xmath78 are omitted here for the sake of brevity ) . there are integrations over variables @xmath85 as shown in eq . [ fourteen ] . furthermore , the brackets @xmath16 ,',
'that the candidate is detected with s / n @xmath136 in the unaffected image and also s / n @xmath137 in the image affected by the bad pixel . hence , we are confident that the source is real and that the photometry from the final drizzled image is robust . the sixth and final candidate is confidently detected at s / n@xmath138 in @xmath46 ( @xmath120 ) , and also in the @xmath38 with s / n = 3.7 . its photometric redshift is sharply peaked at @xmath139 , with a secondary solution at @xmath140 . this candidate is also very compact , with measured half - light radius @xmath141 , and the highest stellarity of the sample ( class_star = 0.91 ) . combining compactness with high stellarity from a high s / n source , a stellar nature ( cool dwarf ) for this source is relatively likely , as we discuss in section [ contamination ] . to translate the results on the search of possible candidates at @xmath3 from the archival borg[z8 ] data into a number density / luminosity function determination , we need to assess both the impact of contamination in our sample , and the effective volume probed by the data . there are multiple classes of lower-@xmath24 sources that may have similar @xmath103 colors to @xmath19 lyman - break galaxies ( lbgs ) , such as galactic stars , intermediate - redshift passive galaxies , and strong line emitters . cool , red stars in the milky way may be possible contaminants of our sample , although typical colors lack a strong @xmath103 drop . at low signal - to - noise ratio , the separation of point - like galactic stars from resolved galaxies using the ` sextractor ` class_star',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 0.5745, -0.0369],
# [ 0.5745, 1.0000, -0.0618],
# [-0.0369, -0.0618, 1.0000]])
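One way to read the similarity matrix is to ask, for each input, which other input scores highest. The sketch below does that in plain Python using the scores printed above; `most_similar_other` is a hypothetical helper, not part of the library. With a real tensor you would likely convert it first (for example with `similarities.tolist()`).

```python
def most_similar_other(similarities):
    """For each row, return the index of the highest-scoring *other* item."""
    best = []
    for i, row in enumerate(similarities):
        # Skip the diagonal: every item has similarity 1.0 with itself.
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        best.append(max(candidates)[1])
    return best

# Scores copied from the printed output above.
scores = [
    [1.0000, 0.5745, -0.0369],
    [0.5745, 1.0000, -0.0618],
    [-0.0369, -0.0618, 1.0000],
]
print(most_similar_other(scores))  # [1, 0, 0]
```

The first two inputs (the phase-diffuser abstract and the article excerpt on the same topic) select each other, while the third, astronomy excerpt has near-zero similarity to both.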
Columns: abstract and article

| | abstract | article |
|---|---|---|
| type | string | string |

| abstract | article |
|---|---|
| | additive models @xcite provide an important family of models for semiparametric regression or classification . some reasons for the success of additive models are their increased flexibility when compared to linear or generalized linear models and their increased interpretability when compared to fully nonparametric models . it is well - known that good estimators in additive models are in general less prone to the curse of high dimensionality than good estimators in fully nonparametric models . many examples of such estimators belong to the large class of regularized kernel based methods over a reproducing kernel hilbert space @xmath0 , see e.g. @xcite . in the last years many interesting results on learning rates of regularized kernel based models for additive models have been published when the focus is on sparsity and when the classical least squares loss function is used , see e.g. @xcite , @xcite , @xcite , @xcite , @xcite , @xcite and the references therein . of course , the lea... |
| | e.g. @xcite for the general case and @xcite for additive models . therefore , we will here consider the case of regularized kernel based methods based on a general convex and lipschitz continuous loss function , on a general kernel , and on the classical regularizing term @xmath1 for some @xmath2 which is a smoothness penalty but not a sparsity penalty , see e.g. @xcite . such regularized kernel based methods are now often called support vector machines ( svms ) , although the notation was historically used for such methods based on the special hinge loss function and for special kernels only , we refer to @xcite . in this paper we address the open question , whether an svm with an additive kernel can provide a substantially better learning rate in high dimensions than an svm with a general kernel , say a classical gaussian rbf kernel , if the assumption of an additive model is satisfied . our leading example covers learning rates for quantile regression based on the lipschitz continuo... |
| | approach might be to fit both models and compare their risks evaluated for test data . for the same reason we will also not cover sparsity . consistency of support vector machines generated by additive kernels for additive models was considered in @xcite . in this paper we establish learning rates for these algorithms . let us recall the framework with a complete separable metric space @xmath3 as the input space and a closed subset @xmath4 of @xmath5 as the output space . a borel probability measure @xmath6 on @xmath7 is used to model the learning problem and an independent and identically distributed sample @xmath8 is drawn according to @xmath6 for learning . a loss function @xmath9 is used to measure the quality of a prediction function @xmath10 by the local error @xmath11 . _ throughout the paper we assume that @xmath12 is measurable , @xmath13 , convex with respect to the third variable , and uniformly lipschitz continuous satisfying @xmath14 with a finite constant @xmath15 . _ sup... |

MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
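The idea behind this loss, with the `scale` of 20.0 and cosine similarity configured above, can be sketched in plain Python: each anchor should score its own positive higher than every other positive in the batch, which act as negatives. This is an illustrative sketch, not the library's code. `cos_sim` and `in_batch_ranking_loss` are hypothetical helpers, and treating the abstract as the anchor and the article as the positive is an assumption based on the column order.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for
    anchors[i] and all other positives in the batch serve as negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: two (anchor, positive) pairs in 2 dimensions.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive points the same way as its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # positives assigned to the wrong anchors
print(in_batch_ranking_loss(anchors, aligned))  # close to zero
print(in_batch_ranking_loss(anchors, swapped))  # large
```

The scale multiplies the cosine scores before the softmax, so a larger value sharpens the distribution over in-batch candidates.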
Non-default hyperparameters:
- per_device_train_batch_size: 32
- gradient_accumulation_steps: 2
- warmup_ratio: 0.05
- save_only_model: True
- fp16: True

All hyperparameters:
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 2
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.05
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: True
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0104 | 100 | 0.8589 |
| 0.0208 | 200 | 0.5171 |
| 0.0312 | 300 | 0.4745 |
| 0.0416 | 400 | 0.4498 |
| 0.0520 | 500 | 0.4105 |
| 0.0624 | 600 | 0.394 |
| 0.0729 | 700 | 0.3896 |
| 0.0833 | 800 | 0.3788 |
| 0.0937 | 900 | 0.3561 |
| 0.1041 | 1000 | 0.3662 |
| 0.1145 | 1100 | 0.3419 |
| 0.1249 | 1200 | 0.3256 |
| 0.1353 | 1300 | 0.3337 |
| 0.1457 | 1400 | 0.335 |
| 0.1561 | 1500 | 0.3255 |
| 0.1665 | 1600 | 0.3099 |
| 0.1769 | 1700 | 0.3092 |
| 0.1873 | 1800 | 0.2985 |
| 0.1978 | 1900 | 0.2931 |
| 0.2082 | 2000 | 0.2977 |
| 0.2186 | 2100 | 0.2918 |
| 0.2290 | 2200 | 0.2856 |
| 0.2394 | 2300 | 0.2835 |
| 0.2498 | 2400 | 0.2689 |
| 0.2602 | 2500 | 0.2743 |
| 0.2706 | 2600 | 0.2504 |
| 0.2810 | 2700 | 0.2423 |
| 0.2914 | 2800 | 0.2717 |
| 0.3018 | 2900 | 0.2653 |
| 0.3122 | 3000 | 0.2543 |
| 0.3226 | 3100 | 0.256 |
| 0.3331 | 3200 | 0.2555 |
| 0.3435 | 3300 | 0.2485 |
| 0.3539 | 3400 | 0.243 |
| 0.3643 | 3500 | 0.2339 |
| 0.3747 | 3600 | 0.2447 |
| 0.3851 | 3700 | 0.2311 |
| 0.3955 | 3800 | 0.2245 |
| 0.4059 | 3900 | 0.2276 |
| 0.4163 | 4000 | 0.2243 |
| 0.4267 | 4100 | 0.2225 |
| 0.4371 | 4200 | 0.2391 |
| 0.4475 | 4300 | 0.2162 |
| 0.4580 | 4400 | 0.2194 |
| 0.4684 | 4500 | 0.2291 |
| 0.4788 | 4600 | 0.2307 |
| 0.4892 | 4700 | 0.2141 |
| 0.4996 | 4800 | 0.2124 |
| 0.5100 | 4900 | 0.2306 |
| 0.5204 | 5000 | 0.2075 |
| 0.5308 | 5100 | 0.2055 |
| 0.5412 | 5200 | 0.2294 |
| 0.5516 | 5300 | 0.2165 |
| 0.5620 | 5400 | 0.2165 |
| 0.5724 | 5500 | 0.1957 |
| 0.5828 | 5600 | 0.1971 |
| 0.5933 | 5700 | 0.1935 |
| 0.6037 | 5800 | 0.2077 |
| 0.6141 | 5900 | 0.1931 |
| 0.6245 | 6000 | 0.1987 |
| 0.6349 | 6100 | 0.1983 |
| 0.6453 | 6200 | 0.1889 |
| 0.6557 | 6300 | 0.1894 |
| 0.6661 | 6400 | 0.195 |
| 0.6765 | 6500 | 0.1936 |
| 0.6869 | 6600 | 0.1811 |
| 0.6973 | 6700 | 0.1835 |
| 0.7077 | 6800 | 0.2028 |
| 0.7182 | 6900 | 0.1904 |
| 0.7286 | 7000 | 0.1853 |
| 0.7390 | 7100 | 0.1646 |
| 0.7494 | 7200 | 0.1904 |
| 0.7598 | 7300 | 0.181 |
| 0.7702 | 7400 | 0.176 |
| 0.7806 | 7500 | 0.1746 |
| 0.7910 | 7600 | 0.1846 |
| 0.8014 | 7700 | 0.1706 |
| 0.8118 | 7800 | 0.1692 |
| 0.8222 | 7900 | 0.1696 |
| 0.8326 | 8000 | 0.171 |
| 0.0104 | 100 | 0.2682 |
| 0.0208 | 200 | 0.1698 |
| 0.0312 | 300 | 0.1492 |
| 0.0416 | 400 | 0.1597 |
| 0.0520 | 500 | 0.1421 |
| 0.0624 | 600 | 0.1412 |
| 0.0729 | 700 | 0.1367 |
| 0.0833 | 800 | 0.1407 |
| 0.0937 | 900 | 0.1276 |
| 0.1041 | 1000 | 0.1352 |
| 0.1145 | 1100 | 0.1307 |
| 0.1249 | 1200 | 0.1188 |
| 0.1353 | 1300 | 0.1211 |
| 0.1457 | 1400 | 0.1203 |
| 0.1561 | 1500 | 0.1131 |
| 0.1665 | 1600 | 0.1077 |
| 0.1769 | 1700 | 0.1061 |
| 0.1873 | 1800 | 0.1064 |
| 0.1978 | 1900 | 0.1016 |
| 0.2082 | 2000 | 0.1066 |
| 0.2186 | 2100 | 0.1077 |
| 0.2290 | 2200 | 0.1009 |
| 0.2394 | 2300 | 0.1048 |
| 0.2498 | 2400 | 0.0925 |
| 0.2602 | 2500 | 0.1054 |
| 0.2706 | 2600 | 0.0873 |
| 0.2810 | 2700 | 0.082 |
| 0.2914 | 2800 | 0.0976 |
| 0.3018 | 2900 | 0.097 |
| 0.3122 | 3000 | 0.0876 |
| 0.3226 | 3100 | 0.0959 |
| 0.3331 | 3200 | 0.0931 |
| 0.3435 | 3300 | 0.0903 |
| 0.3539 | 3400 | 0.0854 |
| 0.3643 | 3500 | 0.0841 |
| 0.3747 | 3600 | 0.0914 |
| 0.3851 | 3700 | 0.0809 |
| 0.3955 | 3800 | 0.0798 |
| 0.4059 | 3900 | 0.0847 |
| 0.4163 | 4000 | 0.0784 |
| 0.4267 | 4100 | 0.0837 |
| 0.4371 | 4200 | 0.092 |
| 0.4475 | 4300 | 0.0794 |
| 0.4580 | 4400 | 0.0811 |
| 0.4684 | 4500 | 0.0844 |
| 0.4788 | 4600 | 0.092 |
| 0.4892 | 4700 | 0.0743 |
| 0.4996 | 4800 | 0.0839 |
| 0.5100 | 4900 | 0.0939 |
| 0.5204 | 5000 | 0.0789 |
| 0.5308 | 5100 | 0.0769 |
| 0.5412 | 5200 | 0.0936 |
| 0.5516 | 5300 | 0.085 |
| 0.5620 | 5400 | 0.0857 |
| 0.5724 | 5500 | 0.0731 |
| 0.5828 | 5600 | 0.0766 |
| 0.5933 | 5700 | 0.078 |
| 0.6037 | 5800 | 0.0812 |
| 0.6141 | 5900 | 0.0731 |
| 0.6245 | 6000 | 0.0783 |
| 0.6349 | 6100 | 0.075 |
| 0.6453 | 6200 | 0.0734 |
| 0.6557 | 6300 | 0.0725 |
| 0.6661 | 6400 | 0.0796 |
| 0.6765 | 6500 | 0.0748 |
| 0.6869 | 6600 | 0.0722 |
| 0.6973 | 6700 | 0.0705 |
| 0.7077 | 6800 | 0.0831 |
| 0.7182 | 6900 | 0.0787 |
| 0.7286 | 7000 | 0.0779 |
| 0.7390 | 7100 | 0.0641 |
| 0.7494 | 7200 | 0.0795 |
| 0.7598 | 7300 | 0.0712 |
| 0.7702 | 7400 | 0.0698 |
| 0.7806 | 7500 | 0.068 |
| 0.7910 | 7600 | 0.0729 |
| 0.8014 | 7700 | 0.0693 |
| 0.8118 | 7800 | 0.0719 |
| 0.8222 | 7900 | 0.0735 |
| 0.8326 | 8000 | 0.073 |
| 0.8430 | 8100 | 0.1425 |
| 0.8535 | 8200 | 0.1422 |
| 0.8639 | 8300 | 0.1336 |
| 0.8743 | 8400 | 0.1448 |
| 0.8847 | 8500 | 0.1421 |
| 0.8951 | 8600 | 0.143 |
| 0.9055 | 8700 | 0.1299 |
| 0.9159 | 8800 | 0.1337 |
| 0.9263 | 8900 | 0.138 |
| 0.9367 | 9000 | 0.1417 |
| 0.9471 | 9100 | 0.1266 |
| 0.9575 | 9200 | 0.1187 |
| 0.9679 | 9300 | 0.1454 |
| 0.9784 | 9400 | 0.1322 |
| 0.9888 | 9500 | 0.137 |
| 0.9992 | 9600 | 0.1452 |
| 1.0096 | 9700 | 0.0936 |
| 1.0200 | 9800 | 0.0986 |
| 1.0304 | 9900 | 0.1021 |
| 1.0408 | 10000 | 0.1004 |
| 1.0512 | 10100 | 0.0954 |
| 1.0616 | 10200 | 0.1004 |
| 1.0720 | 10300 | 0.0974 |
| 1.0824 | 10400 | 0.0939 |
| 1.0928 | 10500 | 0.1039 |
| 1.1032 | 10600 | 0.111 |
| 1.1137 | 10700 | 0.0993 |
| 1.1241 | 10800 | 0.0975 |
| 1.1345 | 10900 | 0.0939 |
| 1.1449 | 11000 | 0.1042 |
| 1.1553 | 11100 | 0.0984 |
| 1.1657 | 11200 | 0.1008 |
| 1.1761 | 11300 | 0.0977 |
| 1.1865 | 11400 | 0.0881 |
| 1.1969 | 11500 | 0.0971 |
| 1.2073 | 11600 | 0.0909 |
| 1.2177 | 11700 | 0.0938 |
| 1.2281 | 11800 | 0.0933 |
| 1.2386 | 11900 | 0.1035 |
| 1.2490 | 12000 | 0.0931 |
| 1.2594 | 12100 | 0.1053 |
| 1.2698 | 12200 | 0.1043 |
| 1.2802 | 12300 | 0.0935 |
| 1.2906 | 12400 | 0.0928 |
| 1.3010 | 12500 | 0.0969 |
| 1.3114 | 12600 | 0.0901 |
| 1.3218 | 12700 | 0.0992 |
| 1.3322 | 12800 | 0.0978 |
| 1.3426 | 12900 | 0.0901 |
| 1.3530 | 13000 | 0.0835 |
| 1.3634 | 13100 | 0.0914 |
| 1.3739 | 13200 | 0.0922 |
| 1.3843 | 13300 | 0.0923 |
| 1.3947 | 13400 | 0.0917 |
| 1.4051 | 13500 | 0.089 |
| 1.4155 | 13600 | 0.0903 |
| 1.4259 | 13700 | 0.0913 |
| 1.4363 | 13800 | 0.093 |
| 1.4467 | 13900 | 0.0909 |
| 1.4571 | 14000 | 0.0906 |
| 1.4675 | 14100 | 0.0903 |
| 1.4779 | 14200 | 0.0946 |
| 1.4883 | 14300 | 0.0933 |
| 1.4988 | 14400 | 0.0898 |
| 1.5092 | 14500 | 0.088 |
| 1.5196 | 14600 | 0.0961 |
| 1.5300 | 14700 | 0.0887 |
| 1.5404 | 14800 | 0.0858 |
| 1.5508 | 14900 | 0.0878 |
| 1.5612 | 15000 | 0.092 |
| 1.5716 | 15100 | 0.0857 |
| 1.5820 | 15200 | 0.0878 |
| 1.5924 | 15300 | 0.0856 |
| 1.6028 | 15400 | 0.0887 |
| 1.6132 | 15500 | 0.0837 |
| 1.6236 | 15600 | 0.0832 |
| 1.6341 | 15700 | 0.083 |
| 1.6445 | 15800 | 0.0906 |
| 1.6549 | 15900 | 0.0844 |
| 1.6653 | 16000 | 0.085 |
| 1.6757 | 16100 | 0.0837 |
| 1.6861 | 16200 | 0.0826 |
| 1.6965 | 16300 | 0.0867 |
| 1.7069 | 16400 | 0.0902 |
| 1.7173 | 16500 | 0.0864 |
| 1.7277 | 16600 | 0.0882 |
| 1.7381 | 16700 | 0.0894 |
| 1.7485 | 16800 | 0.0902 |
| 1.7590 | 16900 | 0.0813 |
| 1.7694 | 17000 | 0.0821 |
| 1.7798 | 17100 | 0.0863 |
| 1.7902 | 17200 | 0.0828 |
| 1.8006 | 17300 | 0.0902 |
| 1.8110 | 17400 | 0.0831 |
| 1.8214 | 17500 | 0.0765 |
| 1.8318 | 17600 | 0.0806 |
| 1.8422 | 17700 | 0.0793 |
| 1.8526 | 17800 | 0.0842 |
| 1.8630 | 17900 | 0.0828 |
| 1.8734 | 18000 | 0.085 |
| 1.8838 | 18100 | 0.0803 |
| 1.8943 | 18200 | 0.0772 |
| 1.9047 | 18300 | 0.0865 |
| 1.9151 | 18400 | 0.0847 |
| 1.9255 | 18500 | 0.0835 |
| 1.9359 | 18600 | 0.0818 |
| 1.9463 | 18700 | 0.0757 |
| 1.9567 | 18800 | 0.0772 |
| 1.9671 | 18900 | 0.0854 |
| 1.9775 | 19000 | 0.0813 |
| 1.9879 | 19100 | 0.0844 |
| 1.9983 | 19200 | 0.0793 |
| 2.0087 | 19300 | 0.0668 |
| 2.0192 | 19400 | 0.0647 |
| 2.0296 | 19500 | 0.0702 |
| 2.0400 | 19600 | 0.0703 |
| 2.0504 | 19700 | 0.0641 |
| 2.0608 | 19800 | 0.0768 |
| 2.0712 | 19900 | 0.0632 |
| 2.0816 | 20000 | 0.0633 |
| 2.0920 | 20100 | 0.0608 |
| 2.1024 | 20200 | 0.0684 |
| 2.1128 | 20300 | 0.0618 |
| 2.1232 | 20400 | 0.063 |
| 2.1336 | 20500 | 0.0625 |
| 2.1440 | 20600 | 0.0631 |
| 2.1545 | 20700 | 0.0681 |
| 2.1649 | 20800 | 0.0584 |
| 2.1753 | 20900 | 0.0655 |
| 2.1857 | 21000 | 0.0651 |
| 2.1961 | 21100 | 0.0699 |
| 2.2065 | 21200 | 0.0704 |
| 2.2169 | 21300 | 0.0686 |
| 2.2273 | 21400 | 0.0655 |
| 2.2377 | 21500 | 0.063 |
| 2.2481 | 21600 | 0.0657 |
| 2.2585 | 21700 | 0.0694 |
| 2.2689 | 21800 | 0.066 |
| 2.2794 | 21900 | 0.0677 |
| 2.2898 | 22000 | 0.0617 |
| 2.3002 | 22100 | 0.0612 |
| 2.3106 | 22200 | 0.06 |
| 2.3210 | 22300 | 0.0572 |
| 2.3314 | 22400 | 0.0642 |
| 2.3418 | 22500 | 0.0601 |
| 2.3522 | 22600 | 0.0581 |
| 2.3626 | 22700 | 0.0702 |
| 2.3730 | 22800 | 0.0614 |
| 2.3834 | 22900 | 0.0631 |
| 2.3938 | 23000 | 0.0586 |
| 2.4042 | 23100 | 0.0638 |
| 2.4147 | 23200 | 0.0584 |
| 2.4251 | 23300 | 0.068 |
| 2.4355 | 23400 | 0.0681 |
| 2.4459 | 23500 | 0.0616 |
| 2.4563 | 23600 | 0.0604 |
| 2.4667 | 23700 | 0.0618 |
| 2.4771 | 23800 | 0.0603 |
| 2.4875 | 23900 | 0.0643 |
| 2.4979 | 24000 | 0.0639 |
| 2.5083 | 24100 | 0.0656 |
| 2.5187 | 24200 | 0.0578 |
| 2.5291 | 24300 | 0.0613 |
| 2.5396 | 24400 | 0.061 |
| 2.5500 | 24500 | 0.0578 |
| 2.5604 | 24600 | 0.059 |
| 2.5708 | 24700 | 0.0586 |
| 2.5812 | 24800 | 0.0532 |
| 2.5916 | 24900 | 0.0547 |
| 2.6020 | 25000 | 0.0596 |
| 2.6124 | 25100 | 0.0614 |
| 2.6228 | 25200 | 0.0547 |
| 2.6332 | 25300 | 0.056 |
| 2.6436 | 25400 | 0.0578 |
| 2.6540 | 25500 | 0.0611 |
| 2.6644 | 25600 | 0.0605 |
| 2.6749 | 25700 | 0.062 |
| 2.6853 | 25800 | 0.0601 |
| 2.6957 | 25900 | 0.0618 |
| 2.7061 | 26000 | 0.055 |
| 2.7165 | 26100 | 0.0614 |
| 2.7269 | 26200 | 0.0553 |
| 2.7373 | 26300 | 0.0587 |
| 2.7477 | 26400 | 0.0629 |
| 2.7581 | 26500 | 0.0559 |
| 2.7685 | 26600 | 0.0559 |
| 2.7789 | 26700 | 0.0533 |
| 2.7893 | 26800 | 0.0591 |
| 2.7998 | 26900 | 0.0526 |
| 2.8102 | 27000 | 0.0548 |
| 2.8206 | 27100 | 0.0562 |
| 2.8310 | 27200 | 0.0577 |
| 2.8414 | 27300 | 0.0611 |
| 2.8518 | 27400 | 0.0565 |
| 2.8622 | 27500 | 0.0627 |
| 2.8726 | 27600 | 0.0604 |
| 2.8830 | 27700 | 0.0578 |
| 2.8934 | 27800 | 0.0564 |
| 2.9038 | 27900 | 0.0591 |
| 2.9142 | 28000 | 0.0566 |
| 2.9246 | 28100 | 0.0541 |
| 2.9351 | 28200 | 0.0544 |
| 2.9455 | 28300 | 0.0598 |
| 2.9559 | 28400 | 0.0592 |
| 2.9663 | 28500 | 0.0559 |
| 2.9767 | 28600 | 0.0578 |
| 2.9871 | 28700 | 0.055 |
| 2.9975 | 28800 | 0.0509 |
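The schedule settings from the hyperparameters (learning_rate: 5e-05, lr_scheduler_type: linear, warmup_ratio: 0.05, per_device_train_batch_size: 32, gradient_accumulation_steps: 2) can be turned into rough numbers. The sketch below is a plain-Python approximation of a linear warmup and decay schedule, not the trainer's own code. `linear_schedule_lr` is a hypothetical helper, and the total of 28,800 steps is an assumption taken from the last step logged above; the true total may differ slightly.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.05):
    """Learning rate with linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    decay_steps = max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / decay_steps)

# Assumption: 28,800 total optimizer steps (the last step logged above).
TOTAL_STEPS = 28_800

# Sequences per optimizer step on one device: batch size x accumulation steps.
effective_batch_size = 32 * 2

for step in (0, 720, 1_440, 14_400, 28_800):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
print("effective batch size per device:", effective_batch_size)
```

Under these assumptions the learning rate peaks near step 1,440 and then falls linearly to zero. With an in-batch-negatives loss, gradient accumulation typically does not enlarge the pool of negatives, which stays tied to the per-device batch of 32.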
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: sentence-transformers/all-MiniLM-L6-v2