How to use andrewma5/harvard-loop-reranker with sentence-transformers:
from sentence_transformers import CrossEncoder
model = CrossEncoder("andrewma5/harvard-loop-reranker")
query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]
scores = model.predict([(query, passage) for passage in passages])
print(scores)

This is a Cross Encoder model fine-tuned from cross-encoder/ms-marco-MiniLM-L6-v2 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("andrewma5/harvard-loop-reranker")
# Get scores for pairs of texts
pairs = [
    ['The item is a promotional display featuring a variety of phone cases, including solid blue cases, cases with artistic designs, and one showcasing a kitten wearing a Santa hat.', 'A black phone case.'],
    ['It was a black umbrella with a loop.', 'A new, mustard-yellow, waffle-knit long-sleeved henley shirt features a three-button placket, a chest pocket with a "Custom Supply" label, and an "L.O.G.G." tag at the neckline.'],
    ['A white sneaker with black, pink, and silver accents.', 'A blue backpack has an orange and white front with black straps.'],
    ['Oh, that sleek white TYESO tumbler with the silver top, I was just about to try it out for keeping my coffee warm all day.', 'It is a white, metal TYESO brand vacuum-insulated bottle/mug with a silver rim and a black lid with a clear straw.'],
    ['It is a bright orange backpack with a small pink strawberry charm.', 'The medium-sized black backpack, likely made of nylon or a similar synthetic material, features a white rectangular tag with "MUSIC IS POWER" printed on it and appears to be in good condition.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'The item is a promotional display featuring a variety of phone cases, including solid blue cases, cases with artistic designs, and one showcasing a kitten wearing a Santa hat.',
    [
        'A black phone case.',
        'A new, mustard-yellow, waffle-knit long-sleeved henley shirt features a three-button placket, a chest pocket with a "Custom Supply" label, and an "L.O.G.G." tag at the neckline.',
        'A blue backpack has an orange and white front with black straps.',
        'It is a white, metal TYESO brand vacuum-insulated bottle/mug with a silver rim and a black lid with a clear straw.',
        'The medium-sized black backpack, likely made of nylon or a similar synthetic material, features a white rectangular tag with "MUSIC IS POWER" printed on it and appears to be in good condition.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
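
The rank output can be read as the predict scores sorted from highest to lowest, with each entry keeping the index of the text it refers to. A minimal sketch of that relationship, assuming made-up scores (they are not outputs of this model) and a hypothetical helper named rank_from_scores:

```python
def rank_from_scores(scores):
    """Return rank-style entries sorted from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Hypothetical scores for five candidate texts (not real model outputs).
example_scores = [-2.1, -6.3, -5.8, 1.4, -4.9]
ranking = rank_from_scores(example_scores)

# Keep only the two best candidates, as a reranking step typically would.
top_ids = [entry["corpus_id"] for entry in ranking[:2]]
print(top_ids)
```

In practice the scores would come from model.predict, and the corpus_id values index back into the list of candidate texts.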
Evaluation on the eval dataset with CEBinaryClassificationEvaluator:

| Metric | Value |
|---|---|
| accuracy | 0.8988 |
| accuracy_threshold | 0.1037 |
| f1 | 0.8318 |
| f1_threshold | -0.4537 |
| precision | 0.7978 |
| recall | 0.8688 |
| average_precision | 0.9072 |
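
The two thresholds in the table are cut-offs on the model's scores: a pair counts as a match when its score lies above the threshold. The negative f1_threshold suggests the scores are unbounded logits rather than probabilities, although the card does not say so explicitly. The sketch below applies the reported accuracy threshold to made-up scores and labels, and checks that the reported F1 agrees with the reported precision and recall. The helper names are assumptions for illustration, not part of sentence-transformers.

```python
ACCURACY_THRESHOLD = 0.1037  # accuracy_threshold from the table above


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classify(scores, threshold):
    """Label a pair 1 when its score is above the threshold, else 0."""
    return [1 if s > threshold else 0 for s in scores]


# Hypothetical scores and gold labels, for illustration only.
scores = [2.3, -1.7, 0.4, -0.2]
labels = [1, 0, 1, 1]
predictions = classify(scores, ACCURACY_THRESHOLD)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recompute F1 from the reported precision (0.7978) and recall (0.8688).
reported_f1 = f1_from(0.7978, 0.8688)
print(predictions, accuracy, round(reported_f1, 4))
```

Using f1_threshold instead of accuracy_threshold lowers the cut-off, which trades some precision for recall.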
Columns: sentence_0, sentence_1, and label

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |

Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| The item is a promotional display featuring a variety of phone cases, including solid blue cases, cases with artistic designs, and one showcasing a kitten wearing a Santa hat. | A black phone case. | 0.0 |
| It was a black umbrella with a loop. | A new, mustard-yellow, waffle-knit long-sleeved henley shirt features a three-button placket, a chest pocket with a "Custom Supply" label, and an "L.O.G.G." tag at the neckline. | 0.0 |
| A white sneaker with black, pink, and silver accents. | A blue backpack has an orange and white front with black straps. | 0.0 |

Loss: BinaryCrossEntropyLoss with these parameters:
{
    "activation_fn": "torch.nn.modules.linear.Identity",
    "pos_weight": null
}
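
With an identity activation and no pos_weight, this loss should amount to standard binary cross-entropy computed directly on the model's raw logits. A minimal pure-Python sketch of that formula in its numerically stable form follows. The function names are assumptions for illustration; the library itself computes the loss with PyTorch.

```python
import math


def bce_with_logits(logit, label):
    """Binary cross-entropy for one raw logit and a 0/1 (float) label.

    Uses max(x, 0) - x * y + log(1 + exp(-|x|)), which avoids overflow
    for large positive or negative logits.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce(logits, labels):
    """Average the per-pair loss over a batch."""
    losses = [bce_with_logits(x, y) for x, y in zip(logits, labels)]
    return sum(losses) / len(losses)


# Hypothetical logits; the 0.0 labels mirror non-matching pairs.
print(mean_bce([-3.0, -1.5, 0.5], [0.0, 0.0, 0.0]))
```

A confident negative logit on a 0.0 label contributes almost nothing to the loss, while a positive logit on the same label is penalized roughly in proportion to its size.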

Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | eval_average_precision |
|---|---|---|---|
| 0.0701 | 500 | 0.414 | 0.8339 |
| 0.1402 | 1000 | 0.3334 | 0.8344 |
| 0.2103 | 1500 | 0.2989 | 0.8549 |
| 0.2803 | 2000 | 0.2984 | 0.8596 |
| 0.3504 | 2500 | 0.2921 | 0.8707 |
| 0.4205 | 3000 | 0.2882 | 0.8734 |
| 0.4906 | 3500 | 0.2831 | 0.8802 |
| 0.5607 | 4000 | 0.2878 | 0.8828 |
| 0.6308 | 4500 | 0.2651 | 0.8857 |
| 0.7009 | 5000 | 0.2693 | 0.8854 |
| 0.7710 | 5500 | 0.2731 | 0.8876 |
| 0.8410 | 6000 | 0.2666 | 0.8905 |
| 0.9111 | 6500 | 0.2594 | 0.8925 |
| 0.9812 | 7000 | 0.2631 | 0.8956 |
| 1.0 | 7134 | - | 0.8921 |
| 1.0513 | 7500 | 0.2434 | 0.8955 |
| 1.1214 | 8000 | 0.2374 | 0.8969 |
| 1.1915 | 8500 | 0.2197 | 0.8962 |
| 1.2616 | 9000 | 0.2487 | 0.8980 |
| 1.3317 | 9500 | 0.2406 | 0.8990 |
| 1.4017 | 10000 | 0.2384 | 0.8995 |
| 1.4718 | 10500 | 0.2339 | 0.9021 |
| 1.5419 | 11000 | 0.2292 | 0.9034 |
| 1.6120 | 11500 | 0.2214 | 0.9046 |
| 1.6821 | 12000 | 0.2264 | 0.9049 |
| 1.7522 | 12500 | 0.2384 | 0.9058 |
| 1.8223 | 13000 | 0.2309 | 0.9072 |
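
The hyperparameters and the log together pin down the learning-rate schedule: a linear scheduler with no warmup, a peak learning rate of 5e-05, 3 epochs, and 7134 steps per epoch (the step logged at epoch 1.0). A rough sketch of the resulting decay, assuming the run was planned for the full 3 epochs and that the rate decays linearly to zero. The function name linear_lr is hypothetical.

```python
PEAK_LR = 5e-05          # learning_rate
STEPS_PER_EPOCH = 7134   # step logged at epoch 1.0
NUM_EPOCHS = 3           # num_train_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (none here) followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print step, learning rate, and the fractional epoch as shown in the log.
for step in (0, 7134, 13000, TOTAL_STEPS):
    print(step, f"{linear_lr(step):.3e}", round(step / STEPS_PER_EPOCH, 4))
```

The fractional epoch column in the log is simply the step divided by the steps per epoch, which is why step 500 appears as epoch 0.0701.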
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Base model: microsoft/MiniLM-L12-H384-uncased