How to use Gonalb/flucold-ft-v2 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Gonalb/flucold-ft-v2")
sentences = [
"QUESTION #2: What percentage of patients in the study reported experiencing \"chills\" and \"feverish discomfort\"?",
"been proven superior. Annual influenza vaccination is recommended for all people six months and older who do not have \ncontraindications. ( Am Fam Physician. 2019; 100(12):751-758. Copyright © 2019 American Academy of Family Physicians.)\nBEST PRACTICES IN INFECTIOUS DISEASE \nRecommendations from the Choosing \nWisely Campaign\nRecommendation Sponsoring organization\nDo not routinely avoid \ninfluenza vaccination in \negg-allergic patients.\nAmerican Academy of Allergy, \nAsthma, and Immunology\nSource: For more information on the Choosing Wisely Campaign,",
"Review\n722 Vol 5 November 2005\naccompanied by fever and some subjects have a transient\nfall in body temperature during the early stages of\ncommon cold. In a study of 272 patients with sore throat\nassociated with URTIs, the mean aural temperature was\n36·8ºC and around 35% of these patients said they were\nsuffering from “chills” and “feverish discomfort”.49 The\nsensation of chilliness may be unrelated to any change in\nskin or body temperature. In a study of human\nvolunteers, a sensation of chill still develops on\nadministration of exogenous pyrogen even though the",
"ered when the results will modify management or when a \npatient with signs or symptoms of influenza is hospitalized.19 \nTABLE 2\nComplications of Influenza\nCardiovascular 26\nCerebrovascular accidents\nIschemic heart disease\nMyocarditis\nHematologic 26\nHemolytic uremic syndrome\nHemophagocytic syndrome\nThrombotic thrombocytope -\nnic purpura\nMusculoskeletal 19,26\nMyositis\nRhabdomyolysis\nNeurologic 26\nAcute disseminated \nencephalomyelitis\nEncephalitis\nGuillain-Barré syndrome\nPostinfluenza encephalopathy \n(neurologic symptoms occur -\nring after resolution but within"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
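The module listing above amounts to three steps: encode the tokens with BERT, keep the vector of the first ([CLS]) token because `pooling_mode_cls_token` is True, and L2-normalize the result. The following is a minimal pure-Python sketch of the last two steps on a toy 4-dimensional token matrix. The helper names `cls_pool` and `l2_normalize` are illustrative and are not part of sentence-transformers.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Rescale a vector to unit Euclidean length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy token matrix: 3 tokens x 4 dimensions (the real model uses 1024 dimensions).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output is unit length, the dot product of two embeddings equals their cosine similarity.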
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Gonalb/flucold-ft-v2")
# Run inference
sentences = [
'QUESTION #2: How does the sneeze centre in the brainstem coordinate the actions involved in sneezing?',
'causes sneezing.23 The trigeminal nerves relay\ninformation to the sneeze centre in the brainstem and\ncause reflex activation of motor and parasympathetic\nbranches of the facial nerve and activate respiratory\nmuscles. A model of the sneeze reflex is illustrated in\nfigure 1. The sneeze centre coordinates the inspiratory\nand expiratory actions of sneezing via respiratory\nmuscles, and lacrimation and nasal congestion via\nparasympathetic branches of the facial nerve. The eyes\nare always closed during sneezing by activation of facial\nmuscles, indicating a close relation between the',
'stroke, seizure disorder, dementia)\nAsthma or other chronic pulmonary disease\nChronic kidney disease\nChronic liver disease\nHeart disease (acquired or congenital)\nImmunosuppression (e.g., HIV infection, cancer, transplant \nrecipients, use of immunosuppressive medications)\nLong-term aspirin therapy in patients younger than 19 years\nMetabolic disorders (acquired [e.g., diabetes mellitus] or \ninherited [e.g., mitochondrial disorders])\nMorbid obesity\nSickle cell anemia and other hemoglobinopathies\nSpecial groups\nAdults 65 years and older\nAmerican Indians and Alaska Natives',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
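Because the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64 (see the loss parameters further down), the leading components of each 1024-dimensional vector are intended to be usable on their own. A rough sketch of how truncated embeddings might be compared, using toy 8-dimensional vectors in place of real `model.encode` output; the helper names and the values are illustrative assumptions.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine_similarity(a, b):
    """For unit-length vectors, cosine similarity is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy stand-ins for embeddings (illustrative values only, not model output).
query = [0.5, 0.5, 0.1, 0.1, 0.4, 0.4, 0.3, 0.3]
relevant = [0.6, 0.4, 0.2, 0.0, 0.1, 0.5, 0.2, 0.4]
unrelated = [-0.5, 0.5, 0.3, -0.3, 0.2, -0.2, 0.4, -0.4]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    scores = [
        cosine_similarity(q, truncate_and_normalize(p, dim))
        for p in (relevant, unrelated)
    ]
    print(dim, [round(s, 3) for s in scores])
```

Re-normalizing after truncation matters, since a slice of a unit vector is generally no longer unit length. Recent versions of sentence-transformers also accept a `truncate_dim` argument when loading the model, which may be more convenient; check the version you have installed.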
InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.6122 |
| cosine_accuracy@3 | 0.8878 |
| cosine_accuracy@5 | 0.9388 |
| cosine_accuracy@10 | 0.9898 |
| cosine_precision@1 | 0.6122 |
| cosine_precision@3 | 0.2959 |
| cosine_precision@5 | 0.1878 |
| cosine_precision@10 | 0.099 |
| cosine_recall@1 | 0.6122 |
| cosine_recall@3 | 0.8878 |
| cosine_recall@5 | 0.9388 |
| cosine_recall@10 | 0.9898 |
| cosine_ndcg@10 | 0.8165 |
| cosine_mrr@10 | 0.7593 |
| cosine_map@100 | 0.76 |
InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.61 |
| cosine_accuracy@3 | 0.86 |
| cosine_accuracy@5 | 0.91 |
| cosine_accuracy@10 | 0.98 |
| cosine_precision@1 | 0.61 |
| cosine_precision@3 | 0.2867 |
| cosine_precision@5 | 0.182 |
| cosine_precision@10 | 0.098 |
| cosine_recall@1 | 0.61 |
| cosine_recall@3 | 0.86 |
| cosine_recall@5 | 0.91 |
| cosine_recall@10 | 0.98 |
| cosine_ndcg@10 | 0.8057 |
| cosine_mrr@10 | 0.749 |
| cosine_map@100 | 0.7505 |
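The metrics in the tables above can be reproduced from the rank at which the relevant passage is retrieved for each query. The sketch below assumes exactly one relevant passage per query, which is an inference from the numbers rather than something the source states: under that assumption accuracy@k equals recall@k and precision@k equals accuracy@k divided by k, which matches the tables (for example 0.8878 / 3 ≈ 0.2959). The function name and the toy ranks are illustrative.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    `ranks` holds, per query, the 1-based rank of its single relevant
    passage, or None if that passage was not retrieved at all.
    """
    n = len(ranks)
    hit_ranks = [r for r in ranks if r is not None and r <= k]
    hits = len(hit_ranks)
    return {
        "accuracy": hits / n,          # share of queries with a hit in the top k
        "precision": hits / (k * n),   # relevant results among all k * n returned
        "recall": hits / n,            # equals accuracy when there is one relevant passage
        "mrr": sum(1.0 / r for r in hit_ranks) / n,
    }

# Five toy queries: relevant passage found at ranks 1, 2, 1, 4, and not at all.
ranks = [1, 2, 1, 4, None]
metrics_at_3 = retrieval_metrics(ranks, 3)
print(metrics_at_3)
```

With these toy ranks, three of five queries have a hit in the top 3, so accuracy@3 and recall@3 are 0.6 and precision@3 is 0.2.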
sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |

| sentence_0 | sentence_1 |
|---|---|
| What should individuals with asthma do if they experience flu symptoms? | People with asthma who get flu symptoms should call their health care provider right |
| What causes asthma attacks to occur in individuals with asthma? | People with asthma who get flu symptoms should call their health care provider right |
| QUESTION #1: How long are people with RSV typically contagious? | second birthday. However, repeat infections may occur throughout life. |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
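To illustrate what these parameters mean, here is a simplified pure-Python sketch of the idea: an in-batch-negatives ranking loss is computed on embeddings truncated to each Matryoshka dimension, and the per-dimension losses are combined as a weighted sum. This is not the sentence-transformers implementation. The function names, the toy 4-dimensional vectors, the toy dimensions `[4, 2]`, and the similarity scale of 20.0 are assumptions made for the example.

```python
import math

def normalize(v):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ranking_loss(queries, passages, scale=20.0):
    """Cross-entropy where passage i is the positive for query i and
    every other passage in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * dot(q, p) for p in passages]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)

def matryoshka_loss(queries, passages, dims, weights):
    """Weighted sum of the ranking loss over truncated, re-normalized embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        q = [normalize(v[:dim]) for v in queries]
        p = [normalize(v[:dim]) for v in passages]
        total += weight * ranking_loss(q, p)
    return total

# Toy batch of two (query, passage) pairs with 4-dimensional embeddings.
queries = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
passages = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]

aligned = matryoshka_loss(queries, passages, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(queries, passages[::-1], dims=[4, 2], weights=[1, 1])
print(aligned < swapped)  # True
```

With equal weights, as in the configuration above, every truncation length contributes equally, so the leading components are pushed to carry as much ranking signal as the longer prefixes.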
- eval_strategy: steps
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- num_train_epochs: 10
- multi_dataset_batch_sampler: round_robin

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss | cosine_ndcg@10 |
|---|---|---|---|
| 1.0 | 40 | - | 0.8359 |
| 1.25 | 50 | - | 0.8312 |
| 2.0 | 80 | - | 0.8304 |
| 2.5 | 100 | - | 0.8156 |
| 3.0 | 120 | - | 0.8016 |
| 3.75 | 150 | - | 0.7952 |
| 4.0 | 160 | - | 0.7880 |
| 5.0 | 200 | - | 0.8021 |
| 6.0 | 240 | - | 0.8215 |
| 6.25 | 250 | - | 0.8286 |
| 7.0 | 280 | - | 0.8079 |
| 7.5 | 300 | - | 0.8043 |
| 8.0 | 320 | - | 0.8126 |
| 8.75 | 350 | - | 0.8099 |
| 9.0 | 360 | - | 0.8126 |
| 10.0 | 400 | - | 0.8165 |
| 0.6173 | 50 | - | 0.8138 |
| 1.0 | 81 | - | 0.8158 |
| 1.2346 | 100 | - | 0.7932 |
| 1.8519 | 150 | - | 0.7989 |
| 2.0 | 162 | - | 0.7866 |
| 2.4691 | 200 | - | 0.8012 |
| 3.0 | 243 | - | 0.7803 |
| 3.0864 | 250 | - | 0.7969 |
| 3.7037 | 300 | - | 0.8030 |
| 4.0 | 324 | - | 0.7993 |
| 4.3210 | 350 | - | 0.7848 |
| 4.9383 | 400 | - | 0.7852 |
| 5.0 | 405 | - | 0.7814 |
| 5.5556 | 450 | - | 0.7975 |
| 6.0 | 486 | - | 0.7846 |
| 6.1728 | 500 | 0.314 | 0.7925 |
| 6.7901 | 550 | - | 0.7994 |
| 7.0 | 567 | - | 0.8069 |
| 7.4074 | 600 | - | 0.8048 |
| 8.0 | 648 | - | 0.8063 |
| 8.0247 | 650 | - | 0.8062 |
| 8.6420 | 700 | - | 0.7992 |
| 9.0 | 729 | - | 0.8115 |
| 9.2593 | 750 | - | 0.8118 |
| 9.8765 | 800 | - | 0.8057 |
| 10.0 | 810 | - | 0.8057 |
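The Epoch and Step columns are tied together by the number of optimizer steps per epoch. The log above appears to contain two runs, one with 40 steps per epoch and one with 81. The sketch below checks a few fractional epochs against that reading and derives the range of training-set sizes it would imply, given `per_device_train_batch_size: 10`, `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`. It assumes a single training device; the dataset sizes themselves are not stated in the source, and the helper names are illustrative.

```python
def epoch_at(step, steps_per_epoch):
    """Fractional epoch reached after `step` optimizer steps, rounded as in the log."""
    return round(step / steps_per_epoch, 4)

def sample_count_range(steps_per_epoch, batch_size):
    """Smallest and largest dataset size that yields this many steps per epoch
    when the last partial batch is kept (drop_last is False)."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size

for steps_per_epoch in (40, 81):
    print(
        steps_per_epoch,
        epoch_at(50, steps_per_epoch),
        sample_count_range(steps_per_epoch, 10),
    )
```

Under these assumptions the first run would have used between 391 and 400 training pairs and the second between 801 and 810.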
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
Snowflake/snowflake-arctic-embed-l