SentenceTransformer based on google-t5/t5-base

This is a sentence-transformers model finetuned from google-t5/t5-base on the all-nli dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google-t5/t5-base
  • Maximum Sequence Length: None (no fixed limit)
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': None, 'do_lower_case': False, 'architecture': 'T5EncoderModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
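The Pooling module is configured for mean pooling (pooling_mode_mean_tokens: True): the T5 encoder's token embeddings are averaged over non-padding positions to produce a single 768-dimensional sentence vector. An illustrative NumPy sketch of that operation (not the library's actual implementation):

```python
import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    # Average the token embeddings, ignoring padded positions.
    # token_embeddings: (seq_len, dim); attention_mask: (seq_len,) of 0/1
    mask = np.asarray(attention_mask, dtype=float)[:, None]
    emb = np.asarray(token_embeddings, dtype=float)
    return (emb * mask).sum(axis=0) / mask.sum()

tokens = np.array([[1.0, 2.0], [3.0, 4.0], [99.0, 99.0]])  # last row is padding
pooled = mean_pooling(tokens, [1, 1, 0])
print(pooled)  # [2. 3.]
```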

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/t5-base-no-mrl")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
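Because the similarity function is cosine similarity, model.similarity is equivalent to L2-normalizing the embeddings and taking pairwise dot products. A self-contained NumPy sketch with toy vectors (not real model output):

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    # Normalize rows to unit length; pairwise dot products are then cosines.
    normalized = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normalized @ normalized.T

emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sims = cosine_similarity_matrix(emb)
print(sims.shape)  # (3, 3)
```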

Evaluation

Metrics

Semantic Similarity

Metric           sts-dev   sts-test
pearson_cosine   0.845     0.8327
spearman_cosine  0.8464    0.8431
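spearman_cosine is the Spearman rank correlation between the model's cosine similarities and the gold STS annotations. For intuition, a minimal implementation that ignores ties (the actual evaluator handles ties properly):

```python
import numpy as np

def spearman_correlation(x, y):
    # Spearman rho = Pearson correlation of the ranks (ties ignored here).
    def ranks(v):
        order = np.argsort(v)
        r = np.empty(len(v), dtype=float)
        r[order] = np.arange(len(v))
        return r
    rx, ry = ranks(np.asarray(x)), ranks(np.asarray(y))
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx * rx).sum() * (ry * ry).sum()))

# Perfect monotonic agreement gives rho = 1.0
print(spearman_correlation([0.1, 0.5, 0.9], [1, 2, 3]))  # 1.0
```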

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
            anchor         positive       negative
    type    string         string         string
    min     6 tokens       5 tokens       4 tokens
    mean    9.96 tokens    12.79 tokens   14.02 tokens
    max     52 tokens      44 tokens      57 tokens
  • Samples:
    • anchor: A person on a horse jumps over a broken down airplane.
      positive: A person is outdoors, on a horse.
      negative: A person is at a diner, ordering an omelette.
    • anchor: Children smiling and waving at camera
      positive: There are children present
      negative: The kids are frowning
    • anchor: A boy is jumping on skateboard in the middle of a red bridge.
      positive: The boy does a skateboarding trick.
      negative: The boy skates down the sidewalk.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768
        ],
        "matryoshka_weights": [
            1
        ],
        "n_dims_per_step": -1
    }
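Because matryoshka_dims is just [768] (the model's full output dimension) with weight 1, the MatryoshkaLoss wrapper here effectively reduces to its inner MultipleNegativesRankingLoss. A rough NumPy sketch of that inner loss, which treats every other in-batch positive as a negative (illustrative only; scale=20.0 follows the library's default):

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    # In-batch negatives: for anchor i, positive i is the target and every
    # other positive j != i acts as a negative. Cross-entropy over scaled
    # cosine similarities.
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    scores = scale * normalize(anchors) @ normalize(positives).T
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))

# Perfectly matched pairs yield a near-zero loss
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[1.0, 0.0], [0.0, 1.0]])
print(multiple_negatives_ranking_loss(anchors, positives) < 0.01)  # True
```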
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
            anchor         positive       negative
    type    string         string         string
    min     5 tokens       4 tokens       4 tokens
    mean    19.41 tokens   9.69 tokens    10.35 tokens
    max     79 tokens      35 tokens      30 tokens
  • Samples:
    • anchor: Two women are embracing while holding to go packages.
      positive: Two woman are holding packages.
      negative: The men are fighting outside a deli.
    • anchor: Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.
      positive: Two kids in numbered jerseys wash their hands.
      negative: Two kids in jackets walk to school.
    • anchor: A man selling donuts to a customer during a world exhibition event held in the city of Angeles
      positive: A man selling donuts to a customer.
      negative: A woman drinks her coffee in a small cafe.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768
        ],
        "matryoshka_weights": [
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1
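With warmup_ratio: 0.1 and the linear scheduler, the learning rate ramps from 0 up to learning_rate (5e-05) over the first 10% of training steps, then decays linearly back to 0. A sketch of the resulting multiplier on the base learning rate (mirroring the usual linear-warmup schedule; not the trainer's internal code):

```python
def linear_schedule_with_warmup(step, total_steps, warmup_ratio=0.1):
    # LR multiplier: ramp linearly from 0 to 1 over the warmup steps,
    # then decay linearly back to 0 by the end of training.
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_schedule_with_warmup(100, 1000))   # 1.0
print(linear_schedule_with_warmup(1000, 1000))  # 0.0
```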

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.6510 -
0.0287 500 2.5308 1.3756 0.6760 -
0.0574 1000 2.1199 1.0631 0.7312 -
0.0860 1500 1.6381 0.8444 0.7651 -
0.1147 2000 1.3407 0.7479 0.7763 -
0.1434 2500 1.166 0.6835 0.7835 -
0.1721 3000 1.0809 0.6264 0.7883 -
0.2008 3500 1.0059 0.5808 0.7916 -
0.2294 4000 0.9292 0.5431 0.7959 -
0.2581 4500 0.8938 0.5134 0.7996 -
0.2868 5000 0.843 0.4859 0.8032 -
0.3155 5500 0.7958 0.4636 0.8064 -
0.3442 6000 0.7612 0.4427 0.8105 -
0.3729 6500 0.7471 0.4230 0.8132 -
0.4015 7000 0.7155 0.4045 0.8146 -
0.4302 7500 0.6817 0.3918 0.8169 -
0.4589 8000 0.633 0.3781 0.8197 -
0.4876 8500 0.6551 0.3675 0.8206 -
0.5163 9000 0.6403 0.3585 0.8213 -
0.5449 9500 0.6128 0.3475 0.8229 -
0.5736 10000 0.5791 0.3394 0.8250 -
0.6023 10500 0.5819 0.3298 0.8257 -
0.6310 11000 0.559 0.3249 0.8286 -
0.6597 11500 0.5545 0.3163 0.8273 -
0.6883 12000 0.5377 0.3094 0.8314 -
0.7170 12500 0.5308 0.3029 0.8291 -
0.7457 13000 0.5203 0.2968 0.8304 -
0.7744 13500 0.5244 0.2946 0.8284 -
0.8031 14000 0.5046 0.2893 0.8294 -
0.8318 14500 0.4901 0.2843 0.8292 -
0.8604 15000 0.4917 0.2810 0.8319 -
0.8891 15500 0.4915 0.2757 0.8350 -
0.9178 16000 0.4834 0.2711 0.8343 -
0.9465 16500 0.4731 0.2692 0.8343 -
0.9752 17000 0.4547 0.2671 0.8353 -
1.0038 17500 0.4487 0.2640 0.8399 -
1.0325 18000 0.4331 0.2658 0.8368 -
1.0612 18500 0.4304 0.2654 0.8350 -
1.0899 19000 0.429 0.2603 0.8366 -
1.1186 19500 0.4207 0.2591 0.8386 -
1.1472 20000 0.4236 0.2570 0.8425 -
1.1759 20500 0.4072 0.2527 0.8435 -
1.2046 21000 0.4196 0.2533 0.8430 -
1.2333 21500 0.4117 0.2503 0.8417 -
1.2620 22000 0.3988 0.2486 0.8446 -
1.2907 22500 0.4016 0.2465 0.8453 -
1.3193 23000 0.398 0.2428 0.8438 -
1.3480 23500 0.3993 0.2491 0.8462 -
1.3767 24000 0.3893 0.2446 0.8452 -
1.4054 24500 0.3784 0.2414 0.8482 -
1.4341 25000 0.3789 0.2411 0.8464 -
1.4627 25500 0.3822 0.2368 0.8457 -
1.4914 26000 0.3736 0.2371 0.8456 -
1.5201 26500 0.3598 0.2358 0.8446 -
1.5488 27000 0.3525 0.2368 0.8469 -
1.5775 27500 0.3747 0.2307 0.8478 -
1.6061 28000 0.3649 0.2357 0.8486 -
1.6348 28500 0.355 0.2327 0.8475 -
1.6635 29000 0.3501 0.2347 0.8446 -
1.6922 29500 0.347 0.2263 0.8499 -
1.7209 30000 0.3565 0.2320 0.8490 -
1.7496 30500 0.3432 0.2271 0.8489 -
1.7782 31000 0.3409 0.2257 0.8483 -
1.8069 31500 0.3522 0.2261 0.8510 -
1.8356 32000 0.3348 0.2270 0.8504 -
1.8643 32500 0.339 0.2199 0.8495 -
1.8930 33000 0.333 0.2219 0.8502 -
1.9216 33500 0.3103 0.2208 0.8514 -
1.9503 34000 0.338 0.2240 0.8541 -
1.9790 34500 0.3276 0.2232 0.8548 -
2.0077 35000 0.3161 0.2248 0.8512 -
2.0364 35500 0.2905 0.2262 0.8528 -
2.0650 36000 0.2928 0.2216 0.8538 -
2.0937 36500 0.2961 0.2175 0.8515 -
2.1224 37000 0.2926 0.2184 0.8543 -
2.1511 37500 0.2923 0.2155 0.8466 -
2.1798 38000 0.2963 0.2201 0.8474 -
2.2085 38500 0.2853 0.2165 0.8492 -
2.2371 39000 0.2778 0.2212 0.8483 -
2.2658 39500 0.2889 0.2170 0.8532 -
2.2945 40000 0.2683 0.2201 0.8499 -
2.3232 40500 0.2812 0.2139 0.8548 -
2.3519 41000 0.2781 0.2193 0.8513 -
2.3805 41500 0.2848 0.2224 0.8482 -
2.4092 42000 0.2725 0.2232 0.8499 -
2.4379 42500 0.2727 0.2209 0.8525 -
2.4666 43000 0.2747 0.2254 0.8535 -
2.4953 43500 0.2785 0.2199 0.8528 -
2.5239 44000 0.2701 0.2181 0.8494 -
2.5526 44500 0.273 0.2195 0.8519 -
2.5813 45000 0.2763 0.2160 0.8540 -
2.6100 45500 0.2635 0.2140 0.8534 -
2.6387 46000 0.2722 0.2176 0.8529 -
2.6674 46500 0.2596 0.2136 0.8552 -
2.6960 47000 0.2627 0.2142 0.8524 -
2.7247 47500 0.2673 0.2174 0.8502 -
2.7534 48000 0.2582 0.2147 0.8510 -
2.7821 48500 0.256 0.2148 0.8514 -
2.8108 49000 0.2567 0.2122 0.8524 -
2.8394 49500 0.2502 0.2142 0.8526 -
2.8681 50000 0.2514 0.2132 0.8521 -
2.8968 50500 0.2603 0.2134 0.8496 -
2.9255 51000 0.2544 0.2131 0.8520 -
2.9542 51500 0.2526 0.2132 0.8511 -
2.9828 52000 0.2483 0.2131 0.8491 -
3.0115 52500 0.2361 0.2157 0.8522 -
3.0402 53000 0.2258 0.2122 0.8504 -
3.0689 53500 0.2248 0.2138 0.8498 -
3.0976 54000 0.2293 0.2164 0.8497 -
3.1263 54500 0.2328 0.2141 0.8501 -
3.1549 55000 0.2216 0.2109 0.8512 -
3.1836 55500 0.2256 0.2143 0.8529 -
3.2123 56000 0.2189 0.2126 0.8497 -
3.2410 56500 0.2264 0.2131 0.8492 -
3.2697 57000 0.2183 0.2129 0.8478 -
3.2983 57500 0.2155 0.2132 0.8515 -
3.3270 58000 0.2192 0.2133 0.8512 -
3.3557 58500 0.221 0.2132 0.8530 -
3.3844 59000 0.2139 0.2146 0.8522 -
3.4131 59500 0.2201 0.2139 0.8533 -
3.4417 60000 0.2058 0.2111 0.8505 -
3.4704 60500 0.2129 0.2116 0.8535 -
3.4991 61000 0.2137 0.2127 0.8513 -
3.5278 61500 0.2074 0.2129 0.8500 -
3.5565 62000 0.2157 0.2135 0.8506 -
3.5852 62500 0.2169 0.2153 0.8491 -
3.6138 63000 0.2052 0.2122 0.8527 -
3.6425 63500 0.2286 0.2144 0.8517 -
3.6712 64000 0.2105 0.2141 0.8491 -
3.6999 64500 0.2117 0.2147 0.8529 -
3.7286 65000 0.2117 0.2148 0.8510 -
3.7572 65500 0.2167 0.2158 0.8523 -
3.7859 66000 0.2176 0.2151 0.8506 -
3.8146 66500 0.2135 0.2131 0.8509 -
3.8433 67000 0.2148 0.2162 0.8532 -
3.8720 67500 0.2049 0.2128 0.8535 -
3.9006 68000 0.2133 0.2132 0.8511 -
3.9293 68500 0.2015 0.2177 0.8473 -
3.9580 69000 0.205 0.2165 0.8517 -
3.9867 69500 0.2032 0.2142 0.8548 -
4.0154 70000 0.1992 0.2173 0.8563 -
4.0441 70500 0.1851 0.2173 0.8541 -
4.0727 71000 0.1796 0.2154 0.8547 -
4.1014 71500 0.1836 0.2155 0.8544 -
4.1301 72000 0.181 0.2207 0.8513 -
4.1588 72500 0.1923 0.2198 0.8522 -
4.1875 73000 0.1865 0.2201 0.8533 -
4.2161 73500 0.1795 0.2139 0.8548 -
4.2448 74000 0.1813 0.2180 0.8512 -
4.2735 74500 0.1788 0.2147 0.8509 -
4.3022 75000 0.179 0.2133 0.8510 -
4.3309 75500 0.1796 0.2162 0.8519 -
4.3595 76000 0.1912 0.2185 0.8509 -
4.3882 76500 0.184 0.2162 0.8535 -
4.4169 77000 0.1827 0.2148 0.8535 -
4.4456 77500 0.1786 0.2146 0.8532 -
4.4743 78000 0.1826 0.2146 0.8534 -
4.5030 78500 0.1821 0.2165 0.8525 -
4.5316 79000 0.1781 0.2122 0.8524 -
4.5603 79500 0.1832 0.2147 0.8534 -
4.5890 80000 0.1812 0.2185 0.8534 -
4.6177 80500 0.1839 0.2163 0.8554 -
4.6464 81000 0.1834 0.2158 0.8542 -
4.6750 81500 0.1805 0.2171 0.8523 -
4.7037 82000 0.1818 0.2186 0.8536 -
4.7324 82500 0.1736 0.2176 0.8555 -
4.7611 83000 0.1754 0.2161 0.8525 -
4.7898 83500 0.1808 0.2187 0.8546 -
4.8184 84000 0.1794 0.2138 0.8543 -
4.8471 84500 0.1827 0.2147 0.8529 -
4.8758 85000 0.1773 0.2126 0.8535 -
4.9045 85500 0.1806 0.2124 0.8539 -
4.9332 86000 0.1843 0.2131 0.8490 -
4.9619 86500 0.1738 0.2120 0.8548 -
4.9905 87000 0.174 0.2120 0.8541 -
5.0192 87500 0.1644 0.2143 0.8533 -
5.0479 88000 0.1607 0.2132 0.8529 -
5.0766 88500 0.1539 0.2170 0.8504 -
5.1053 89000 0.1581 0.2123 0.8496 -
5.1339 89500 0.1563 0.2114 0.8519 -
5.1626 90000 0.158 0.2123 0.8494 -
5.1913 90500 0.1653 0.2147 0.8513 -
5.2200 91000 0.158 0.2170 0.8488 -
5.2487 91500 0.1559 0.2144 0.8489 -
5.2773 92000 0.165 0.2123 0.8498 -
5.3060 92500 0.15 0.2128 0.8497 -
5.3347 93000 0.1603 0.2126 0.8501 -
5.3634 93500 0.1584 0.2119 0.8490 -
5.3921 94000 0.1626 0.2121 0.8513 -
5.4208 94500 0.1585 0.2128 0.8505 -
5.4494 95000 0.1593 0.2099 0.8518 -
5.4781 95500 0.1546 0.2109 0.8482 -
5.5068 96000 0.1554 0.2115 0.8509 -
5.5355 96500 0.1628 0.2099 0.8507 -
5.5642 97000 0.1558 0.2124 0.8514 -
5.5928 97500 0.1556 0.2113 0.8510 -
5.6215 98000 0.1511 0.2091 0.8527 -
5.6502 98500 0.1545 0.2110 0.8518 -
5.6789 99000 0.1548 0.2113 0.8513 -
5.7076 99500 0.1556 0.2119 0.8516 -
5.7362 100000 0.1638 0.2113 0.8497 -
5.7649 100500 0.1516 0.2116 0.8500 -
5.7936 101000 0.1518 0.2124 0.8487 -
5.8223 101500 0.1562 0.2136 0.8485 -
5.8510 102000 0.1566 0.2132 0.8484 -
5.8797 102500 0.1517 0.2123 0.8487 -
5.9083 103000 0.1618 0.2114 0.8478 -
5.9370 103500 0.1531 0.2105 0.8487 -
5.9657 104000 0.1588 0.2106 0.8492 -
5.9944 104500 0.1563 0.2115 0.8486 -
6.0231 105000 0.151 0.2115 0.8505 -
6.0517 105500 0.1414 0.2111 0.8503 -
6.0804 106000 0.1316 0.2119 0.8491 -
6.1091 106500 0.1431 0.2108 0.8501 -
6.1378 107000 0.1336 0.2121 0.8508 -
6.1665 107500 0.1367 0.2111 0.8500 -
6.1951 108000 0.1397 0.2134 0.8496 -
6.2238 108500 0.1463 0.2128 0.8508 -
6.2525 109000 0.1416 0.2133 0.8525 -
6.2812 109500 0.147 0.2153 0.8504 -
6.3099 110000 0.1398 0.2133 0.8509 -
6.3386 110500 0.1451 0.2117 0.8534 -
6.3672 111000 0.1279 0.2129 0.8500 -
6.3959 111500 0.1309 0.2152 0.8484 -
6.4246 112000 0.1353 0.2128 0.8514 -
6.4533 112500 0.1355 0.2124 0.8501 -
6.4820 113000 0.1377 0.2141 0.8498 -
6.5106 113500 0.1376 0.2154 0.8522 -
6.5393 114000 0.1375 0.2148 0.8518 -
6.5680 114500 0.1348 0.2160 0.8507 -
6.5967 115000 0.1416 0.2139 0.8511 -
6.6254 115500 0.1396 0.2136 0.8503 -
6.6540 116000 0.1419 0.2130 0.8519 -
6.6827 116500 0.1435 0.2131 0.8527 -
6.7114 117000 0.1318 0.2128 0.8526 -
6.7401 117500 0.1362 0.2136 0.8530 -
6.7688 118000 0.1374 0.2125 0.8515 -
6.7975 118500 0.1384 0.2116 0.8528 -
6.8261 119000 0.1419 0.2122 0.8514 -
6.8548 119500 0.1303 0.2134 0.8506 -
6.8835 120000 0.1341 0.2125 0.8531 -
6.9122 120500 0.1342 0.2118 0.8512 -
6.9409 121000 0.1419 0.2107 0.8504 -
6.9695 121500 0.138 0.2108 0.8510 -
6.9982 122000 0.1318 0.2126 0.8525 -
7.0269 122500 0.1294 0.2138 0.8491 -
7.0556 123000 0.1355 0.2137 0.8504 -
7.0843 123500 0.1222 0.2148 0.8506 -
7.1129 124000 0.1231 0.2137 0.8530 -
7.1416 124500 0.1303 0.2142 0.8530 -
7.1703 125000 0.1281 0.2158 0.8515 -
7.1990 125500 0.1222 0.2134 0.8508 -
7.2277 126000 0.1316 0.2157 0.8512 -
7.2564 126500 0.1222 0.2151 0.8512 -
7.2850 127000 0.125 0.2147 0.8502 -
7.3137 127500 0.1271 0.2146 0.8508 -
7.3424 128000 0.1259 0.2167 0.8508 -
7.3711 128500 0.1318 0.2151 0.8525 -
7.3998 129000 0.1259 0.2182 0.8511 -
7.4284 129500 0.1249 0.2137 0.8541 -
7.4571 130000 0.1278 0.2177 0.8500 -
7.4858 130500 0.1246 0.2130 0.8507 -
7.5145 131000 0.1308 0.2107 0.8542 -
7.5432 131500 0.124 0.2107 0.8527 -
7.5718 132000 0.1196 0.2141 0.8505 -
7.6005 132500 0.1246 0.2116 0.8516 -
7.6292 133000 0.1269 0.2101 0.8534 -
7.6579 133500 0.1239 0.2110 0.8522 -
7.6866 134000 0.1344 0.2117 0.8512 -
7.7153 134500 0.129 0.2110 0.8515 -
7.7439 135000 0.1248 0.2113 0.8503 -
7.7726 135500 0.1261 0.2121 0.8501 -
7.8013 136000 0.1223 0.2110 0.8486 -
7.8300 136500 0.1236 0.2091 0.8494 -
7.8587 137000 0.1211 0.2084 0.8498 -
7.8873 137500 0.1187 0.2112 0.8473 -
7.9160 138000 0.1242 0.2090 0.8510 -
7.9447 138500 0.1206 0.2096 0.8503 -
7.9734 139000 0.1187 0.2125 0.8523 -
8.0021 139500 0.1242 0.2105 0.8497 -
8.0307 140000 0.1128 0.2134 0.8518 -
8.0594 140500 0.1188 0.2121 0.8509 -
8.0881 141000 0.1151 0.2133 0.8510 -
8.1168 141500 0.1213 0.2113 0.8508 -
8.1455 142000 0.1149 0.2126 0.8502 -
8.1742 142500 0.1126 0.2132 0.8528 -
8.2028 143000 0.1158 0.2129 0.8516 -
8.2315 143500 0.1188 0.2118 0.8510 -
8.2602 144000 0.1219 0.2115 0.8519 -
8.2889 144500 0.1184 0.2106 0.8522 -
8.3176 145000 0.1156 0.2102 0.8518 -
8.3462 145500 0.1125 0.2118 0.8513 -
8.3749 146000 0.1128 0.2103 0.8510 -
8.4036 146500 0.1118 0.2096 0.8502 -
8.4323 147000 0.1158 0.2080 0.8493 -
8.4610 147500 0.1172 0.2100 0.8477 -
8.4896 148000 0.1182 0.2153 0.8465 -
8.5183 148500 0.1173 0.2124 0.8483 -
8.5470 149000 0.1153 0.2118 0.8496 -
8.5757 149500 0.1172 0.2116 0.8487 -
8.6044 150000 0.114 0.2094 0.8516 -
8.6331 150500 0.1188 0.2117 0.8497 -
8.6617 151000 0.116 0.2128 0.8503 -
8.6904 151500 0.1152 0.2118 0.8505 -
8.7191 152000 0.1148 0.2147 0.8511 -
8.7478 152500 0.1136 0.2109 0.8514 -
8.7765 153000 0.1096 0.2104 0.8503 -
8.8051 153500 0.1151 0.2102 0.8509 -
8.8338 154000 0.1184 0.2129 0.8504 -
8.8625 154500 0.1156 0.2143 0.8503 -
8.8912 155000 0.1138 0.2121 0.8530 -
8.9199 155500 0.1144 0.2124 0.8527 -
8.9485 156000 0.1189 0.2136 0.8511 -
8.9772 156500 0.116 0.2139 0.8501 -
9.0059 157000 0.1132 0.2139 0.8495 -
9.0346 157500 0.1063 0.2135 0.8515 -
9.0633 158000 0.1069 0.2142 0.8501 -
9.0920 158500 0.0992 0.2163 0.8497 -
9.1206 159000 0.0995 0.2133 0.8534 -
9.1493 159500 0.1047 0.2144 0.8509 -
9.1780 160000 0.1074 0.2137 0.8516 -
9.2067 160500 0.1084 0.2169 0.8498 -
9.2354 161000 0.1081 0.2143 0.8505 -
9.2640 161500 0.1048 0.2141 0.8520 -
9.2927 162000 0.1055 0.2162 0.8489 -
9.3214 162500 0.101 0.2163 0.8485 -
9.3501 163000 0.1036 0.2153 0.8481 -
9.3788 163500 0.1057 0.2153 0.8489 -
9.4074 164000 0.1075 0.2150 0.8493 -
9.4361 164500 0.1055 0.2159 0.8500 -
9.4648 165000 0.1043 0.2152 0.8503 -
9.4935 165500 0.1072 0.2161 0.8513 -
9.5222 166000 0.1041 0.2154 0.8505 -
9.5509 166500 0.1064 0.2165 0.8505 -
9.5795 167000 0.1062 0.2168 0.8511 -
9.6082 167500 0.1046 0.2163 0.8494 -
9.6369 168000 0.1064 0.2167 0.8492 -
9.6656 168500 0.1017 0.2167 0.8499 -
9.6943 169000 0.1015 0.2143 0.8485 -
9.7229 169500 0.1075 0.2163 0.8477 -
9.7516 170000 0.1032 0.2176 0.8463 -
9.7803 170500 0.1115 0.2158 0.8471 -
9.8090 171000 0.1073 0.2127 0.8475 -
9.8377 171500 0.1056 0.2145 0.8481 -
9.8663 172000 0.108 0.2150 0.8488 -
9.8950 172500 0.1089 0.2142 0.8482 -
9.9237 173000 0.1037 0.2152 0.8475 -
9.9524 173500 0.1062 0.2137 0.8474 -
9.9811 174000 0.1058 0.2159 0.8482 -
10.0098 174500 0.101 0.2132 0.8481 -
10.0384 175000 0.1005 0.2152 0.8505 -
10.0671 175500 0.0963 0.2166 0.8475 -
10.0958 176000 0.1014 0.2150 0.8471 -
10.1245 176500 0.1012 0.2165 0.8467 -
10.1532 177000 0.1041 0.2144 0.8475 -
10.1818 177500 0.0982 0.2129 0.8483 -
10.2105 178000 0.098 0.2175 0.8463 -
10.2392 178500 0.0984 0.2127 0.8492 -
10.2679 179000 0.1 0.2148 0.8464 -
10.2966 179500 0.0985 0.2125 0.8468 -
10.3252 180000 0.1032 0.2102 0.8480 -
10.3539 180500 0.1019 0.2155 0.8455 -
10.3826 181000 0.1025 0.2105 0.8485 -
10.4113 181500 0.0987 0.2144 0.8471 -
10.4400 182000 0.1016 0.2142 0.8453 -
10.4687 182500 0.0981 0.2154 0.8466 -
10.4973 183000 0.0971 0.2150 0.8463 -
10.5260 183500 0.098 0.2136 0.8467 -
10.5547 184000 0.0995 0.2150 0.8469 -
10.5834 184500 0.0993 0.2134 0.8491 -
10.6121 185000 0.0983 0.2128 0.8483 -
10.6407 185500 0.1033 0.2143 0.8475 -
10.6694 186000 0.094 0.2138 0.8484 -
10.6981 186500 0.1026 0.2136 0.8477 -
10.7268 187000 0.1012 0.2140 0.8486 -
10.7555 187500 0.0926 0.2161 0.8481 -
10.7841 188000 0.1041 0.2134 0.8482 -
10.8128 188500 0.094 0.2150 0.8471 -
10.8415 189000 0.104 0.2157 0.8467 -
10.8702 189500 0.1015 0.2139 0.8472 -
10.8989 190000 0.0942 0.2173 0.8473 -
10.9276 190500 0.1002 0.2168 0.8471 -
10.9562 191000 0.1038 0.2169 0.8472 -
10.9849 191500 0.1026 0.2157 0.8463 -
11.0136 192000 0.0975 0.2161 0.8471 -
11.0423 192500 0.0918 0.2146 0.8476 -
11.0710 193000 0.0962 0.2172 0.8469 -
11.0996 193500 0.0928 0.2172 0.8472 -
11.1283 194000 0.0936 0.2165 0.8478 -
11.1570 194500 0.0875 0.2191 0.8472 -
11.1857 195000 0.0997 0.2190 0.8478 -
11.2144 195500 0.0937 0.2215 0.8455 -
11.2430 196000 0.0971 0.2168 0.8458 -
11.2717 196500 0.0963 0.2170 0.8456 -
11.3004 197000 0.0922 0.2183 0.8463 -
11.3291 197500 0.0946 0.2175 0.8448 -
11.3578 198000 0.0976 0.2172 0.8445 -
11.3865 198500 0.0918 0.2171 0.8457 -
11.4151 199000 0.1029 0.2165 0.8459 -
11.4438 199500 0.0949 0.2154 0.8475 -
11.4725 200000 0.0937 0.2172 0.8446 -
11.5012 200500 0.096 0.2181 0.8459 -
11.5299 201000 0.0957 0.2190 0.8451 -
11.5585 201500 0.0988 0.2164 0.8455 -
11.5872 202000 0.0966 0.2166 0.8443 -
11.6159 202500 0.0922 0.2168 0.8440 -
11.6446 203000 0.0914 0.2167 0.8452 -
11.6733 203500 0.0935 0.2153 0.8455 -
11.7019 204000 0.0946 0.2161 0.8455 -
11.7306 204500 0.0969 0.2159 0.8465 -
11.7593 205000 0.0956 0.2166 0.8448 -
11.7880 205500 0.0892 0.2150 0.8455 -
11.8167 206000 0.0919 0.2162 0.8459 -
11.8454 206500 0.0975 0.2162 0.8464 -
11.8740 207000 0.0925 0.2169 0.8454 -
11.9027 207500 0.0883 0.2169 0.8459 -
11.9314 208000 0.0957 0.2160 0.8468 -
11.9601 208500 0.0941 0.2162 0.8471 -
11.9888 209000 0.0924 0.2175 0.8465 -
12.0174 209500 0.0895 0.2159 0.8469 -
12.0461 210000 0.0877 0.2168 0.8457 -
12.0748 210500 0.0908 0.2162 0.8460 -
12.1035 211000 0.0907 0.2174 0.8461 -
12.1322 211500 0.0893 0.2179 0.8450 -
12.1608 212000 0.0868 0.2180 0.8452 -
12.1895 212500 0.0924 0.2180 0.8466 -
12.2182 213000 0.0888 0.2167 0.8462 -
12.2469 213500 0.0846 0.2167 0.8452 -
12.2756 214000 0.0921 0.2166 0.8458 -
12.3043 214500 0.0854 0.2176 0.8456 -
12.3329 215000 0.0877 0.2154 0.8452 -
12.3616 215500 0.0933 0.2162 0.8455 -
12.3903 216000 0.0849 0.2191 0.8449 -
12.4190 216500 0.0889 0.2195 0.8437 -
12.4477 217000 0.0886 0.2183 0.8449 -
12.4763 217500 0.0907 0.2175 0.8464 -
12.5050 218000 0.0915 0.2171 0.8458 -
12.5337 218500 0.0908 0.2180 0.8465 -
12.5624 219000 0.0863 0.2192 0.8450 -
12.5911 219500 0.086 0.2190 0.8460 -
12.6197 220000 0.0944 0.2190 0.8462 -
12.6484 220500 0.0858 0.2186 0.8459 -
12.6771 221000 0.0918 0.2176 0.8466 -
12.7058 221500 0.0934 0.2185 0.8468 -
12.7345 222000 0.0903 0.2182 0.8472 -
12.7632 222500 0.0858 0.2179 0.8467 -
12.7918 223000 0.0941 0.2188 0.8461 -
12.8205 223500 0.0867 0.2170 0.8464 -
12.8492 224000 0.0881 0.2173 0.8471 -
12.8779 224500 0.0869 0.2185 0.8464 -
12.9066 225000 0.0933 0.2181 0.8467 -
12.9352 225500 0.0923 0.2177 0.8464 -
12.9639 226000 0.0887 0.2175 0.8471 -
12.9926 226500 0.0958 0.2180 0.8472 -
13.0213 227000 0.085 0.2181 0.8463 -
13.0500 227500 0.0818 0.2169 0.8467 -
13.0786 228000 0.0876 0.2186 0.8459 -
13.1073 228500 0.0913 0.2190 0.8453 -
13.1360 229000 0.0853 0.2193 0.8454 -
13.1647 229500 0.0886 0.2199 0.8463 -
13.1934 230000 0.085 0.2202 0.8465 -
13.2221 230500 0.0879 0.2222 0.8453 -
13.2507 231000 0.0853 0.2208 0.8457 -
13.2794 231500 0.0831 0.2192 0.8459 -
13.3081 232000 0.0865 0.2202 0.8455 -
13.3368 232500 0.091 0.2209 0.8449 -
13.3655 233000 0.0848 0.2193 0.8454 -
13.3941 233500 0.0873 0.2192 0.8453 -
13.4228 234000 0.0813 0.2197 0.8449 -
13.4515 234500 0.0883 0.2205 0.8447 -
13.4802 235000 0.0858 0.2193 0.8463 -
13.5089 235500 0.0902 0.2197 0.8466 -
13.5375 236000 0.0837 0.2185 0.8469 -
13.5662 236500 0.0922 0.2201 0.8462 -
13.5949 237000 0.0876 0.2197 0.8463 -
13.6236 237500 0.0839 0.2191 0.8458 -
13.6523 238000 0.0878 0.2197 0.8454 -
13.6809 238500 0.0874 0.2197 0.8451 -
13.7096 239000 0.0848 0.2198 0.8457 -
13.7383 239500 0.0842 0.2185 0.8459 -
13.7670 240000 0.0827 0.2184 0.8463 -
13.7957 240500 0.0885 0.2176 0.8458 -
13.8244 241000 0.0872 0.2180 0.8462 -
13.8530 241500 0.0856 0.2180 0.8468 -
13.8817 242000 0.0887 0.2184 0.8459 -
13.9104 242500 0.0875 0.2187 0.8461 -
13.9391 243000 0.0857 0.2195 0.8460 -
13.9678 243500 0.0845 0.2188 0.8467 -
13.9964 244000 0.0896 0.2184 0.8463 -
14.0251 244500 0.0818 0.2189 0.8467 -
14.0538 245000 0.09 0.2194 0.8460 -
14.0825 245500 0.0842 0.2190 0.8456 -
14.1112 246000 0.0878 0.2190 0.8460 -
14.1398 246500 0.0838 0.2195 0.8462 -
14.1685 247000 0.0781 0.2201 0.8460 -
14.1972 247500 0.0847 0.2193 0.8466 -
14.2259 248000 0.0881 0.2188 0.8470 -
14.2546 248500 0.082 0.2184 0.8473 -
14.2833 249000 0.0886 0.2191 0.8469 -
14.3119 249500 0.0874 0.2195 0.8470 -
14.3406 250000 0.0833 0.2197 0.8465 -
14.3693 250500 0.0856 0.2197 0.8461 -
14.3980 251000 0.0834 0.2198 0.8464 -
14.4267 251500 0.0852 0.2199 0.8461 -
14.4553 252000 0.0853 0.2201 0.8456 -
14.4840 252500 0.0811 0.2197 0.8461 -
14.5127 253000 0.0778 0.2195 0.8464 -
14.5414 253500 0.0837 0.2200 0.8462 -
14.5701 254000 0.0835 0.2203 0.8459 -
14.5987 254500 0.0854 0.2199 0.8462 -
14.6274 255000 0.0877 0.2196 0.8464 -
14.6561 255500 0.0826 0.2198 0.8463 -
14.6848 256000 0.0894 0.2197 0.8463 -
14.7135 256500 0.0873 0.2199 0.8462 -
14.7422 257000 0.0818 0.2197 0.8462 -
14.7708 257500 0.0854 0.2196 0.8464 -
14.7995 258000 0.0823 0.2195 0.8464 -
14.8282 258500 0.0769 0.2194 0.8465 -
14.8569 259000 0.0842 0.2195 0.8465 -
14.8856 259500 0.0848 0.2195 0.8464 -
14.9142 260000 0.0839 0.2196 0.8464 -
14.9429 260500 0.089 0.2196 0.8464 -
14.9716 261000 0.0881 0.2197 0.8464 -
-1 -1 - - - 0.8431

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}