thxCode committed on
Commit 2e8b42c · 0 Parent(s)

feat: first commit


Signed-off-by: thxCode <thxcode0824@gmail.com>

.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,397 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-classification
+ tags:
+ - transformers
+ - sentence-transformers
+ - text-embeddings-inference
+ language:
+ - multilingual
+ ---
+
+ # bge-reranker-v2-m3-GGUF
+
+ **Model creator**: [BAAI](https://huggingface.co/BAAI)
+ **Original model**: [bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)
+ **GGUF quantization**: based on llama.cpp release [f4d2b](https://github.com/ggerganov/llama.cpp/commit/f4d2b8846a6b34419ff9e9491aee6cd95e444bfc)
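+
+ As a minimal, unverified sketch of how these GGUF files might be served locally: the example below assumes a llama.cpp build recent enough to include reranking support in `llama-server` (the `--reranking` flag and the `/v1/rerank` endpoint are assumptions, not confirmed against the pinned release above; names may differ between builds), and the choice of the Q4_K_M file is only illustrative.
+
+ ```shell
+ # Serve one of the quantized files from this repo (reranking support assumed).
+ llama-server -m bge-reranker-v2-m3-Q4_K_M.gguf --reranking --port 8080
+
+ # Score candidate documents against a query.
+ curl http://localhost:8080/v1/rerank -H "Content-Type: application/json" -d '{
+     "query": "what is panda?",
+     "documents": ["hi", "The giant panda is a bear species endemic to China."]
+ }'
+ ```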
+
+ ---
+
+ # Reranker
+
+ **For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/tree/master).**
+
+ - [Model List](#model-list)
+ - [Usage](#usage)
+ - [Fine-tuning](#fine-tune)
+ - [Evaluation](#evaluation)
+ - [Citation](#citation)
+
+ Unlike an embedding model, a reranker takes a query and a document as input and directly outputs a similarity score instead of an embedding.
+ You can get a relevance score by feeding a query and a passage to the reranker.
+ The score can be mapped to a float value in [0, 1] with a sigmoid function.
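+
+ As an illustration (a minimal sketch, not part of the original card), the mapping is just the logistic function, and it reproduces the `normalize=True` example shown later in this card:
+
+ ```python
+ import math
+
+ def sigmoid(score: float) -> float:
+     # Map a raw reranker score to (0, 1); this is what normalize=True applies.
+     return 1.0 / (1.0 + math.exp(-score))
+
+ print(sigmoid(-5.65234375))  # ~0.003497, matching the FlagReranker example below
+ ```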
+
+
+ ## Model List
+
+ | Model | Base model | Language | Layerwise | Feature |
+ |:--------------------------------------------------------------------------|:--------:|:-----------------------------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:|
+ | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) | Chinese and English | - | Lightweight reranker model, easy to deploy, with fast inference. |
+ | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) | Chinese and English | - | Lightweight reranker model, easy to deploy, with fast inference. |
+ | [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3) | [bge-m3](https://huggingface.co/BAAI/bge-m3) | Multilingual | - | Lightweight reranker model with strong multilingual capabilities, easy to deploy, with fast inference. |
+ | [BAAI/bge-reranker-v2-gemma](https://huggingface.co/BAAI/bge-reranker-v2-gemma) | [gemma-2b](https://huggingface.co/google/gemma-2b) | Multilingual | - | Suitable for multilingual contexts; performs well in both English proficiency and multilingual capabilities. |
+ | [BAAI/bge-reranker-v2-minicpm-layerwise](https://huggingface.co/BAAI/bge-reranker-v2-minicpm-layerwise) | [MiniCPM-2B-dpo-bf16](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16) | Multilingual | 8-40 | Suitable for multilingual contexts; performs well in both English and Chinese; lets you choose which layers produce the output, enabling accelerated inference. |
+
+
+ You can select a model according to your scenario and resources:
+ - For **multilingual** use, choose [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3) or [BAAI/bge-reranker-v2-gemma](https://huggingface.co/BAAI/bge-reranker-v2-gemma).
+
+ - For **Chinese or English**, choose [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3) or [BAAI/bge-reranker-v2-minicpm-layerwise](https://huggingface.co/BAAI/bge-reranker-v2-minicpm-layerwise).
+
+ - For **efficiency**, choose [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3) or the lower layers of [BAAI/bge-reranker-v2-minicpm-layerwise](https://huggingface.co/BAAI/bge-reranker-v2-minicpm-layerwise).
+
+ - For **better performance**, we recommend [BAAI/bge-reranker-v2-minicpm-layerwise](https://huggingface.co/BAAI/bge-reranker-v2-minicpm-layerwise) or [BAAI/bge-reranker-v2-gemma](https://huggingface.co/BAAI/bge-reranker-v2-gemma).
+
+ ## Usage
+ ### Using FlagEmbedding
+
+ ```shell
+ pip install -U FlagEmbedding
+ ```
+
+ #### For normal reranker (bge-reranker-base / bge-reranker-large / bge-reranker-v2-m3)
+
+ Get relevance scores (higher scores indicate more relevance):
+
+ ```python
+ from FlagEmbedding import FlagReranker
+ reranker = FlagReranker('BAAI/bge-reranker-v2-m3', use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation
+
+ score = reranker.compute_score(['query', 'passage'])
+ print(score)  # -5.65234375
+
+ # You can map the scores into 0-1 by setting "normalize=True", which applies the sigmoid function to the score
+ score = reranker.compute_score(['query', 'passage'], normalize=True)
+ print(score)  # 0.003497010252573502
+
+ scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']])
+ print(scores)  # [-8.1875, 5.26171875]
+
+ # You can map the scores into 0-1 by setting "normalize=True", which applies the sigmoid function to the scores
+ scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']], normalize=True)
+ print(scores)  # [0.00027803096387751553, 0.9948403768236574]
+ ```
+
+ #### For LLM-based reranker
+
+ ```python
+ from FlagEmbedding import FlagLLMReranker
+ reranker = FlagLLMReranker('BAAI/bge-reranker-v2-gemma', use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation
+ # reranker = FlagLLMReranker('BAAI/bge-reranker-v2-gemma', use_bf16=True)  # You can also set use_bf16=True to speed up computation with a slight performance degradation
+
+ score = reranker.compute_score(['query', 'passage'])
+ print(score)
+
+ scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']])
+ print(scores)
+ ```
+
+ #### For LLM-based layerwise reranker
+
+ ```python
+ from FlagEmbedding import LayerWiseFlagLLMReranker
+ reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise', use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation
+ # reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise', use_bf16=True)  # You can also set use_bf16=True to speed up computation with a slight performance degradation
+
+ score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28])  # Adjust 'cutoff_layers' to pick which layers are used for computing the score
+ print(score)
+
+ scores = reranker.compute_score([['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']], cutoff_layers=[28])
+ print(scores)
+ ```
+
+ ### Using Hugging Face Transformers
+
+ #### For normal reranker (bge-reranker-base / bge-reranker-large / bge-reranker-v2-m3)
+
+ Get relevance scores (higher scores indicate more relevance):
+
+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-v2-m3')
+ model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-v2-m3')
+ model.eval()
+
+ pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
+ with torch.no_grad():
+     inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
+     scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
+     print(scores)
+ ```
+
+ #### For LLM-based reranker
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ def get_inputs(pairs, tokenizer, prompt=None, max_length=1024):
+     # Build "A: {query}\nB: {passage}\n{prompt}" inputs for the instruction-style reranker.
+     if prompt is None:
+         prompt = "Given a query A and a passage B, determine whether the passage contains an answer to the query by providing a prediction of either 'Yes' or 'No'."
+     sep = "\n"
+     prompt_inputs = tokenizer(prompt,
+                               return_tensors=None,
+                               add_special_tokens=False)['input_ids']
+     sep_inputs = tokenizer(sep,
+                            return_tensors=None,
+                            add_special_tokens=False)['input_ids']
+     inputs = []
+     for query, passage in pairs:
+         query_inputs = tokenizer(f'A: {query}',
+                                  return_tensors=None,
+                                  add_special_tokens=False,
+                                  max_length=max_length * 3 // 4,
+                                  truncation=True)
+         passage_inputs = tokenizer(f'B: {passage}',
+                                    return_tensors=None,
+                                    add_special_tokens=False,
+                                    max_length=max_length,
+                                    truncation=True)
+         item = tokenizer.prepare_for_model(
+             [tokenizer.bos_token_id] + query_inputs['input_ids'],
+             sep_inputs + passage_inputs['input_ids'],
+             truncation='only_second',
+             max_length=max_length,
+             padding=False,
+             return_attention_mask=False,
+             return_token_type_ids=False,
+             add_special_tokens=False
+         )
+         item['input_ids'] = item['input_ids'] + sep_inputs + prompt_inputs
+         item['attention_mask'] = [1] * len(item['input_ids'])
+         inputs.append(item)
+     return tokenizer.pad(
+         inputs,
+         padding=True,
+         max_length=max_length + len(sep_inputs) + len(prompt_inputs),
+         pad_to_multiple_of=8,
+         return_tensors='pt',
+     )
+
+ tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-v2-gemma')
+ model = AutoModelForCausalLM.from_pretrained('BAAI/bge-reranker-v2-gemma')
+ yes_loc = tokenizer('Yes', add_special_tokens=False)['input_ids'][0]
+ model.eval()
+
+ pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
+ with torch.no_grad():
+     inputs = get_inputs(pairs, tokenizer)
+     # The relevance score is the logit of the 'Yes' token at the last position.
+     scores = model(**inputs, return_dict=True).logits[:, -1, yes_loc].view(-1, ).float()
+     print(scores)
+ ```
+
+ #### For LLM-based layerwise reranker
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ def get_inputs(pairs, tokenizer, prompt=None, max_length=1024):
+     # Build "A: {query}\nB: {passage}\n{prompt}" inputs for the instruction-style reranker.
+     if prompt is None:
+         prompt = "Given a query A and a passage B, determine whether the passage contains an answer to the query by providing a prediction of either 'Yes' or 'No'."
+     sep = "\n"
+     prompt_inputs = tokenizer(prompt,
+                               return_tensors=None,
+                               add_special_tokens=False)['input_ids']
+     sep_inputs = tokenizer(sep,
+                            return_tensors=None,
+                            add_special_tokens=False)['input_ids']
+     inputs = []
+     for query, passage in pairs:
+         query_inputs = tokenizer(f'A: {query}',
+                                  return_tensors=None,
+                                  add_special_tokens=False,
+                                  max_length=max_length * 3 // 4,
+                                  truncation=True)
+         passage_inputs = tokenizer(f'B: {passage}',
+                                    return_tensors=None,
+                                    add_special_tokens=False,
+                                    max_length=max_length,
+                                    truncation=True)
+         item = tokenizer.prepare_for_model(
+             [tokenizer.bos_token_id] + query_inputs['input_ids'],
+             sep_inputs + passage_inputs['input_ids'],
+             truncation='only_second',
+             max_length=max_length,
+             padding=False,
+             return_attention_mask=False,
+             return_token_type_ids=False,
+             add_special_tokens=False
+         )
+         item['input_ids'] = item['input_ids'] + sep_inputs + prompt_inputs
+         item['attention_mask'] = [1] * len(item['input_ids'])
+         inputs.append(item)
+     return tokenizer.pad(
+         inputs,
+         padding=True,
+         max_length=max_length + len(sep_inputs) + len(prompt_inputs),
+         pad_to_multiple_of=8,
+         return_tensors='pt',
+     )
+
+ tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-v2-minicpm-layerwise', trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained('BAAI/bge-reranker-v2-minicpm-layerwise', trust_remote_code=True, torch_dtype=torch.bfloat16)
+ model = model.to('cuda')
+ model.eval()
+
+ pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
+ with torch.no_grad():
+     inputs = get_inputs(pairs, tokenizer).to(model.device)
+     # One score tensor per requested cutoff layer; here only layer 28 is used.
+     all_scores = model(**inputs, return_dict=True, cutoff_layers=[28])
+     all_scores = [scores[:, -1].view(-1, ).float() for scores in all_scores[0]]
+     print(all_scores)
+ ```
+
+ ## Fine-tune
+
+ ### Data Format
+
+ Training data should be a JSON Lines file, where each line is a dict like this:
+
+ ```
+ {"query": str, "pos": List[str], "neg": List[str], "prompt": str}
+ ```
+
+ `query` is the query, `pos` is a list of positive texts, `neg` is a list of negative texts, and `prompt` indicates the relationship between the query and the texts. If you have no negative texts for a query, you can randomly sample some from the entire corpus as negatives.
+
+ See [toy_finetune_data.jsonl](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_reranker/toy_finetune_data.jsonl) for a toy data file.
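+
+ For illustration only (a minimal sketch; the corpus and field values are hypothetical, and the prompt string is reused from the examples above), this is one way to build such a record while randomly sampling negatives from a corpus:
+
+ ```python
+ import json
+ import random
+
+ corpus = ["passage one", "passage two", "passage three", "passage four"]
+
+ record = {
+     "query": "what is panda?",
+     "pos": ["The giant panda is a bear species endemic to China."],
+     # No labeled negatives for this query, so randomly sample some from the corpus.
+     "neg": random.sample(corpus, k=3),
+     "prompt": "Given a query A and a passage B, determine whether the passage contains an answer to the query by providing a prediction of either 'Yes' or 'No'.",
+ }
+
+ with open("toy_finetune_data.jsonl", "a", encoding="utf-8") as f:
+     f.write(json.dumps(record) + "\n")
+ ```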
+
+ ### Train
+
+ You can fine-tune the reranker with the following code:
+
+ **For llm-based reranker**
+
+ ```shell
+ torchrun --nproc_per_node {number of gpus} \
+ -m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
+ --output_dir {path to save model} \
+ --model_name_or_path google/gemma-2b \
+ --train_data ./toy_finetune_data.jsonl \
+ --learning_rate 2e-4 \
+ --num_train_epochs 1 \
+ --per_device_train_batch_size 1 \
+ --gradient_accumulation_steps 16 \
+ --dataloader_drop_last True \
+ --query_max_len 512 \
+ --passage_max_len 512 \
+ --train_group_size 16 \
+ --logging_steps 1 \
+ --save_steps 2000 \
+ --save_total_limit 50 \
+ --ddp_find_unused_parameters False \
+ --gradient_checkpointing \
+ --deepspeed stage1.json \
+ --warmup_ratio 0.1 \
+ --bf16 \
+ --use_lora True \
+ --lora_rank 32 \
+ --lora_alpha 64 \
+ --use_flash_attn True \
+ --target_modules q_proj k_proj v_proj o_proj
+ ```
+
+ **For llm-based layerwise reranker**
+
+ ```shell
+ torchrun --nproc_per_node {number of gpus} \
+ -m FlagEmbedding.llm_reranker.finetune_for_layerwise.run \
+ --output_dir {path to save model} \
+ --model_name_or_path openbmb/MiniCPM-2B-dpo-bf16 \
+ --train_data ./toy_finetune_data.jsonl \
+ --learning_rate 2e-4 \
+ --num_train_epochs 1 \
+ --per_device_train_batch_size 1 \
+ --gradient_accumulation_steps 16 \
+ --dataloader_drop_last True \
+ --query_max_len 512 \
+ --passage_max_len 512 \
+ --train_group_size 16 \
+ --logging_steps 1 \
+ --save_steps 2000 \
+ --save_total_limit 50 \
+ --ddp_find_unused_parameters False \
+ --gradient_checkpointing \
+ --deepspeed stage1.json \
+ --warmup_ratio 0.1 \
+ --bf16 \
+ --use_lora True \
+ --lora_rank 32 \
+ --lora_alpha 64 \
+ --use_flash_attn True \
+ --target_modules q_proj k_proj v_proj o_proj \
+ --start_layer 8 \
+ --head_multi True \
+ --head_type simple \
+ --lora_extra_parameters linear_head
+ ```
+
+ Our rerankers are initialized from [google/gemma-2b](https://huggingface.co/google/gemma-2b) (for the llm-based reranker) and [openbmb/MiniCPM-2B-dpo-bf16](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16) (for the llm-based layerwise reranker), and we train them on a mixture of multilingual datasets:
+
+ - [bge-m3-data](https://huggingface.co/datasets/Shitao/bge-m3-data)
+ - [quora train data](https://huggingface.co/datasets/quora)
+ - [fever train data](https://fever.ai/dataset/fever.html)
+
+ ## Evaluation
+
+ - llama-index.
+
+ ![image-20240317193909373](./assets/llama-index.png)
+
+ - BEIR.
+
+ Reranks the top 100 results from bge-en-v1.5 large.
+
+ ![image-20240317174633333](./assets/BEIR-bge-en-v1.5.png)
+
+ Reranks the top 100 results from e5 mistral 7b instruct.
+
+ ![image-20240317172949713](./assets/BEIR-e5-mistral.png)
+
+ - CMTEB-retrieval.
+ It reranks the top 100 results from bge-zh-v1.5 large.
+
+ ![image-20240317173026235](./assets/CMTEB-retrieval-bge-zh-v1.5.png)
+
+ - miracl (multi-language).
+ It reranks the top 100 results from bge-m3.
+
+ ![image-20240317173117639](./assets/miracl-bge-m3.png)
+
+ ## Citation
+
+ If you find this repository useful, please consider giving it a star and a citation.
+
+ ```bibtex
+ @misc{li2023making,
+   title={Making Large Language Models A Better Foundation For Dense Retrieval},
+   author={Chaofan Li and Zheng Liu and Shitao Xiao and Yingxia Shao},
+   year={2023},
+   eprint={2312.15503},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ @misc{chen2024bge,
+   title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation},
+   author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
+   year={2024},
+   eprint={2402.03216},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
assets/BEIR-bge-en-v1.5.png ADDED
assets/BEIR-e5-mistral.png ADDED
assets/CMTEB-retrieval-bge-zh-v1.5.png ADDED
assets/llama-index.png ADDED
assets/miracl-bge-m3.png ADDED
bge-reranker-v2-m3-FP16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5df93be121c09c43432102ad2b9569d369ccb85c209ca7583e8ccd28f0e41b88
+ size 1159776896
bge-reranker-v2-m3-Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f12135b80de836cbf94c1169dc8efda57c81040c1dfd9dedc20709d2e1725e39
+ size 366467488
bge-reranker-v2-m3-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b9a8990c7ebb14511aecbc9df90d1e956132be64d5ce62783431fa1abb339a3
+ size 402749856
bge-reranker-v2-m3-Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6d9e23b0765d4e250b697657ddaef1ca0b1fd7d150d910940eb35f673590e267
+ size 422156704
bge-reranker-v2-m3-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e186a244ed455b4ab66ec64339ce7427a6ae13f5c0b5e544de96e50f0f8b3673
+ size 438376864
bge-reranker-v2-m3-Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e232e93920b96cb144370deac915e00b8f2377ced7c40ee3ac7f6881a0f9f987
+ size 460036512
bge-reranker-v2-m3-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a212007526c7083627eed92b39dd4472e90ff1374a03fb068733378220813ef
+ size 468392352
bge-reranker-v2-m3-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cc99d3e11bedd7f032e227f86217fc8bfc01dd7476ac4021742bcf8babc9a83d
+ size 500283808
bge-reranker-v2-m3-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a43c7c9b11a4c1517e5bf95151960e1621d1b72f7a493364b01e386cf1aaa1d3
+ size 635676416