Commit c2616ba (verified) · Mingke977 · Parent: facd22e

Add files using upload-large-folder tool

Files changed (1): README.md (+404, −45)
README.md CHANGED

Previous content (removed):

---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---

# c362_step50_ta05

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [Linear DARE](https://arxiv.org/abs/2311.03099) merge method, with /root/myCodeLab/host/downloads/models/40Bra as the base.

### Models Merged

The following model was included in the merge:

* /root/myCodeLab/host/verl/ckpts/40bra_k8s_single_domain/40bra_k8s_16node_sd_c362_20260327_205644_unknown/global_step_50/actor/huggingface

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: /root/myCodeLab/host/downloads/models/40Bra
dtype: float32
merge_method: dare_linear
modules:
  default:
    slices:
    - sources:
      - layer_range: [0, 40]
        model: /root/myCodeLab/host/downloads/models/40Bra
      - layer_range: [0, 40]
        model: /root/myCodeLab/host/verl/ckpts/40bra_k8s_single_domain/40bra_k8s_16node_sd_c362_20260327_205644_unknown/global_step_50/actor/huggingface
        parameters:
          density: 1.0
          weight:
          - filter: .mlp.gate.
            value: 0.0
          - value: 0.5
    - sources:
      - layer_range: [40, 41]
        model: /root/myCodeLab/host/downloads/models/40Bra
out_dtype: bfloat16
```
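For intuition, the `dare_linear` method from the DARE paper linked above works per parameter tensor: take each fine-tuned model's delta from the base, randomly drop a fraction (1 − density) of its entries, rescale the survivors by 1/density so the expected delta is preserved, and add the weighted deltas onto the base. The sketch below is an illustration of that idea, not mergekit's actual implementation, and it uses one scalar weight per model rather than the per-parameter filter weights in the config. Note that with `density: 1.0`, as used here, nothing is dropped and the merge reduces to a plain weighted average of deltas.

```python
import numpy as np


def dare_linear_merge(base, tuned_list, weights, density, rng=None):
    """Merge fine-tuned tensors into a base tensor via DARE-linear (sketch).

    Each tuned model's delta (tuned - base) is randomly sparsified:
    entries are kept with probability `density` and rescaled by
    1/density so the expected delta is unchanged, then the sparsified
    deltas are combined as a weighted sum on top of the base.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    merged = base.copy()
    for tuned, w in zip(tuned_list, weights):
        delta = tuned - base
        keep = rng.random(delta.shape) < density      # Bernoulli keep mask
        delta = np.where(keep, delta / density, 0.0)  # rescale survivors
        merged += w * delta
    return merged
```

With `density=1.0` every mask entry is kept, so the result is exactly `base + Σ wᵢ · (tunedᵢ − base)`, matching the 0.5 blend in the config above.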
Current content (added):

---
language:
- zh
- en
pipeline_tag: text-generation
library_name: transformers
---

<div align="center">
  <picture>
    <img src="figures/joyai-logo.png" width="30%" alt="JoyAI-LLM Flash">
  </picture>
</div>
<hr>

<div align="center" style="line-height: 1;">
  <a href="https://huggingface.co/jdopensource" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-JD-ffc107?color=ffc107&logoColor=white"/></a>
  <a href="https://huggingface.co/jdopensource/JoyAI-LLM-Flash/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a>
</div>
## 1. Model Introduction

JoyAI-LLM-Flash is a state-of-the-art medium-sized instruct language model with 3 billion activated parameters and 48 billion total parameters. It was pretrained on 20 trillion text tokens using the Muon optimizer, followed by large-scale supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) across diverse environments. JoyAI-LLM-Flash achieves strong performance on frontier knowledge, reasoning, and coding tasks, as well as agentic capabilities.

### Key Features

- **Fiber Bundle RL**: introduces fiber bundle theory into reinforcement learning through a novel optimization framework, FiberPO. The method is designed for the challenges of large-scale, heterogeneous agent training, improving stability and robustness under complex data distributions.
- **Training-Inference Collaboration**: applies the Muon optimizer with dense MTP and develops novel optimization techniques that resolve instabilities at scale, delivering 1.3× to 1.7× the throughput of the non-MTP version.
- **Agentic Intelligence**: designed for tool use, reasoning, and autonomous problem-solving.
## 2. Model Summary

|                                             |                          |
| :-----------------------------------------: | :----------------------: |
| **Architecture**                            | Mixture-of-Experts (MoE) |
| **Total Parameters**                        | 48B                      |
| **Activated Parameters**                    | 3B                       |
| **Number of Layers** (dense layer included) | 40                       |
| **Number of Dense Layers**                  | 1                        |
| **Attention Hidden Dimension**              | 2048                     |
| **MoE Hidden Dimension** (per expert)       | 768                      |
| **Number of Attention Heads**               | 32                       |
| **Number of Experts**                       | 256                      |
| **Selected Experts per Token**              | 8                        |
| **Number of Shared Experts**                | 1                        |
| **Vocabulary Size**                         | 129K                     |
| **Context Length**                          | 128K                     |
| **Attention Mechanism**                     | MLA                      |
| **Activation Function**                     | SwiGLU                   |
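The routing numbers in the summary (256 experts, 8 selected per token, 1 always-on shared expert) correspond to a standard top-k MoE gating step. The sketch below illustrates that selection logic under the assumption of softmax gating over router logits; it is not the model's actual router, which may differ in normalization and load-balancing details.

```python
import numpy as np

NUM_EXPERTS = 256  # routed experts, from the summary table
TOP_K = 8          # experts selected per token
# The 1 shared expert is always active, so each token runs TOP_K + 1 experts.


def route_token(router_logits: np.ndarray, top_k: int = TOP_K):
    """Pick the top-k experts for one token; return (indices, mixing weights)."""
    top_idx = np.argsort(router_logits)[-top_k:][::-1]  # best k, descending
    # Softmax over the selected logits only (numerically stabilized).
    gate = np.exp(router_logits[top_idx] - router_logits[top_idx].max())
    weights = gate / gate.sum()
    return top_idx, weights


rng = np.random.default_rng(0)
idx, w = route_token(rng.standard_normal(NUM_EXPERTS))
```

Each token's output is then the weighted sum of the 8 selected experts' outputs plus the shared expert, which is how 48B total parameters yield only 3B activated per token.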
## 3. Evaluation Results

<table>
<thead>
<tr>
<th align="center">Benchmark</th>
<th align="center"><sup>JoyAI-LLM Flash</sup></th>
<th align="center"><sup>Qwen3-30B-A3B-Instruct-2507</sup></th>
<th align="center"><sup>GLM-4.7-Flash<br>(Non-thinking)</sup></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" colspan="4"><strong>Knowledge &amp; Alignment</strong></td>
</tr>
<tr>
<td align="center">MMLU</td>
<td align="center"><strong>89.50</strong></td>
<td align="center">86.87</td>
<td align="center">80.53</td>
</tr>
<tr>
<td align="center">MMLU-Pro</td>
<td align="center"><strong>81.02</strong></td>
<td align="center">73.88</td>
<td align="center">63.62</td>
</tr>
<tr>
<td align="center">CMMLU</td>
<td align="center"><strong>87.03</strong></td>
<td align="center">85.88</td>
<td align="center">75.85</td>
</tr>
<tr>
<td align="center">GPQA-Diamond</td>
<td align="center"><strong>74.43</strong></td>
<td align="center">68.69</td>
<td align="center">39.90</td>
</tr>
<tr>
<td align="center">SuperGPQA</td>
<td align="center"><strong>55.00</strong></td>
<td align="center">52.00</td>
<td align="center">32.00</td>
</tr>
<tr>
<td align="center">LiveBench</td>
<td align="center"><strong>72.90</strong></td>
<td align="center">59.70</td>
<td align="center">43.10</td>
</tr>
<tr>
<td align="center">IFEval</td>
<td align="center"><strong>86.69</strong></td>
<td align="center">83.18</td>
<td align="center">82.44</td>
</tr>
<tr>
<td align="center">AlignBench</td>
<td align="center"><strong>8.24</strong></td>
<td align="center">8.07</td>
<td align="center">6.85</td>
</tr>
<tr>
<td align="center">HellaSwag</td>
<td align="center"><strong>91.79</strong></td>
<td align="center">89.90</td>
<td align="center">60.84</td>
</tr>
<tr>
<td align="center" colspan="4"><strong>Coding</strong></td>
</tr>
<tr>
<td align="center">HumanEval</td>
<td align="center"><strong>96.34</strong></td>
<td align="center">95.12</td>
<td align="center">74.39</td>
</tr>
<tr>
<td align="center">LiveCodeBench</td>
<td align="center"><strong>65.60</strong></td>
<td align="center">39.71</td>
<td align="center">27.43</td>
</tr>
<tr>
<td align="center">SciCode</td>
<td align="center"><strong>3.08/22.92</strong></td>
<td align="center"><strong>3.08/22.92</strong></td>
<td align="center">3.08/15.11</td>
</tr>
<tr>
<td align="center" colspan="4"><strong>Mathematics</strong></td>
</tr>
<tr>
<td align="center">GSM8K</td>
<td align="center"><strong>95.83</strong></td>
<td align="center">79.83</td>
<td align="center">81.88</td>
</tr>
<tr>
<td align="center">AIME2025</td>
<td align="center"><strong>65.83</strong></td>
<td align="center">62.08</td>
<td align="center">24.17</td>
</tr>
<tr>
<td align="center">MATH 500</td>
<td align="center"><strong>97.10</strong></td>
<td align="center">89.80</td>
<td align="center">90.90</td>
</tr>
<tr>
<td align="center" colspan="4"><strong>Agentic</strong></td>
</tr>
<tr>
<td align="center">SWE-bench Verified</td>
<td align="center"><strong>60.60</strong></td>
<td align="center">24.44</td>
<td align="center">51.60</td>
</tr>
<tr>
<td align="center">Tau2-Retail</td>
<td align="center"><strong>67.55</strong></td>
<td align="center">53.51</td>
<td align="center">62.28</td>
</tr>
<tr>
<td align="center">Tau2-Airline</td>
<td align="center"><strong>54.00</strong></td>
<td align="center">32.00</td>
<td align="center">52.00</td>
</tr>
<tr>
<td align="center">Tau2-Telecom</td>
<td align="center">79.83</td>
<td align="center">4.39</td>
<td align="center"><strong>88.60</strong></td>
</tr>
<tr>
<td align="center" colspan="4"><strong>Long Context</strong></td>
</tr>
<tr>
<td align="center">RULER</td>
<td align="center"><strong>95.60</strong></td>
<td align="center">89.66</td>
<td align="center">56.12</td>
</tr>
</tbody>
</table>
## 4. Deployment

> [!NOTE]
> You can access the JoyAI-LLM Flash API at https://docs.jdcloud.com/cn/jdaip/chat; an OpenAI/Anthropic-compatible API is provided.

Currently, JoyAI-LLM-Flash-Block-INT8 is recommended to run on the following inference engines:

* SGLang

Deployment examples can be found in the [Model Deployment Guide](docs/deploy_guidance.md).
## 5. Model Usage

The demos below show how to call our official API.

For third-party APIs deployed with vLLM or SGLang, please note:

> [!NOTE]
> Recommended sampling parameters: `temperature=0.6`, `top_p=1.0`

### Chat Completion

This is a simple chat completion script that shows how to call the JoyAI-Flash API.
```python
from openai import OpenAI

client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def simple_chat(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "which one is bigger, 9.11 or 9.9? think carefully.",
                }
            ],
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        stream=False,
        # recommended sampling parameters (see the note above)
        temperature=0.6,
        top_p=1.0,
        max_tokens=4096,
    )
    print(f"response: {response.choices[0].message.content}")


if __name__ == "__main__":
    simple_chat(client)
```
### Tool Call Completion

This is a simple tool call completion script that shows how to call the JoyAI-Flash API.

```python
import json

from openai import OpenAI

client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def my_calculator(expression: str) -> str:
    return str(eval(expression))


def rewrite(text: str) -> str:
    return str(text)


def simple_tool_call(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "use my functions to compute the results for the equations: 6+1",
                },
            ],
        },
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "my_calculator",
                "description": "A calculator that can evaluate a mathematical equation and compute its results.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "The mathematical expression to evaluate.",
                        },
                    },
                    "required": ["expression"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "rewrite",
                "description": "Rewrite a given text for improved clarity",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The input text to rewrite",
                        }
                    },
                },
            },
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
        tools=tools,
        tool_choice="auto",
    )
    tool_calls = response.choices[0].message.tool_calls

    # Execute each requested tool locally, then feed the results back.
    results = []
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_args = tool_call.function.arguments
        if function_name == "my_calculator":
            result = my_calculator(**json.loads(function_args))
            results.append(result)
    messages.append({"role": "assistant", "tool_calls": tool_calls})
    for tool_call, result in zip(tool_calls, results):
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call.function.name,
                "content": result,
            }
        )
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
    )
    print(response.choices[0].message.content)


if __name__ == "__main__":
    simple_tool_call(client)
```
---

## 6. License

Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).