📖 Highlights
We are pleased to announce InnoSpark-72B-1124, the latest upgrade of the InnoSpark-72B non-reasoning mode. This update achieves significant breakthroughs across multiple dimensions:
🎓 Enhanced Educational Capabilities
Addressing the practical needs of the education field, the model demonstrates superior performance in key tasks including knowledge explanation, guided problem-solving, interdisciplinary lesson plan design, and contextual question generation, providing stronger support for intelligent educational applications.
🎯 Comprehensive General Capability Improvements
The model also makes a qualitative leap in core capabilities, including instruction following, logical reasoning, mathematical computation, and code programming, laying a solid foundation for broader application scenarios.
📊 Performance Comparison
| Benchmark | Qwen2.5-72B-Instruct | InnoSpark-72B-0710 | InnoSpark-72B-1124 |
|---|---|---|---|
| Knowledge | | | |
| MMLU-Pro | 71.01 | 64.75 | 75.60 |
| CEVAL | 88.15 | 88.06 | 88.22 |
| Reasoning | | | |
| MATH | 80.36 | 78.24 | 88.50 |
| AIME2024 | 23.75 | 26.67 | 36.67 |
| Coding | | | |
| LCB_CODE | 55.75 | 59.58 | 63.42 |
| HUMANEVAL | 84.76 | 86.59 | 85.37 |
| Alignment | | | |
| IFEval | 88.67 | 74.72 | 83.08 |
| Education | | | |
| KNOWLEDGE EXPLANATION | 4.13 | 4.75 | 4.61 |
| GUIDED PROBLEM-SOLVING | 4.32 | 4.52 | 4.93 |
| INTERDISCIPLINARY LESSON PLAN DESIGN | 3.87 | 4.38 | 4.84 |
| CONTEXTUAL QUESTION GENERATION | 4.12 | 4.56 | 4.44 |
🚀 Quickstart
The following code snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate content.
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" lets the library place the weights on the available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "sii-research/InnoSpark-72B-1124",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("sii-research/InnoSpark-72B-1124")

prompt = "Introduce yourself in detail."
messages = [
    {"role": "system", "content": "You are InnoSpark(启创), created by Shanghai Innovation Institute (上海创智学院) and East China Normal University(华东师范大学). You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation into the prompt format the model expects.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device that holds the model's first layers.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Passing the whole encoding also supplies the attention mask.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# generate() returns prompt + completion, so drop the prompt tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
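The slicing step near the end of the snippet is needed because `generate` returns each prompt followed by the newly generated tokens. As a minimal sketch of that step alone, the example below applies the same slicing to plain Python lists. The helper name `strip_prompt` and the token IDs are made up for illustration and are not part of the model's API.

```python
def strip_prompt(input_ids, output_ids):
    """Return only the newly generated tokens for each sequence.

    Assumes every output sequence starts with its full prompt, which is
    how decoder-only generation returns its results.
    """
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


# Made-up token IDs, for illustration only.
prompt_batch = [[101, 7, 8, 9]]
generated_batch = [[101, 7, 8, 9, 42, 43, 2]]

new_tokens = strip_prompt(prompt_batch, generated_batch)
print(new_tokens)  # [[42, 43, 2]]
```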
vLLM
We recommend deploying the model on 4 A100 GPUs. Start the vLLM server by running the following command in a terminal:
python -m vllm.entrypoints.openai.api_server --served-model-name InnoSpark --model path/to/InnoSpark --gpu-memory-utilization 0.98 --tensor-parallel-size 4 --port 6000
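As a rough sanity check on the 4-GPU recommendation, the weight footprint can be estimated by hand. The sketch below assumes about 72 billion parameters (taken from the model name), 16-bit weights, and 80 GB A100 cards. None of these figures is stated in this card, so treat the result as a back-of-the-envelope estimate only.

```python
# Back-of-the-envelope memory estimate. The first three values are assumptions.
PARAMS = 72e9                  # assumed from the "72B" in the model name
BYTES_PER_PARAM = 2            # assumed 16-bit (bf16/fp16) weights
GPU_MEMORY_GB = 80             # assumed 80 GB A100 variant
NUM_GPUS = 4                   # matches --tensor-parallel-size 4
GPU_MEMORY_UTILIZATION = 0.98  # matches --gpu-memory-utilization 0.98

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
# Tensor parallelism shards the weights roughly evenly across GPUs.
weights_per_gpu_gb = weights_gb / NUM_GPUS
budget_per_gpu_gb = GPU_MEMORY_GB * GPU_MEMORY_UTILIZATION
headroom_per_gpu_gb = budget_per_gpu_gb - weights_per_gpu_gb

print(f"weights: {weights_gb:.0f} GB total, {weights_per_gpu_gb:.0f} GB per GPU")
print(f"left for KV cache and activations: {headroom_per_gpu_gb:.1f} GB per GPU")
```

Under these assumptions the weights alone take about 36 GB per GPU, and the remainder of each GPU's budget is available for the KV cache and activations.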
Once the server is running, you can query it from a client with the following code:
import json

import requests


def Innospark_stream(inputs, history):
    """Stream a reply from the vLLM server, yielding text as it arrives."""
    url = 'http://localhost:6000/v1/chat/completions'
    history += [{"role": "user", "content": inputs},]
    headers = {"User-Agent": "vLLM Client"}
    pload = {
        "model": "InnoSpark",
        "stream": True,
        "messages": history
    }
    assistant_reply = ''  # accumulates the full reply for the history
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)
    for chunk in response.iter_lines(chunk_size=1,
                                     decode_unicode=False,
                                     delimiter=b"\n"):
        if not chunk:
            continue
        # Each event line starts with "data: " (6 characters).
        string_data = chunk.decode("utf-8")[6:]
        if string_data == '[DONE]':
            # End-of-stream marker: record the reply in the history.
            history += [{
                "role": "assistant",
                "content": assistant_reply,
                "tool_calls": []
            },]
            break
        json_data = json.loads(string_data)
        # The first event carries only the role, so "content" may be absent.
        delta_content = json_data["choices"][0]["delta"].get("content")
        if delta_content:
            assistant_reply += delta_content
            yield delta_content


inputs = 'hi'
history = []
for response_text in Innospark_stream(inputs, history):
    print(response_text, end='')
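The parsing logic in the client can be checked without a running server. The sketch below isolates it in a hypothetical helper, `parse_sse_line`, and feeds it hand-written sample lines modeled on the fields the client reads (`choices[0].delta.content`, a role-only first event, and the `[DONE]` marker). The sample payloads are illustrative and are not captured server output.

```python
import json


def parse_sse_line(line: bytes):
    """Parse one streamed line and return (text, done).

    text is the content delta ("" if the event carries none) and done is
    True once the "[DONE]" end-of-stream marker is seen.
    """
    prefix = b"data: "
    if not line.startswith(prefix):
        return "", False  # ignore blank or unrelated lines
    payload = line[len(prefix):].decode("utf-8").strip()
    if payload == "[DONE]":
        return "", True
    delta = json.loads(payload)["choices"][0]["delta"]
    return delta.get("content") or "", False


# Hand-written sample events, for illustration only.
sample_lines = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    b'data: {"choices": [{"delta": {"content": "lo"}}]}',
    b'data: [DONE]',
]

reply = ""
for raw in sample_lines:
    text, done = parse_sse_line(raw)
    if done:
        break
    reply += text
print(reply)  # Hello
```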
🏛️ Technical Support
This project is jointly developed by the Shanghai Institute of AI for Education at East China Normal University (ECNU) and the Shanghai Innovation Institute.
📄 License
Please refer to the relevant model pages for specific license information.
📚 Citation
If you find our work useful, please cite our papers:
@misc{song2025cultivatinghelpfulpersonalizedcreative,
title={Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning},
author={Siyu Song and Wentao Liu and Ye Lu and Ruohua Zhang and Tao Liu and Jinze Lv and Xinyun Wang and Aimin Zhou and Fei Tan and Bo Jiang and Hao Hao},
year={2025},
eprint={2507.20335},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2507.20335},
}
@misc{wei2025elmesautomatedframeworkevaluating,
title={ELMES: An Automated Framework for Evaluating Large Language Models in Educational Scenarios},
author={Shou'ang Wei and Xinyun Wang and Shuzhen Bi and Jian Chen and Ruijia Li and Bo Jiang and Xin Lin and Min Zhang and Yu Song and BingDong Li and Aimin Zhou and Hao Hao},
year={2025},
eprint={2507.22947},
archivePrefix={arXiv},
primaryClass={cs.CY},
url={https://arxiv.org/abs/2507.22947},
}