
🌟 InnoSpark 🌟


🚀 Advanced Educational Large Language Model



📖 Highlights

We are pleased to announce InnoSpark-72B-1124, the latest upgrade of InnoSpark-72B in non-reasoning mode. This release improves on its predecessor in two areas:

🎓 Enhanced Educational Capabilities

The model targets the practical needs of the education field. On four key tasks (knowledge explanation, guided problem-solving, interdisciplinary lesson plan design, and contextual question generation) it scores above Qwen2.5-72B-Instruct, and it surpasses InnoSpark-72B-0710 on guided problem-solving and lesson plan design, providing stronger support for intelligent educational applications.

🎯 Comprehensive General Capability Improvements

Compared with InnoSpark-72B-0710, the model improves markedly in instruction following (IFEval), logical reasoning and mathematical computation (MATH, AIME2024), and coding (LCB_CODE), laying a solid foundation for broader application scenarios.

📊 Performance Comparison

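| Category | Benchmark | Qwen2.5-72B-Instruct | InnoSpark-72B-0710 | InnoSpark-72B-1124 |
|---|---|---|---|---|
| Knowledge | MMLU-Pro | 71.01 | 64.75 | 75.60 |
| Knowledge | CEVAL | 88.15 | 88.06 | 88.22 |
| Reasoning | MATH | 80.36 | 78.24 | 88.50 |
| Reasoning | AIME2024 | 23.75 | 26.67 | 36.67 |
| Coding | LCB_CODE | 55.75 | 59.58 | 63.42 |
| Coding | HUMANEVAL | 84.76 | 86.59 | 85.37 |
| Alignment | IFEval | 88.67 | 74.72 | 83.08 |
| Education | KNOWLEDGE EXPLANATION | 4.13 | 4.75 | 4.61 |
| Education | GUIDED PROBLEM-SOLVING | 4.32 | 4.52 | 4.93 |
| Education | INTERDISCIPLINARY LESSON PLAN DESIGN | 3.87 | 4.38 | 4.84 |
| Education | CONTEXTUAL QUESTION GENERATION | 4.12 | 4.56 | 4.44 |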
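
The change from InnoSpark-72B-0710 to InnoSpark-72B-1124 on the general benchmarks can be read off the table above. The following is a minimal sketch that computes those differences; the scores are copied from the table, while the names `SCORES` and `score_deltas` are illustrative and not part of any InnoSpark tooling.

```python
# Scores copied from the comparison table above:
# benchmark -> (InnoSpark-72B-0710, InnoSpark-72B-1124)
SCORES = {
    "MMLU-Pro": (64.75, 75.60),
    "CEVAL": (88.06, 88.22),
    "MATH": (78.24, 88.50),
    "AIME2024": (26.67, 36.67),
    "LCB_CODE": (59.58, 63.42),
    "HUMANEVAL": (86.59, 85.37),
    "IFEval": (74.72, 83.08),
}


def score_deltas(scores):
    """Return {benchmark: new - old}, rounded to two decimals."""
    return {name: round(new - old, 2) for name, (old, new) in scores.items()}


if __name__ == "__main__":
    for name, delta in score_deltas(SCORES).items():
        # A leading "+" marks a gain, "-" a regression.
        print(f"{name:<10} {delta:+.2f}")
```

A negative value, as for HUMANEVAL, marks a benchmark where the earlier release scored slightly higher.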

🚀 Quickstart

The following snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate text.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sii-research/InnoSpark-72B-1124"

# device_map="auto" places the weights across the available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Introduce yourself in detail."
messages = [
    {"role": "system", "content": "You are InnoSpark(启创), created by Shanghai Innovation Institute (上海创智学院) and East China Normal University(华东师范大学). You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation into the prompt format the model expects.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device that holds the model's first layers.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
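
For multi-turn conversations, the message list passed to apply_chat_template has to carry the earlier turns as well. The sketch below shows one way to assemble it; the helper name `build_messages` and its `history` argument are illustrative assumptions, and only the system prompt is taken from the snippet above.

```python
# System prompt copied from the quickstart snippet above.
SYSTEM_PROMPT = (
    "You are InnoSpark(启创), created by Shanghai Innovation Institute (上海创智学院) "
    "and East China Normal University(华东师范大学). You are a helpful assistant."
)


def build_messages(user_prompt, history=None, system_prompt=SYSTEM_PROMPT):
    """Return a chat message list: system prompt, prior turns, then the new user turn.

    `history` is a hypothetical list of earlier {"role", "content"} dicts,
    alternating between "user" and "assistant".
    """
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_prompt})
    return messages


if __name__ == "__main__":
    earlier = [
        {"role": "user", "content": "What is a prime number?"},
        {"role": "assistant", "content": "A whole number greater than 1 with exactly two divisors."},
    ]
    for message in build_messages("Give me three examples.", history=earlier):
        print(message["role"], ":", message["content"][:40])
```

The returned list can be passed to tokenizer.apply_chat_template in place of `messages` in the quickstart.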

vLLM

We recommend deploying the model on 4 A100 GPUs. Start the vLLM server from a terminal with the following command:

python -m vllm.entrypoints.openai.api_server --served-model-name InnoSpark --model path/to/InnoSpark --gpu-memory-utilization 0.98 --tensor-parallel-size 4 --port 6000
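
As a rough sanity check on the 4-GPU recommendation, the weight footprint can be estimated from the figures reported on the model page (73B parameters stored in BF16, i.e. 2 bytes per parameter). This is a back-of-the-envelope sketch only: it assumes the weights split evenly across the tensor-parallel ranks and ignores the KV cache, activations, and runtime overhead, and the function names are illustrative.

```python
GIB = 2**30  # bytes per GiB


def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def per_gpu_weight_gib(num_params, bytes_per_param, tensor_parallel_size):
    """Weights per GPU, assuming an even split across tensor-parallel ranks."""
    return weight_memory_gib(num_params, bytes_per_param) / tensor_parallel_size


if __name__ == "__main__":
    total = weight_memory_gib(73e9, 2)        # 73B params, BF16
    per_gpu = per_gpu_weight_gib(73e9, 2, 4)  # --tensor-parallel-size 4
    print(f"weights: {total:.1f} GiB total, {per_gpu:.1f} GiB per GPU")
```

Whatever memory remains on each GPU under --gpu-memory-utilization is what vLLM can use for the KV cache, so the actual headroom depends on the A100 variant in use.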

Then query the server from a client with the following code:

import json

import requests

def Innospark_stream(inputs, history):
    """Send one user turn to the vLLM server and yield the reply as it streams in."""
    url = 'http://localhost:6000/v1/chat/completions'

    history.append({"role": "user", "content": inputs})

    headers = {"User-Agent": "vLLM Client"}

    pload = {
        "model": "InnoSpark",
        "stream": True,
        "messages": history
    }
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)
    response.raise_for_status()

    assistant_reply = ""
    for chunk in response.iter_lines(chunk_size=1,
                                     decode_unicode=False,
                                     delimiter=b"\n"):
        if not chunk:
            continue
        string_data = chunk.decode("utf-8")
        # Each event line has the form "data: <payload>".
        if not string_data.startswith("data: "):
            continue
        data = string_data[6:]
        if data == "[DONE]":
            break  # end-of-stream marker
        delta = json.loads(data)["choices"][0]["delta"]
        # The first delta carries only the role; later ones carry text.
        delta_content = delta.get("content")
        if delta_content:
            assistant_reply += delta_content
            yield delta_content

    # Record the full reply so the next call continues the conversation.
    history.append({
        "role": "assistant",
        "content": assistant_reply,
        "tool_calls": []
    })

inputs = 'hi'
history = []
for response_text in Innospark_stream(inputs, history):
    print(response_text, end='')
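
The client above reads a stream of `data: ...` lines that ends with `data: [DONE]`. The sketch below isolates the per-line parsing so it can be checked without a running server; `parse_sse_line` is an illustrative helper name, and the sample payloads are made up to mirror the OpenAI-style chunk shape the client expects.

```python
import json


def parse_sse_line(raw: bytes):
    """Return the text delta carried by one streamed line, or None if it has none."""
    line = raw.decode("utf-8").strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive line or a non-data field
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # end-of-stream marker
    delta = json.loads(data)["choices"][0]["delta"]
    # The first chunk usually carries only the role, so "content" may be absent.
    return delta.get("content")


if __name__ == "__main__":
    # Hand-written sample lines for illustration only.
    sample_stream = [
        b'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        b'data: {"choices": [{"delta": {"content": "lo"}}]}',
        b"data: [DONE]",
    ]
    pieces = [parse_sse_line(line) for line in sample_stream]
    print("".join(piece for piece in pieces if piece))
```

Keeping the parsing in a separate function makes the role-only first chunk and the `[DONE]` marker explicit cases rather than exceptions.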

🏛️ Technical Support

This project is jointly developed by the Shanghai Institute of AI for Education at East China Normal University (ECNU) and the Shanghai Innovation Institute.

📄 License

Please refer to the relevant model pages for specific license information.


🤝 Contact & Collaboration

East China Normal University



🚀 Empowering Education with AI

📚 Citation

If you find our work useful, please cite our papers:

@misc{song2025cultivatinghelpfulpersonalizedcreative,
      title={Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning}, 
      author={Siyu Song and Wentao Liu and Ye Lu and Ruohua Zhang and Tao Liu and Jinze Lv and Xinyun Wang and Aimin Zhou and Fei Tan and Bo Jiang and Hao Hao},
      year={2025},
      eprint={2507.20335},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2507.20335}, 
}
@misc{wei2025elmesautomatedframeworkevaluating,
      title={ELMES: An Automated Framework for Evaluating Large Language Models in Educational Scenarios}, 
      author={Shou'ang Wei and Xinyun Wang and Shuzhen Bi and Jian Chen and Ruijia Li and Bo Jiang and Xin Lin and Min Zhang and Yu Song and BingDong Li and Aimin Zhou and Hao Hao},
      year={2025},
      eprint={2507.22947},
      archivePrefix={arXiv},
      primaryClass={cs.CY},
      url={https://arxiv.org/abs/2507.22947}, 
}

Model size: 73B params · Tensor type: BF16 · Format: Safetensors