🌟 InnoSpark 🌟

🚀 Advanced Educational Large Language Model

📖 Highlights

We are pleased to announce InnoSpark-72B-1124, the latest upgrade of InnoSpark-72B in non-reasoning mode. Compared with InnoSpark-72B-0710, this update improves in two areas:

🎓 Enhanced Educational Capabilities

Addressing the practical needs of the education field, the model demonstrates superior performance in key tasks including knowledge explanation, guided problem-solving, interdisciplinary lesson plan design, and contextual question generation, providing stronger support for intelligent educational applications.

🎯 Comprehensive General Capability Improvements

The model also improves markedly in core capabilities, including instruction following, logical reasoning, mathematical computation, and coding, laying a solid foundation for broader application scenarios.

📊 Performance Comparison

	Qwen2.5-72B-Instruct	InnoSpark-72B-0710	InnoSpark-72B-1124
Knowledge
MMLU-Pro	71.01	64.75	75.60
CEVAL	88.15	88.06	88.22
Reasoning
MATH	80.36	78.24	88.50
AIME2024	23.75	26.67	36.67
Coding
LCB_CODE	55.75	59.58	63.42
HUMANEVAL	84.76	86.59	85.37
Alignment
IFEval	88.67	74.72	83.08
Education
KNOWLEDGE EXPLANATION	4.13	4.75	4.61
GUIDED PROBLEM-SOLVING	4.32	4.52	4.93
INTERDISCIPLINARY LESSON PLAN DESIGN	3.87	4.38	4.84
CONTEXTUAL QUESTION GENERATION	4.12	4.56	4.44
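The change from InnoSpark-72B-0710 to InnoSpark-72B-1124 can be read off the table by simple subtraction. The sketch below copies the scores from the table; the helper names `delta_vs_previous` and `regressions` are illustrative and not part of any InnoSpark tooling.

```python
# Scores copied from the table above, in column order:
# (Qwen2.5-72B-Instruct, InnoSpark-72B-0710, InnoSpark-72B-1124)
scores = {
    "MMLU-Pro": (71.01, 64.75, 75.60),
    "CEVAL": (88.15, 88.06, 88.22),
    "MATH": (80.36, 78.24, 88.50),
    "AIME2024": (23.75, 26.67, 36.67),
    "LCB_CODE": (55.75, 59.58, 63.42),
    "HUMANEVAL": (84.76, 86.59, 85.37),
    "IFEval": (88.67, 74.72, 83.08),
    "KNOWLEDGE EXPLANATION": (4.13, 4.75, 4.61),
    "GUIDED PROBLEM-SOLVING": (4.32, 4.52, 4.93),
    "INTERDISCIPLINARY LESSON PLAN DESIGN": (3.87, 4.38, 4.84),
    "CONTEXTUAL QUESTION GENERATION": (4.12, 4.56, 4.44),
}


def delta_vs_previous(name):
    """Score change from the 0710 release to the 1124 release."""
    _, previous, current = scores[name]
    return round(current - previous, 2)


# Benchmarks where the 1124 release scores below the 0710 release.
regressions = [name for name in scores if delta_vs_previous(name) < 0]

for name in scores:
    print(f"{name}: {delta_vs_previous(name):+.2f}")
```

Listing the deltas this way makes it easy to see which rows improve and which slip slightly between the two releases.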

🚀 Quickstart

The following code snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate a response.

from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "sii-research/InnoSpark-72B-1124",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("sii-research/InnoSpark-72B-1124")

prompt = "Introduce yourself in detail."
messages = [
    {"role": "system", "content": "You are InnoSpark（启创）, created by Shanghai Innovation Institute （上海创智学院） and East China Normal University(华东师范大学). You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
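# device_map="auto" has already placed the model; move the inputs to the same device
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# pass input_ids together with the attention mask
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)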
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
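The list comprehension in the snippet above removes the prompt tokens from each generated sequence, because generate returns the prompt followed by the new tokens. A minimal sketch of that slicing step, using made-up token IDs in plain Python lists instead of tensors (the helper name `strip_prompt` is hypothetical):

```python
# Toy token IDs standing in for model_inputs.input_ids and the output of model.generate.
# The values are invented for illustration; real IDs come from the tokenizer.
prompt_ids = [[101, 7592, 2088], [101, 2129]]
generated = [[101, 7592, 2088, 9, 8, 7], [101, 2129, 5, 4]]


def strip_prompt(input_batch, output_batch):
    """Drop the leading prompt tokens from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_batch, output_batch)]


new_tokens = strip_prompt(prompt_ids, generated)
print(new_tokens)
```

Only the remaining tokens are passed to batch_decode, so the decoded response does not repeat the system prompt or the user message.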

vLLM

We recommend deploying the model on 4 A100 GPUs. Start the vLLM server by running the following command in a terminal:

python -m vllm.entrypoints.openai.api_server --served-model-name InnoSpark --model path/to/InnoSpark --gpu-memory-utilization 0.98 --tensor-parallel-size 4 --port 6000
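As a rough sanity check on the 4-GPU recommendation, the weight footprint per GPU can be estimated by hand. The sketch below assumes about 73B parameters stored in BF16 (2 bytes each), split evenly across the 4 tensor-parallel ranks, and uses decimal gigabytes. It ignores activations, the KV cache, and framework overhead, so treat the result as a lower bound rather than a measured figure. The A100 memory sizes (40 GB and 80 GB) are assumptions, since the recommendation does not say which variant is meant.

```python
PARAMS = 73e9          # assumption: roughly 73B parameters
BYTES_PER_PARAM = 2    # BF16 stores each parameter in 2 bytes
NUM_GPUS = 4           # matches --tensor-parallel-size 4
GPU_MEM_UTIL = 0.98    # matches --gpu-memory-utilization 0.98


def weights_per_gpu_gb(params=PARAMS, bytes_per_param=BYTES_PER_PARAM, num_gpus=NUM_GPUS):
    """Approximate weight memory per GPU in decimal GB, assuming an even split."""
    return params * bytes_per_param / num_gpus / 1e9


def budget_per_gpu_gb(gpu_mem_gb, util=GPU_MEM_UTIL):
    """Memory vLLM is allowed to use on one GPU at the given utilization fraction."""
    return gpu_mem_gb * util


for gpu_mem_gb in (40, 80):  # assumed A100 variants
    weights = weights_per_gpu_gb()
    budget = budget_per_gpu_gb(gpu_mem_gb)
    print(f"A100 {gpu_mem_gb} GB: weights {weights:.1f} GB, "
          f"budget {budget:.1f} GB, headroom {budget - weights:.1f} GB")
```

Under these assumptions the weights alone take about 36.5 GB per GPU, so the headroom left for the KV cache depends heavily on which A100 variant is used.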

Then query the server from the client side with the following code:

import json

import requests


def Innospark_stream(inputs, history):
    """Stream a reply from the vLLM server, yielding text pieces as they arrive."""
    url = 'http://localhost:6000/v1/chat/completions'

    # Append the new user turn to the shared conversation history.
    history.append({"role": "user", "content": inputs})

    headers = {"User-Agent": "vLLM Client"}

    pload = {
        "model": "InnoSpark",
        "stream": True,
        "messages": history
    }
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)
    response.raise_for_status()

    assistant_reply = ''
    for chunk in response.iter_lines():
        if not chunk:
            continue
        string_data = chunk.decode("utf-8")
        # Each streamed line has the form "data: <payload>".
        if not string_data.startswith("data: "):
            continue
        payload = string_data[6:]
        if payload == '[DONE]':
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        # The first chunk carries only the role; skip chunks without text.
        delta_content = delta.get("content")
        if delta_content:
            assistant_reply += delta_content
            yield delta_content

    # Record the full reply so the next call sees the whole conversation.
    history.append({"role": "assistant", "content": assistant_reply})


inputs = 'hi'
history = []
for response_text in Innospark_stream(inputs, history):
    print(response_text, end='')
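The client above reads the stream line by line. Each non-empty line has the form `data: <payload>`, where the payload is either a JSON chunk or the literal `[DONE]`. The sketch below isolates that parsing step so it can be tried without a running server. The helper name `parse_sse_line` is hypothetical, and the sample lines are hand-written, simplified stand-ins for what an OpenAI-compatible server sends.

```python
import json


def parse_sse_line(raw: bytes):
    """Return the text delta carried by one streamed line, or None if there is none."""
    line = raw.decode("utf-8").strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream marker, not JSON
    choices = json.loads(payload).get("choices") or []
    if not choices:
        return None
    # The first chunk usually carries only the role, so "content" may be missing.
    return choices[0].get("delta", {}).get("content")


# Simplified, hand-written sample lines for illustration only.
sample = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    b'data: {"choices": [{"delta": {"content": "lo"}}]}',
    b'data: [DONE]',
]
reply = "".join(text for text in map(parse_sse_line, sample) if text)
print(reply)
```

Keeping the parsing in a small pure function makes the end-of-stream and role-only cases explicit instead of relying on exceptions for control flow.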

🏛️ Technical Support

This project is jointly developed by the Shanghai Institute of AI for Education at East China Normal University (ECNU) and the Shanghai Innovation Institute.

📄 License

Please refer to the relevant model pages for specific license information.

🤝 Contact & Collaboration

East China Normal University

🚀 Empowering Education with AI

📚 Citation

If you find our work useful, please cite our papers:

@misc{song2025cultivatinghelpfulpersonalizedcreative,
      title={Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning}, 
      author={Siyu Song and Wentao Liu and Ye Lu and Ruohua Zhang and Tao Liu and Jinze Lv and Xinyun Wang and Aimin Zhou and Fei Tan and Bo Jiang and Hao Hao},
      year={2025},
      eprint={2507.20335},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2507.20335}, 
}

@misc{wei2025elmesautomatedframeworkevaluating,
      title={ELMES: An Automated Framework for Evaluating Large Language Models in Educational Scenarios}, 
      author={Shou'ang Wei and Xinyun Wang and Shuzhen Bi and Jian Chen and Ruijia Li and Bo Jiang and Xin Lin and Min Zhang and Yu Song and BingDong Li and Aimin Zhou and Hao Hao},
      year={2025},
      eprint={2507.22947},
      archivePrefix={arXiv},
      primaryClass={cs.CY},
      url={https://arxiv.org/abs/2507.22947}, 
}


Model size: 73B params (Safetensors) · Tensor type: BF16
