Instructions for using stepfun-ai/Step-3.5-Flash with Transformers, vLLM, SGLang, and Docker Model Runner.

Libraries

How to use stepfun-ai/Step-3.5-Flash with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stepfun-ai/Step-3.5-Flash", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("stepfun-ai/Step-3.5-Flash", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use stepfun-ai/Step-3.5-Flash with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "stepfun-ai/Step-3.5-Flash"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.5-Flash",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
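
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request


def build_chat_request(base_url, prompt, model="stepfun-ai/Step-3.5-Flash"):
    """Build a POST request equivalent to the curl call (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, prompt):
    """Send the request and return the assistant text (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(base_url, prompt)) as resp:
        body = json.load(resp)
    # Assumes the OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]


# Usage, once a server is running:
# print(chat("http://localhost:8000", "What is the capital of France?"))
```

Pointing `base_url` at a different port (for example `http://localhost:30000`) should work for any other server exposing the same OpenAI-compatible route.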

Use Docker

docker model run hf.co/stepfun-ai/Step-3.5-Flash

SGLang

How to use stepfun-ai/Step-3.5-Flash with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "stepfun-ai/Step-3.5-Flash" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.5-Flash",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "stepfun-ai/Step-3.5-Flash" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.5-Flash",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use stepfun-ai/Step-3.5-Flash with Docker Model Runner:

```
docker model run hf.co/stepfun-ai/Step-3.5-Flash
```

Step-3.5-Flash / chat_template.jinja

WinstonDeng

upload step3p5_flash_release_mtp3_bf16

8fb8cbc verified 4 months ago

5 kB

{% macro render_content(content) %}{% if content is none %}{{- '' }}{% elif content is string %}{{- content }}{% elif content is mapping %}{{- content['value'] if 'value' in content else content['text'] }}{% elif content is iterable %}{% for item in content %}{% if item.type == 'text' %}{{- item['value'] if 'value' in item else item['text'] }}{% elif item.type == 'image' %}<im_patch>{% endif %}{% endfor %}{% endif %}{% endmacro %}
{{bos_token}}{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- render_content(messages[0].content) + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou have access to the following functions in JSONSchema format:\n\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson(ensure_ascii=False) }}
{%- endfor %}
{{- "\n</tools>\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...>\n...\n</function> block must be nested within <tool_call>\n...\n</tool_call> XML tags\n- Required parameters MUST be specified\n</IMPORTANT><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + render_content(messages[0].content) + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and render_content(message.content) is string and not(render_content(message.content).startswith('<tool_response>') and render_content(message.content).endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- set content = render_content(message.content) %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{%- set role_name = 'observation' if (message.role == "system" and not loop.first and message.name == 'observation') else message.role %}
{{- '<|im_start|>' + role_name + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = render_content(message.reasoning_content) %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- else %}
{%- set reasoning_content = '' %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n' + content }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if tool_call.function is defined %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
{%- if tool_call.arguments is defined %}
{%- set arguments = tool_call.arguments %}
{%- for args_name, args_value in arguments|items %}
{{- '<parameter=' + args_name + '>\n' }}
{%- set args_value = args_value | tojson(ensure_ascii=False) | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
{{- args_value }}
{{- '\n</parameter>\n' }}
{%- endfor %}
{%- endif %}
{{- '</function>\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>tool_response\n' }}
{%- endif %}
{{- '<tool_response>' }}
{{- content }}
{{- '</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
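
Three pieces of the template are easier to follow outside Jinja: how assistant reasoning is separated from the answer, how the index of the last real user query is found (only assistant turns after it are rendered with their `<think>` block), and how a tool call is serialized. The sketch below mirrors that logic in plain Python. The function names are hypothetical, message content is assumed to be a plain string, and `json.dumps` stands in for the template's `tojson` filter, whose exact output may differ by rendering environment.

```python
import json


def split_reasoning(content):
    """Return (reasoning, answer), split on the closing think tag as the template does."""
    if "</think>" not in content:
        return "", content
    reasoning = content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


def last_query_index(messages):
    """Index of the last user message that is not a wrapped tool response."""
    for index in range(len(messages) - 1, -1, -1):
        message = messages[index]
        text = message.get("content") or ""  # assumption: string content only
        wrapped = text.startswith("<tool_response>") and text.endswith("</tool_response>")
        if message["role"] == "user" and not wrapped:
            return index
    # Same default as the template when no such message exists.
    return len(messages) - 1


def render_tool_call(name, arguments):
    """Serialize one call in the <tool_call>/<function=...>/<parameter=...> layout."""
    parts = ["<tool_call>\n<function=" + name + ">\n"]
    for arg_name, arg_value in arguments.items():
        if isinstance(arg_value, (dict, list, tuple)):
            # Containers are JSON-encoded; json.dumps approximates tojson here.
            rendered = json.dumps(arg_value, ensure_ascii=False)
        else:
            rendered = str(arg_value)
        parts.append("<parameter=" + arg_name + ">\n" + rendered + "\n</parameter>\n")
    parts.append("</function>\n</tool_call>")
    return "".join(parts)
```

For example, `split_reasoning("<think>\nplan\n</think>\nanswer")` yields `("plan", "answer")`, and content without a closing think tag is returned unchanged with empty reasoning.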