How to use from vLLM
Install from pip and serve the model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Jackrong/Qwen3.5-4B-Python-Coder"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Jackrong/Qwen3.5-4B-Python-Coder",
		"messages": [
			{
				"role": "user",
				"content": "Write a Python function that checks whether a string is a palindrome."
			}
		]
	}'
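Because vLLM exposes an OpenAI-compatible API, the same server can also be called from Python with the official openai client. This is a minimal sketch assuming vLLM's default port (8000) and its convention of accepting a placeholder API key:

from openai import OpenAI

# Point the OpenAI client at the local vLLM server; the key is unused by default
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Jackrong/Qwen3.5-4B-Python-Coder",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that checks whether a string is a palindrome.",
        }
    ],
)
print(response.choices[0].message.content)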
Use Docker
docker model run hf.co/Jackrong/Qwen3.5-4B-Python-Coder
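docker model run starts an interactive chat session with the model. Docker Model Runner can also expose an OpenAI-compatible endpoint on the host; the port and path below (12434, /engines/v1) are assumptions based on recent Docker Desktop defaults, so verify them against your installation:

from openai import OpenAI

# Assumed Docker Model Runner endpoint; confirm the port/path for your setup
client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="hf.co/Jackrong/Qwen3.5-4B-Python-Coder",
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
)
print(response.choices[0].message.content)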
⚠️ Preview Status

This repository is currently a preview release.

The model is still being actively tested, and I am continuing to explore more suitable training settings and optimization strategies. Because of this, the current version should be considered experimental.

Downloading or using this model for serious evaluation is not recommended yet. A more stable version will be released after further testing and parameter tuning.
