How to use prithivMLmods/QwQ-SuperNatural-3B with libraries and local apps.

Libraries

How to use prithivMLmods/QwQ-SuperNatural-3B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/QwQ-SuperNatural-3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))
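With chat-style input, the text-generation pipeline returns a list with one dict per input, and its "generated_text" entry holds the whole conversation with the new assistant turn appended. Below is a minimal sketch of pulling out only the reply. `last_assistant_reply` is a hypothetical helper, and `sample_output` is a hand-written stand-in for the pipeline result, not real model output.

```python
def last_assistant_reply(pipeline_output):
    """Return the content of the final message in a chat pipeline result."""
    conversation = pipeline_output[0]["generated_text"]
    # With chat input, generated_text is a list of {"role", "content"} dicts;
    # the newly generated assistant message is the last one.
    return conversation[-1]["content"]


# Stand-in for the return value of pipe(messages); the reply text is made up.
sample_output = [
    {
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "(model reply here)"},
        ]
    }
]
print(last_assistant_reply(sample_output))
```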

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/QwQ-SuperNatural-3B")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/QwQ-SuperNatural-3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use prithivMLmods/QwQ-SuperNatural-3B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/QwQ-SuperNatural-3B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/QwQ-SuperNatural-3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
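The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above: same URL path, header, and JSON body. `build_chat_request` is a hypothetical helper name, and the base URL assumes the server's default port 8000 as used in the curl example. The send step is left commented out because it needs a running server.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",
    "prithivMLmods/QwQ-SuperNatural-3B",
    "What is the capital of France?",
)

# Sending requires the vLLM server from the commands above to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```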

Use Docker

docker model run hf.co/prithivMLmods/QwQ-SuperNatural-3B

SGLang

How to use prithivMLmods/QwQ-SuperNatural-3B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/QwQ-SuperNatural-3B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/QwQ-SuperNatural-3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
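An OpenAI-compatible chat completions response carries the generated text under `choices[0].message.content`. The sketch below extracts it from a response body. `extract_reply` is a hypothetical helper, and `sample_body` is a hand-written stand-in trimmed to the fields used here, not output captured from a server.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from a chat-completions JSON response body."""
    data = json.loads(response_body)
    choices = data.get("choices", [])
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written example body; the reply text is a placeholder.
sample_body = json.dumps({
    "model": "prithivMLmods/QwQ-SuperNatural-3B",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "(model reply here)"},
            "finish_reason": "stop",
        }
    ],
})
print(extract_reply(sample_body))
```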

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/QwQ-SuperNatural-3B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use prithivMLmods/QwQ-SuperNatural-3B with Docker Model Runner:
```
docker model run hf.co/prithivMLmods/QwQ-SuperNatural-3B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

QwQ-SuperNatural-3B

QwQ-SuperNatural-3B is a Qwen2.5-based model fine-tuned to produce supernatural-themed responses grounded in the context of its input. It is a 3-billion-parameter, domain-specific model trained with supervised fine-tuning. The model demonstrates significant improvements in instruction following, long-text generation (over 8K tokens), understanding of structured data (e.g., tables), and generation of structured outputs, especially JSON. It is also more resilient to diverse system prompts, which improves role-play and condition-setting for chatbots.
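When relying on JSON output, it helps to validate the reply before using it. The sketch below parses a reply that is expected to contain a single JSON value and tolerates an optional Markdown code fence around it. `parse_json_reply` is a hypothetical helper, and the assumption that replies may arrive fenced is ours, not something the model card states.

```python
import json

# Three backticks, built programmatically to keep this example easy to embed.
FENCE = "`" * 3


def parse_json_reply(reply):
    """Parse a model reply expected to contain a single JSON value."""
    text = reply.strip()
    # Drop a surrounding Markdown code fence if one is present.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # skip the opening fence (may carry a language tag)
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # skip the closing fence
        text = "\n".join(lines)
    # Raises json.JSONDecodeError if the reply is not valid JSON.
    return json.loads(text)


# Hand-written example reply, not real model output.
print(parse_json_reply('{"entity": "ghost", "mood": "eerie"}'))
```

A caller can catch `json.JSONDecodeError` and re-prompt the model when parsing fails.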

SuperNatural Colab Demo

| Notebook | Description | Link |
|---|---|---|
| Colab Demo | Interactive demo for the QwQ-SuperNatural-3B model using Google Colab. | Open in Colab |

Quickstart with Transformers

The following snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate text.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/QwQ-SuperNatural-3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a Super Natural Bot. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
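For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, which is why the snippet slices each output by the length of its input before decoding. The sketch below shows that slicing step on plain Python lists standing in for tensors. `strip_prompt_tokens` is a hypothetical helper and the token ids are made up.

```python
def strip_prompt_tokens(prompt_ids_batch, output_ids_batch):
    """Keep only the tokens generated after each prompt."""
    return [
        output_ids[len(prompt_ids):]
        for prompt_ids, output_ids in zip(prompt_ids_batch, output_ids_batch)
    ]


# Made-up token ids: each output starts with a copy of its prompt.
prompt_ids_batch = [[101, 102, 103], [201, 202]]
output_ids_batch = [[101, 102, 103, 7, 8, 9], [201, 202, 5]]

print(strip_prompt_tokens(prompt_ids_batch, output_ids_batch))  # [[7, 8, 9], [5]]
```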

Intended Use:

QwQ-SuperNatural-3B is designed for:

- Role-play and interactive chatbots: It excels in generating contextually relevant and engaging supernatural-themed responses.
- Long-form content generation: Its capability to handle over 8,000 tokens makes it suitable for generating detailed narratives, articles, or creative writing.
- Structured data understanding: The model can process and interpret structured inputs such as tables, schemas, and JSON formats, making it useful for data-driven applications.
- Dynamic prompt responses: Its resilience to diverse prompts makes it ideal for applications requiring adaptable behavior, such as virtual assistants and domain-specific simulations.
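One way to hand a table to the model is to serialize it as a Markdown table inside the user message. The sketch below does that with a hypothetical helper, `table_to_markdown`. The sample rows are illustrative only, and the choice of Markdown as the table format is an assumption rather than a documented requirement of the model.

```python
def table_to_markdown(headers, rows):
    """Render column headers and rows as a Markdown table string."""
    header_line = "| " + " | ".join(headers) + " |"
    separator = "| " + " | ".join("---" for _ in headers) + " |"
    body = ["| " + " | ".join(str(cell) for cell in row) + " |" for row in rows]
    return "\n".join([header_line, separator, *body])


# Illustrative sample data for the prompt.
table = table_to_markdown(
    ["entity", "origin"],
    [["banshee", "Irish folklore"], ["kitsune", "Japanese folklore"]],
)

# The resulting messages list can be passed to apply_chat_template or a pipeline.
messages = [
    {"role": "user", "content": "Summarize this table:\n\n" + table},
]
print(messages[0]["content"])
```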

Limitations:

- Domain specificity: While fine-tuned for supernatural contexts, its general knowledge might be less accurate or nuanced outside this domain.
- Token constraints: Although capable of generating long texts, extremely large inputs or outputs might exceed processing limits.
- Bias and creativity trade-offs: The model may reflect biases present in its training data and could produce less creative or diverse outputs in domains where it lacks fine-tuning.
- Reliance on input clarity: Ambiguous or poorly structured prompts can lead to less coherent or contextually accurate responses.
- Computational requirements: Handling a model with 3 billion parameters requires significant computational resources, which may limit its accessibility for smaller-scale applications.
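To put the computational requirement in rough numbers, the memory needed just to hold the weights is the parameter count times the bytes per parameter. The sketch below works this out for the nominal 3 billion parameters. It is a back-of-the-envelope estimate: it ignores activations, the KV cache, and framework overhead, and `weight_memory_gb` is a hypothetical helper.

```python
# Bytes used to store one parameter in common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Estimate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NOMINAL_PARAMS = 3_000_000_000  # "3B params", rounded

print(weight_memory_gb(NOMINAL_PARAMS, "bfloat16"))  # 6.0
print(weight_memory_gb(NOMINAL_PARAMS, "float32"))   # 12.0
```

At BF16 the weights alone come to about 6 GB, so actual usage during generation will be higher.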


Model size: 3B params (Safetensors)
Tensor type: BF16

Model tree for prithivMLmods/QwQ-SuperNatural-3B

Base model: Qwen/Qwen2.5-3B
Finetuned: Qwen/Qwen2.5-3B-Instruct
Finetuned (1307): this model
Quantizations: 2 models
