Instructions to use prithivMLmods/QwQ-SuperNatural-3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use prithivMLmods/QwQ-SuperNatural-3B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/QwQ-SuperNatural-3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/QwQ-SuperNatural-3B")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/QwQ-SuperNatural-3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use prithivMLmods/QwQ-SuperNatural-3B with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/QwQ-SuperNatural-3B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/QwQ-SuperNatural-3B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/prithivMLmods/QwQ-SuperNatural-3B
- SGLang
How to use prithivMLmods/QwQ-SuperNatural-3B with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "prithivMLmods/QwQ-SuperNatural-3B" \
  --host 0.0.0.0 \
  --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/QwQ-SuperNatural-3B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "prithivMLmods/QwQ-SuperNatural-3B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/QwQ-SuperNatural-3B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use prithivMLmods/QwQ-SuperNatural-3B with Docker Model Runner:
docker model run hf.co/prithivMLmods/QwQ-SuperNatural-3B
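The curl calls above can also be made from Python. The following is a minimal sketch using only the standard library, assuming a vLLM or SGLang server is already running locally on the ports shown above. The helper names `build_chat_request` and `ask` are illustrative and not part of either project, and the response parsing assumes the usual OpenAI-style layout (`choices[0].message.content`).

```python
import json
import urllib.request

MODEL_ID = "prithivMLmods/QwQ-SuperNatural-3B"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the prompt to a running server and return the reply text."""
    request = build_chat_request(prompt, base_url=base_url)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # Assumption: the server answers in the OpenAI chat completions format.
    return body["choices"][0]["message"]["content"]
```

With the vLLM server from above running, `ask("What is the capital of France?")` should return the reply text; for the SGLang server, pass `base_url="http://localhost:30000"`.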
QwQ-SuperNatural-3B
QwQ-SuperNatural-3B is a 3-billion-parameter model based on Qwen2.5. It is a domain-specific, supervised fine-tuned model designed to produce supernatural-themed responses grounded in the context of its input. The model demonstrates significant improvements in instruction following, long-text generation (over 8K tokens), understanding of structured data (e.g., tables), and generation of structured outputs, especially JSON. It is also more resilient to diverse system prompts, which improves role-play and condition-setting for chatbots.
SuperNatural Colab Demo
| Notebook | Description | Link |
|---|---|---|
| Colab Demo | Interactive demo for the QwQ-SuperNatural-3B model using Google Colab. | Open in Colab |
Quickstart with Transformers
The following snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate a response.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/QwQ-SuperNatural-3B"

# Load the weights, letting Transformers pick the dtype and device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are an Super Natural Bot, You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation into the model's chat format as a plain string.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Keep only the newly generated tokens by dropping the prompt from each sequence.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
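The slicing step near the end of the snippet is easy to misread. For a causal language model, `generate` returns the prompt tokens followed by the new tokens, so each output sequence is trimmed by the length of its input. A toy illustration with plain Python lists (the token IDs are made up and `strip_prompt_tokens` is a hypothetical helper, not a Transformers function):

```python
def strip_prompt_tokens(input_batch, output_batch):
    """Drop the echoed prompt from the front of each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_batch, output_batch)
    ]


# Made-up token IDs: the prompt occupies the first three positions of the output.
prompt_ids = [[101, 7, 42]]
generated = [[101, 7, 42, 900, 901, 2]]

completion_ids = strip_prompt_tokens(prompt_ids, generated)
print(completion_ids)  # [[900, 901, 2]]
```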
Intended Use:
QwQ-SuperNatural-3B is designed for:
- Role-play and interactive chatbots: It excels in generating contextually relevant and engaging supernatural-themed responses.
- Long-form content generation: Its capability to handle over 8,000 tokens makes it suitable for generating detailed narratives, articles, or creative writing.
- Structured data understanding: The model can process and interpret structured inputs such as tables, schemas, and JSON formats, making it useful for data-driven applications.
- Dynamic prompt responses: Its resilience to diverse prompts makes it ideal for applications requiring adaptable behavior, such as virtual assistants and domain-specific simulations.
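When the model is asked for structured output, the reply may still wrap the JSON in surrounding prose. One way to handle this is sketched below, using only the standard library; `extract_first_json_object` is a hypothetical helper and the sample reply is invented for illustration, not actual model output.

```python
import json


def extract_first_json_object(text):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not valid JSON at this brace; try the next one.
        start = text.find("{", start + 1)
    return None


# Invented example reply, standing in for the decoded `response` string.
reply = 'Sure! Here is the entity:\n{"name": "Banshee", "origin": "Irish folklore"}\nAnything else?'
entity = extract_first_json_object(reply)
print(entity)
```

Returning `None` rather than raising lets the caller decide whether to retry the prompt or fall back to the raw text.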
Limitations:
- Domain specificity: While fine-tuned for supernatural contexts, its general knowledge might be less accurate or nuanced outside this domain.
- Token constraints: Although capable of generating long texts, extremely large inputs or outputs might exceed processing limits.
- Bias and creativity trade-offs: The model may reflect biases present in its training data and could produce less creative or diverse outputs in domains where it lacks fine-tuning.
- Reliance on input clarity: Ambiguous or poorly structured prompts can lead to less coherent or contextually accurate responses.
- Computational requirements: Handling a model with 3 billion parameters requires significant computational resources, which may limit its accessibility for smaller-scale applications.
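To put the computational requirements in rough numbers, the memory needed for the weights alone can be estimated from the parameter count and the bytes per parameter of the chosen dtype. This is a back-of-the-envelope sketch: it assumes a nominal 3,000,000,000 parameters, ignores activations, the KV cache, and framework overhead, and the int8 row assumes the weights are quantized by a separate tool.

```python
# Bytes used by one parameter for a few common storage formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


NUM_PARAMS = 3_000_000_000  # nominal figure; the exact count may differ

for dtype in ("float32", "float16", "int8"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
# float32: 11.2 GiB
# float16: 5.6 GiB
# int8: 2.8 GiB
```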