Gemma 4 E4B Opus4.6 Reasoning
This is a PEFT LoRA adapter for google/gemma-4-e4b-it, fine-tuned on the Crownelius/Opus-4.6-Reasoning-2100x-formatted dataset.
This adapter is optimized for:
- structured step-by-step reasoning
- logic puzzles
- planning and decomposition
- algorithm explanations
- conceptual problem solving
- code reasoning workflows
The strongest improvements are visible on:
- multi-step logic puzzles
- algorithm design explanations
- state-tracking tasks
- proof-style conceptual reasoning
Overall, the gains are largest on prompts that call for deliberate decomposition, planning, and educational reasoning.
Base Model
google/gemma-4-e4b-it
Dataset
Crownelius/Opus-4.6-Reasoning-2100x-formatted
Training Setup
- PEFT LoRA fine-tuning
- 4-bit QLoRA loading
- 2 training epochs
- training max sequence length: 512 tokens
- gradient accumulation: 16
- trained on Google Colab T4
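The setup listed above could be expressed roughly as follows with `transformers` and `peft`. This is a sketch, not the author's training script: the quantization type, LoRA rank, alpha, dropout, and target modules are illustrative assumptions that the card does not state. Only the 4-bit loading, epoch count, sequence length, and gradient accumulation come from the list. float16 compute is assumed because the T4 has no native bfloat16 support.

```python
# Values stated in the Training Setup list.
STATED_SETUP = {
    "num_train_epochs": 2,
    "max_seq_length": 512,
    "gradient_accumulation_steps": 16,
    "load_in_4bit": True,
}


def build_qlora_configs():
    """Return (quantization config, LoRA config) for a QLoRA-style run.

    Imports are local so this module can be read and imported without
    the heavy dependencies installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=STATED_SETUP["load_in_4bit"],
        bnb_4bit_quant_type="nf4",             # assumption: NF4 quantization
        bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute on a T4
    )
    lora_config = LoraConfig(
        r=16,                         # hypothetical rank
        lora_alpha=32,                # hypothetical scaling factor
        lora_dropout=0.05,            # hypothetical dropout
        target_modules="all-linear",  # hypothetical target selection
        task_type="CAUSAL_LM",
    )
    return bnb_config, lora_config
```

In a setup like this, the quantization config would be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained`, and the LoRA config to `peft.get_peft_model`.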
Training Metrics
- training loss: 192.38
- validation loss: 11.95
- entropy: 3.91
- mean token accuracy: 0.0462
- train runtime: 5783 seconds
- train rows: 2010
- validation rows: 106
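A few derived quantities help put these numbers in context. The sketch below computes the validation share, an approximate optimizer step count, and the time per step from the figures above. The per-device batch size of 1 is an assumption (the card does not state it), and the step count is a rough estimate, since trainers differ in how they round the last partial batch.

```python
import math

# Figures taken from the Training Setup and Training Metrics lists.
TRAIN_ROWS = 2010
VAL_ROWS = 106
EPOCHS = 2
GRAD_ACCUM = 16
RUNTIME_S = 5783
PER_DEVICE_BATCH = 1  # assumption: not stated in the card


def training_summary():
    """Derive rough step counts and timings from the reported figures."""
    effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
    steps_per_epoch = math.ceil(TRAIN_ROWS / effective_batch)
    total_steps = steps_per_epoch * EPOCHS
    return {
        "validation_fraction": VAL_ROWS / (TRAIN_ROWS + VAL_ROWS),
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "seconds_per_step": RUNTIME_S / total_steps,
        "runtime_minutes": RUNTIME_S / 60,
    }


if __name__ == "__main__":
    for key, value in training_summary().items():
        print(f"{key}: {value:.4g}")
```

Under that assumption, the run comes to about 252 optimizer steps at roughly 23 seconds each, with about 5% of the 2116 rows held out for validation.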
Example Use
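from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model; device_map="auto" places weights on the available device(s).
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-e4b-it",
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(
    base_model,
    "krishnamraja13/gemma-4-e4b-opus46-reasoning",
)

# Load the tokenizer from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(
    "krishnamraja13/gemma-4-e4b-opus46-reasoning"
)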
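The snippet above stops once the model and tokenizer are loaded. A minimal sketch of the generation step might look like the following. The helper names `build_messages` and `generate_reply` and the `max_new_tokens` default are hypothetical, and the sketch assumes the tokenizer ships a chat template that accepts plain-string message content.

```python
def build_messages(prompt):
    """Wrap a user prompt in the message format expected by chat templates."""
    return [{"role": "user", "content": prompt}]


def generate_reply(model, tokenizer, prompt, max_new_tokens=256):
    """Generate a reply using an already loaded model and tokenizer."""
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,  # append the assistant turn marker
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as above, a call such as `generate_reply(model, tokenizer, "Explain why binary search needs a sorted array.")` would return the decoded reply as a string.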
Requirements
Use a recent version of transformers with Gemma 4 support.
pip install -U transformers peft accelerate bitsandbytes
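To see which versions ended up installed, a small standard-library check can help. This sketch only reports versions. It does not decide whether a given transformers release supports Gemma 4, since the card does not name a minimum version.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the pip command above.
REQUIRED = ["transformers", "peft", "accelerate", "bitsandbytes"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```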
Known Strengths
This adapter performs best on:
- logic riddles
- switch / state puzzles
- recursive explanation prompts
- dynamic programming intuition
- binary search reasoning
- linked list cycle detection explanations
- proof-style educational prompts
- intermediate reasoning scaffolds and invariant-based explanations
Known Limitations
The adapter is stronger at structured reasoning, decomposition, planning, and conceptual explanation than at exact symbolic algebra.
When asked to solve an equation exactly, it may over-interpret a terse symbolic prompt instead of working through it literally.
License
This adapter is a derivative of Gemma 4 and follows the Gemma license terms.