Text Generation
Transformers
Safetensors
Italian
English
qwen3
lora
fine-tuned
banking
regtech
compliance
rag
tool-calling
italian
conversational
text-generation-inference
Instructions to use Sophia-AI/RegTech-4B-Instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Sophia-AI/RegTech-4B-Instruct with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="Sophia-AI/RegTech-4B-Instruct")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("Sophia-AI/RegTech-4B-Instruct")
    model = AutoModelForCausalLM.from_pretrained("Sophia-AI/RegTech-4B-Instruct")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Sophia-AI/RegTech-4B-Instruct with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Sophia-AI/RegTech-4B-Instruct"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Sophia-AI/RegTech-4B-Instruct",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker
docker model run hf.co/Sophia-AI/RegTech-4B-Instruct
- SGLang
How to use Sophia-AI/RegTech-4B-Instruct with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "Sophia-AI/RegTech-4B-Instruct" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Sophia-AI/RegTech-4B-Instruct",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "Sophia-AI/RegTech-4B-Instruct" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Sophia-AI/RegTech-4B-Instruct",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'

- Docker Model Runner
How to use Sophia-AI/RegTech-4B-Instruct with Docker Model Runner:
docker model run hf.co/Sophia-AI/RegTech-4B-Instruct
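The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a vLLM server is already listening on `http://localhost:8000` (use port 30000 for the SGLang commands). The helper names `build_chat_request` and `ask` are hypothetical and exist only for this illustration; the response is read using the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request

MODEL_ID = "Sophia-AI/RegTech-4B-Instruct"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions.

    base_url is an assumption: it must match wherever the server was started.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000", timeout=60):
    """Send the prompt to a running server and return the reply text."""
    request = build_chat_request(prompt, base_url=base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server):
# print(ask("What is the capital of France?"))
```

Keeping request construction separate from sending makes it possible to inspect the payload without a server running.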
{
  "model_base": "Qwen/Qwen3-4B-Instruct-2507",
  "model_name": "RegTech-4B-Instruct",
  "dataset": "./train.jsonl",
  "env_file": "/home/ubuntu/sophia-core-server/.tuning/.env.4B",
  "train_samples": 2330,
  "eval_samples": 258,
  "params": {
    "rank": 4,
    "alpha": 8,
    "dropout": 0.05,
    "lr": 1e-05,
    "scheduler": "cosine",
    "epochs": 1,
    "effective_batch": 8,
    "max_seq_length": 4096,
    "neftune_alpha": 0.0,
    "target_modules": [
      "q_proj",
      "k_proj",
      "v_proj",
      "o_proj",
      "gate_proj",
      "up_proj",
      "down_proj"
    ]
  },
  "results": {
    "total_steps": 292,
    "final_train_loss": 1.5045,
    "best_eval_loss": 1.601854681968689,
    "best_eval_step": 240,
    "best_token_accuracy": 0.6812,
    "elapsed_minutes": 8.6
  },
  "loss_history": {
    "train": [
      [10, 2.1906],
      [20, 2.0417],
      [30, 2.1217],
      [40, 2.0513],
      [50, 1.9839],
      [60, 1.9423],
      [70, 1.9321],
      [80, 1.8047],
      [90, 1.7045],
      [100, 1.8603],
      [110, 1.721],
      [120, 1.6419],
      [130, 1.5821],
      [140, 1.5593],
      [150, 1.4756],
      [160, 1.4945],
      [170, 1.5168],
      [180, 1.5689],
      [190, 1.3763],
      [200, 1.5759],
      [210, 1.477],
      [220, 1.4889],
      [230, 1.4514],
      [240, 1.441],
      [250, 1.427],
      [260, 1.4423],
      [270, 1.4199],
      [280, 1.457],
      [290, 1.5045]
    ],
    "eval": [
      [80, 2.036996841430664],
      [160, 1.6603444814682007],
      [240, 1.601854681968689]
    ],
    "token_accuracy": [
      [80, 0.661],
      [160, 0.6759],
      [240, 0.6812]
    ]
  }
}
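A few of the reported numbers can be cross-checked with simple arithmetic. The sketch below rests on three assumptions that the summary does not state: a final partial batch still counts as one optimizer step, the adapter uses the standard LoRA scaling of `alpha / rank`, and the eval loss is a mean per-token cross-entropy in nats (so its exponential is a perplexity). The variable names are illustrative only.

```python
import math

# Values copied from the training summary above.
train_samples = 2330
effective_batch = 8
epochs = 1
rank, alpha = 4, 8
best_eval_loss = 1.601854681968689

# Optimizer steps: one per effective batch, rounding up for the last partial batch.
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

# Standard LoRA multiplies the low-rank update by alpha / rank (assumed here).
lora_scaling = alpha / rank

# Perplexity implied by the best eval loss, if that loss is in nats per token.
eval_perplexity = math.exp(best_eval_loss)

print(total_steps)                # 292
print(lora_scaling)               # 2.0
print(round(eval_perplexity, 2))
```

Under these assumptions the step count comes out to 292, which agrees with the reported `total_steps`, and the best eval loss corresponds to a perplexity of about 4.96.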