Uploaded model
- Developed by: saucam
- License: apache-2.0
- Finetuned from model: google/gemma-7b
This is a fine-tuned version of gemma-7b, trained on the sarvamai/samvaad-hi-v1 Hindi dataset using the ChatML format.
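To make the ChatML layout concrete, here is a minimal sketch of how a list of messages is rendered into ChatML text. The helper `render_chatml` is hypothetical and written only for illustration; it is not part of Unsloth or Transformers, and the real tokenizer template also prepends a `<bos>` token that this sketch leaves out.

```python
from typing import Dict, List

# ChatML delimiters, as they appear in this model's decoded output.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render chat messages as ChatML text (hypothetical helper, illustration only)."""
    # Each turn is wrapped as: <|im_start|>{role}\n{content}<|im_end|>\n
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # An open assistant turn tells the model to start answering.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = render_chatml(
    [{"role": "user", "content": "(9+1)+(5+0). इसे 3 चरणों में हल करें."}]
)
print(prompt)
```

In practice `tokenizer.apply_chat_template` does this rendering, so the helper is useful mainly for checking what the model actually sees.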
Inference
We can use Unsloth for fast inference:
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load the fine-tuned model and its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "saucam/gemma-samvaad-7b",
    max_seq_length = 2048,
    dtype = None,          # None lets Unsloth auto-detect the dtype
    load_in_4bit = False,  # load without 4-bit quantization
)

# Apply the ChatML template that was used during fine-tuning.
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "chatml",
    map_eos_token = True,  # Maps <|im_end|> to the tokenizer's EOS token
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

messages = [
    {"role": "user", "content": "(9+1)+(5+0). इसे 3 चरणों में हल करें."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Must add for generation
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 512, use_cache = True)
out = tokenizer.batch_decode(outputs)
print(out)
['<bos><|im_start|>user\n(9+1)+(5+0). इसे 3 चरणों में हल करें.<|im_end|>\n<|im_start|>assistant\n(9+1)+(5+0) को 3 चरणों में हल करने के लिए, हम इसे छोटे भागों में विभाजित कर सकते हैं। पहले चरण में, हम 9 को 1 से जोड़ते हैं, जो 10 देता है। दूसरे चरण में, हम 5 को 0 से जोड़ते हैं, जो 5 देता है। तीसरे चरण में, हम 10 को 5 से जोड़ते हैं, जो 15 देता है। इसलिए, (9+1)+(5+0) का परिणाम 15 है।<|im_end|>']
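The decoded output contains the whole conversation, including the prompt and the ChatML markers. The sketch below shows one way to pull out only the assistant's reply. The helper `extract_assistant_reply` is a hypothetical name introduced here, and the `sample` string is made up for the demonstration rather than taken from a real model run.

```python
def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a ChatML string.

    Hypothetical helper for illustration; returns "" if no assistant turn is found.
    """
    marker = "<|im_start|>assistant\n"
    # Take everything after the last assistant marker.
    _, found, tail = decoded.rpartition(marker)
    if not found:
        return ""
    # Drop the end-of-turn marker and anything after it.
    reply, _, _ = tail.partition("<|im_end|>")
    return reply.strip()


# Made-up sample in the same layout as the decoded output above.
sample = (
    "<bos><|im_start|>user\n2+2?<|im_end|>\n"
    "<|im_start|>assistant\n2+2 का परिणाम 4 है।<|im_end|>"
)
print(extract_assistant_reply(sample))
```

With the inference code above, this would be called as `extract_assistant_reply(out[0])`, since `batch_decode` returns one string per generated sequence.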
This Gemma model was trained 2x faster with Unsloth and Hugging Face's TRL library.