# Janmitra 4B Upscaled
Janmitra (जनमित्र — "Friend of the People") is a 4.49B-parameter causal language model specialising in Indian government welfare schemes, targeted at citizens of Andhra Pradesh, Telangana, and rural India.
This model was created by depth-upscaling santosh1101/janmitra-gemma-2b from 2.51B parameters (18 layers) to 4.49B using the SOLAR layer-duplication technique, then stabilised via LoRA fine-tuning on 50,000 curated instruction samples.
## Model Details
| Field | Value |
|---|---|
| Developed by | Santosh Guru / Digiedze-Agni |
| Model type | Causal Language Model (decoder-only) |
| Base architecture | Gemma 2B |
| Parameters | 4.49B |
| Layers | 36 (upscaled from 18) |
| Language(s) | English, Telugu, Hindi |
| License | Gemma Community License |
| Finetuned from | santosh1101/janmitra-gemma-2b |
| Upscaling method | SOLAR-style depth upscaling |
## How It Was Built — Gemma 2B → 4.49B

### Step 1: SOLAR Depth Upscaling
The original janmitra-gemma-2b model has 18 transformer layers and 2.51B parameters. We expanded it to 36 layers / 4.49B parameters using the SOLAR technique (from the paper SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling).
Layer structure after upscaling:
```
Original: [L0, L1, ..., L17]                    → 18 layers
Upscaled: [L0–L3]    ← first quarter (cloned)
        + [L0–L17]   ← full original (cloned)
        + [L4–L13]   ← middle 10 layers (cloned again)
        + [L14–L17]  ← last quarter (cloned)
        = 36 layers total
```
Each duplicated layer gets fully independent weights via `.clone()` — this is critical so that `save_pretrained` does not error on shared tensor references, and so that fine-tuning can diverge each copy independently.
Why this works:
- Duplicated layers start with identical weights → no catastrophic disruption at init
- During fine-tuning the duplicated layers diverge and specialise
- More depth = more representational capacity without training from scratch
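The duplication pattern above can be sketched in plain PyTorch. This is a toy stand-in using `nn.Linear` layers in place of the 18 Gemma decoder blocks; `depth_upscale` is an illustrative helper, and `copy.deepcopy` stands in for the per-tensor `.clone()` described above (both yield fully independent weights):

```python
import copy

import torch
import torch.nn as nn

def depth_upscale(layers: nn.ModuleList, n: int) -> nn.ModuleList:
    """Duplicate layers following the SOLAR-style pattern in this card.

    Each entry is deep-copied so every duplicate owns independent weights.
    """
    pattern = (
        list(range(0, n // 4))             # first quarter, cloned
        + list(range(0, n))                # full original, cloned
        + list(range(n // 4, n - n // 4))  # middle layers, cloned again
        + list(range(n - n // 4, n))       # last quarter, cloned
    )
    return nn.ModuleList(copy.deepcopy(layers[i]) for i in pattern)

# Toy stand-in for the 18 Gemma decoder layers
original = nn.ModuleList(nn.Linear(8, 8) for _ in range(18))
upscaled = depth_upscale(original, n=18)  # 36 independent layers
```

Applied to the real checkpoint, the same pattern would run over `model.model.layers`, after which `config.num_hidden_layers` must be bumped to 36 before saving.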
### Step 2: LoRA Fine-Tuning (on RunPod A40)
After upscaling, the model is stabilised via supervised fine-tuning using Unsloth + LoRA:
| Hyperparameter | Value |
|---|---|
| LoRA rank (r) | 32 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Max sequence length | 2048 |
| Batch size | 2 (grad accum ×4 = effective 8) |
| Learning rate | 1e-4 |
| Steps | 300 |
| Precision | bfloat16 |
| Hardware | NVIDIA A40 (48GB VRAM) |
| Quantisation | 4-bit (QLoRA via Unsloth) |
## Training Data
50,000 synthetically generated instruction–response pairs covering 10 citizen personas across Indian states:
| Persona | Key Schemes Covered |
|---|---|
| Farmer / Kisan | PM-KISAN, Rythu Bharosa, Rythu Bandhu, PMFBY, Kisan Credit Card |
| Pregnant Woman | PMMVY, Janani Suraksha Yojana, YSR Amma Vodi |
| Student | Post Matric Scholarship, NSP, NMMS, Pragati (AICTE) |
| Unemployed Youth | PMKVY, DDU-GKY, Rajiv Yuva Kiranalu, NCS Portal |
| Small Business Owner | MUDRA Yojana, Startup India, PM Vishwakarma, CGTMSE |
| Senior Citizen | IGNOAPS, Atal Pension Yojana, YSR Pension Kanuka, Aasara Pension |
| Disabled Person | UDID Card, ADIP Scheme, NHFDC Loan, YSR Pension Kanuka |
| Weaver | Chenetha Mithra (AP), National Handloom Development Programme |
| Fisherman | Matsyakara Bharosa (AP), PMMSY, Kisan Credit Card for Fishermen |
| Woman Entrepreneur | Stand-Up India, MUDRA (Women), WE Hub Telangana, YSR EBC Nestham |
**Geographic coverage:** Andhra Pradesh, Telangana, and pan-India

**Data format:** Gemma chat template (`<start_of_turn>user` / `<start_of_turn>model`)
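Under that template, one training pair renders as below (a minimal sketch; the helper name and the sample response are illustrative):

```python
def to_gemma_chat(instruction: str, response: str) -> str:
    """Render one instruction-response pair in the Gemma chat template."""
    return (
        f"<start_of_turn>user\n{instruction}<end_of_turn>\n"
        f"<start_of_turn>model\n{response}<end_of_turn>\n"
    )

sample = to_gemma_chat(
    "I am a farmer in Telangana, what schemes are available for me?",
    "You may be eligible for Rythu Bandhu, PM-KISAN, and PMFBY crop insurance.",
)
```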
## Intended Uses

### Direct Use
- Citizen-facing chatbots for government scheme discovery
- Kiosk / Common Service Centre (CSC) assistants
- Voice bots for rural welfare delivery
- NGO and field-worker tools for scheme eligibility checks
### Downstream Use
- Fine-tune further on state-specific scheme databases
- Integrate with RAG pipelines over official government portals
- Multilingual extension (Telugu, Hindi, Kannada)
### Out-of-Scope Use
- General-purpose reasoning or coding tasks
- Medical, legal, or financial advice beyond scheme descriptions
- Any use outside Indian government welfare domain without further fine-tuning
## How to Get Started
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "Digiedze-agni/janmitra-4b-upscaled",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Digiedze-agni/janmitra-4b-upscaled")

prompt = (
    "<start_of_turn>user\n"
    "I am a farmer in Telangana, what schemes are available for me?<end_of_turn>\n"
    "<start_of_turn>model\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is needed for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Via Ollama (after GGUF export):

```shell
ollama run janmitra-4b "I am a farmer in Warangal, what government schemes help me?"
```
## Bias, Risks, and Limitations
- **Scheme accuracy:** Training data is synthetic and may not reflect the latest scheme updates, eligibility thresholds, or state budget revisions. Always verify with official portals.
- **Geographic bias:** Strongest coverage for AP and Telangana; other states have limited representation.
- **Language:** Primarily English. Telugu/Hindi responses may be inconsistent without further multilingual fine-tuning.
- **Hallucination:** Like all LLMs, this model can confidently state incorrect scheme details. Do not use as the sole source for benefit eligibility decisions.
- **Depth upscaling artifacts:** The SOLAR-expanded layers are initialised from duplicated weights. The 300-step fine-tune stabilises these but does not fully remove initialisation bias in the new layers.
## Technical Specifications

### Model Architecture
- Base: Gemma 2B (decoder-only transformer)
- Hidden size: 2048
- Attention heads: 8
- KV heads: 1 (GQA)
- Layers: 36 (SOLAR-expanded from 18)
- Vocab size: 256,000
- Context length: 8192 (trained at 2048)
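The dimensions above let us sanity-check the parameter counts. Assuming Gemma-2B's MLP intermediate size of 16384, head dimension of 256, and input embeddings tied with the LM head (none of which are stated above), a back-of-the-envelope count reproduces both the 2.51B and 4.49B figures:

```python
# Rough parameter count from the architecture numbers above.
# Assumptions (not in the card): MLP intermediate size 16384,
# head_dim 256, embeddings tied with the LM head.
hidden, inter, vocab = 2048, 16_384, 256_000
head_dim, kv_heads = 256, 1

attn = 2 * hidden * hidden + 2 * hidden * head_dim * kv_heads  # q/o + k/v (GQA)
mlp = 3 * hidden * inter                                       # gate, up, down
per_layer = attn + mlp

embed = vocab * hidden            # shared with the LM head
params_18 = 18 * per_layer + embed
params_36 = 36 * per_layer + embed
print(f"18 layers: {params_18/1e9:.2f}B, 36 layers: {params_36/1e9:.2f}B")
```

Small layer-norm terms are ignored, yet the totals land on the card's 2.51B (original) and 4.49B (upscaled) figures, which confirms that doubling the depth roughly doubles the non-embedding parameters.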
### Compute
| Stage | Hardware | Duration |
|---|---|---|
| Depth upscaling | Apple M-series Mac (CPU) | ~15 min (including HF download) |
| LoRA fine-tuning | NVIDIA A40 48GB (RunPod) | ~30 min (300 steps) |
## Citation

```bibtex
@misc{janmitra4b2025,
  title     = {Janmitra 4B: SOLAR Depth-Upscaled Gemma for Indian Government Scheme Discovery},
  author    = {Santosh Guru and Digiedze-Agni},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Digiedze-agni/janmitra-4b-upscaled}
}
```
## Model Card Authors
Santosh Guru — Digiedze-Agni
## Model Card Contact