# Janmitra 4B Upscaled

Janmitra (जनमित्र — "Friend of the People") is a 4.49B-parameter causal language model specialising in Indian government welfare schemes, targeted at citizens of Andhra Pradesh, Telangana, and rural India.

This model was created by depth-upscaling `santosh1101/janmitra-gemma-2b` (2.51B, 18 layers) → 4.49B using the SOLAR layer-duplication technique, then stabilised via LoRA fine-tuning on 50,000 curated instruction samples.


## Model Details

| Field | Value |
|---|---|
| Developed by | Santosh Guru / Digiedze-Agni |
| Model type | Causal Language Model (decoder-only) |
| Base architecture | Gemma 2B |
| Parameters | 4.49B |
| Layers | 36 (upscaled from 18) |
| Language(s) | English, Telugu, Hindi |
| License | Gemma Community License |
| Finetuned from | `santosh1101/janmitra-gemma-2b` |
| Upscaling method | SOLAR-style depth upscaling |

## How It Was Built — Gemma 2B → 4.49B

### Step 1: SOLAR Depth Upscaling

The original `janmitra-gemma-2b` model has 18 transformer layers and 2.51B parameters. We expanded it to 36 layers / 4.49B parameters using the SOLAR technique (from the paper *SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling*).

Layer structure after upscaling:

```text
Original: [L0, L1, ..., L17]  → 18 layers

Upscaled: [L0–L3]    ← first quarter    (cloned)
        + [L0–L17]   ← full original    (cloned)
        + [L4–L13]   ← middle 10 layers (cloned again)
        + [L14–L17]  ← last quarter     (cloned)
        = 36 layers total
```

Each duplicated layer receives fully independent weights via `.clone()`. This is critical both so that `save_pretrained` does not error on shared tensor references and so that fine-tuning can diverge each copy independently.

Why this works:

- Duplicated layers start with identical weights, so there is no catastrophic disruption at initialisation
- During fine-tuning the duplicated layers diverge and specialise
- More depth means more representational capacity without training from scratch
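The upscaling step can be sketched as follows. This is a minimal illustration, not the exact script used for this model: it applies the layer pattern from the diagram above to a generic `nn.ModuleList` of decoder layers, using `copy.deepcopy` so that every duplicated layer gets fully independent weight tensors.

```python
import copy
import torch.nn as nn

# Layer-duplication pattern from this card's diagram:
# first quarter + full original stack + middle ten layers + last quarter.
UPSCALE_PATTERN = (
    list(range(0, 4))       # first quarter
    + list(range(0, 18))    # full original stack
    + list(range(4, 14))    # middle 10 layers
    + list(range(14, 18))   # last quarter
)                           # 4 + 18 + 10 + 4 = 36 layers

def upscale_layers(layers: nn.ModuleList, pattern=None) -> nn.ModuleList:
    """Return a new ModuleList with layers duplicated per `pattern`.

    deepcopy gives each copy independent storage, so save_pretrained
    never sees shared tensor references between duplicated layers.
    """
    pattern = UPSCALE_PATTERN if pattern is None else pattern
    return nn.ModuleList(copy.deepcopy(layers[i]) for i in pattern)
```

On a real Gemma checkpoint, the result would replace `model.model.layers` and `model.config.num_hidden_layers` would be set to 36 before saving.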

### Step 2: LoRA Fine-Tuning (on RunPod A40)

After upscaling, the model is stabilised via supervised fine-tuning using Unsloth + LoRA:

| Hyperparameter | Value |
|---|---|
| LoRA rank (r) | 32 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Max sequence length | 2048 |
| Batch size | 2 (grad accum ×4 = effective 8) |
| Learning rate | 1e-4 |
| Steps | 300 |
| Precision | bfloat16 |
| Hardware | NVIDIA A40 (48 GB VRAM) |
| Quantisation | 4-bit (QLoRA via Unsloth) |
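The hyperparameters above map onto a standard PEFT setup. A minimal sketch using the plain `peft`/`transformers` APIs (the actual run used the Unsloth wrapper; `lora_dropout` and `output_dir` are assumptions not stated in this card):

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter configuration matching the table above.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,   # assumption: not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
)

# Effective batch size: 2 per device x 4 accumulation steps = 8.
training_args = TrainingArguments(
    output_dir="janmitra-4b-lora",   # hypothetical path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    max_steps=300,
    bf16=True,
)
```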

## Training Data

50,000 synthetically generated instruction–response pairs covering 10 citizen personas across Indian states:

| Persona | Key Schemes Covered |
|---|---|
| Farmer / Kisan | PM-KISAN, Rythu Bharosa, Rythu Bandhu, PMFBY, Kisan Credit Card |
| Pregnant Woman | PMMVY, Janani Suraksha Yojana, YSR Amma Vodi |
| Student | Post Matric Scholarship, NSP, NMMS, Pragati (AICTE) |
| Unemployed Youth | PMKVY, DDU-GKY, Rajiv Yuva Kiranalu, NCS Portal |
| Small Business Owner | MUDRA Yojana, Startup India, PM Vishwakarma, CGTMSE |
| Senior Citizen | IGNOAPS, Atal Pension Yojana, YSR Pension Kanuka, Aasara Pension |
| Disabled Person | UDID Card, ADIP Scheme, NHFDC Loan, YSR Pension Kanuka |
| Weaver | Chenetha Mithra (AP), National Handloom Development Programme |
| Fisherman | Matsyakara Bharosa (AP), PMMSY, Kisan Credit Card for Fishermen |
| Woman Entrepreneur | Stand-Up India, MUDRA (Women), WE Hub Telangana, YSR EBC Nestham |

- Geographic coverage: Andhra Pradesh, Telangana, and pan-India
- Data format: Gemma chat template (`<start_of_turn>user` / `<start_of_turn>model`)
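A training sample rendered in the Gemma chat template looks like this. The helper and the example pair below are illustrative, not drawn from the actual 50k dataset:

```python
def to_gemma_chat(instruction: str, response: str) -> str:
    """Render one instruction-response pair in the Gemma chat template."""
    return (
        f"<start_of_turn>user\n{instruction}<end_of_turn>\n"
        f"<start_of_turn>model\n{response}<end_of_turn>\n"
    )

# Hypothetical sample in the farmer persona.
sample = to_gemma_chat(
    "I am a farmer in Telangana. Am I eligible for Rythu Bandhu?",
    "Rythu Bandhu provides investment support to landowning farmers in Telangana.",
)
print(sample)
```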


## Intended Uses

### Direct Use

- Citizen-facing chatbots for government scheme discovery
- Kiosk / Common Service Centre (CSC) assistants
- Voice bots for rural welfare delivery
- NGO and field-worker tools for scheme eligibility checks

### Downstream Use

- Further fine-tuning on state-specific scheme databases
- Integration with RAG pipelines over official government portals
- Multilingual extension (Telugu, Hindi, Kannada)

### Out-of-Scope Use

- General-purpose reasoning or coding tasks
- Medical, legal, or financial advice beyond scheme descriptions
- Any use outside the Indian government welfare domain without further fine-tuning

## How to Get Started

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "Digiedze-agni/janmitra-4b-upscaled",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Digiedze-agni/janmitra-4b-upscaled")

prompt = "<start_of_turn>user\nI am a farmer in Telangana, what schemes are available for me?<end_of_turn>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Via Ollama (after GGUF export):

```bash
ollama run janmitra-4b "I am a farmer in Warangal, what government schemes help me?"
```
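The GGUF export itself is not described in this card. One possible path, assuming the llama.cpp conversion script (the script name varies between llama.cpp versions) and a local copy of the checkpoint:

```shell
# Convert the HF checkpoint to GGUF with llama.cpp, then register it
# with Ollama via a minimal Modelfile.
python convert_hf_to_gguf.py ./janmitra-4b-upscaled --outfile janmitra-4b.gguf
printf 'FROM ./janmitra-4b.gguf\n' > Modelfile
ollama create janmitra-4b -f Modelfile
```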

## Bias, Risks, and Limitations

- **Scheme accuracy:** Training data is synthetic and may not reflect the latest scheme updates, eligibility thresholds, or state budget revisions. Always verify with official portals.
- **Geographic bias:** Strongest coverage for AP and Telangana; other states have limited representation.
- **Language:** Primarily English. Telugu/Hindi responses may be inconsistent without further multilingual fine-tuning.
- **Hallucination:** Like all LLMs, this model can confidently state incorrect scheme details. Do not use it as the sole source for benefit eligibility decisions.
- **Depth-upscaling artifacts:** The SOLAR-expanded layers are initialised from duplicated weights. The 300-step fine-tune stabilises these but does not fully remove initialisation bias in the new layers.

## Technical Specifications

### Model Architecture

- Base: Gemma 2B (decoder-only transformer)
- Hidden size: 2048
- Attention heads: 8
- KV heads: 1 (multi-query attention, the single-group case of GQA)
- Layers: 36 (SOLAR-expanded from 18)
- Vocab size: 256,000
- Context length: 8192 (trained at 2048)
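A back-of-the-envelope check shows how doubling the layer count lands near the stated 4.49B parameters, assuming tied input/output embeddings and a per-layer cost inferred from the 18-layer, 2.51B-parameter base checkpoint:

```python
# Rough parameter accounting for the depth-upscaled model.
vocab, hidden = 256_000, 2048
embed = vocab * hidden              # ~0.52B shared embedding table
per_layer = (2.51e9 - embed) / 18   # ~110M parameters per decoder layer
upscaled = embed + 36 * per_layer   # ~4.50B, consistent with the card's 4.49B
print(f"{upscaled / 1e9:.2f}B")
```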

### Compute

| Stage | Hardware | Duration |
|---|---|---|
| Depth upscaling | Apple M-series Mac (CPU) | ~15 min (including HF download) |
| LoRA fine-tuning | NVIDIA A40 48 GB (RunPod) | ~30 min (300 steps) |

## Citation

```bibtex
@misc{janmitra4b2025,
  title        = {Janmitra 4B: SOLAR Depth-Upscaled Gemma for Indian Government Scheme Discovery},
  author       = {Santosh Guru and Digiedze-Agni},
  year         = {2025},
  publisher    = {HuggingFace},
  url          = {https://huggingface.co/Digiedze-agni/janmitra-4b-upscaled}
}
```

## Model Card Authors

Santosh Guru — Digiedze-Agni

## Model Card Contact

d.vns.gurusantosh11@gmail.com
