FarmBot: SmolLM2-135M (INT4, LoRA Finetuned)
A LoRA finetune of SmolLM2-135M-Instruct for crop disease assistance, covering 9 crops common in West/Central Africa. Quantized to INT4 (~111MB) for low-resource deployment. Built for the USAII Global AI Hackathon 2026.
Crops Covered
Cassava, Cocoa, Cowpea, Maize, Groundnut, Mango, Plantain, Rice, Tomato
Benchmark Results (15-question held-out test, keyword-match scoring)
| Bucket | Score |
|---|---|
| Overall | 7/15 (46.7%) |
| Crop knowledge | 6/9 (67%) |
| Greetings | 1/2 (50%) |
| Out-of-scope | 0/4 (0%) |
See benchmark_results.json for full per-question results.
This is not a polished production model. Crop-knowledge answers are generally accurate and on-topic: the model correctly identifies fall armyworm, cassava mosaic, black pod disease, bunchy top, rice blast, and cowpea aphids, and gives reasonable treatment advice. Out-of-scope detection is weak in raw model output: the model often starts the correct decline phrase but then drifts into unrelated crop advice instead of stopping. Greeting handling is inconsistent.
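To make the keyword-match scoring named in the benchmark heading concrete, here is a minimal sketch of how such a scorer could work. It is not the project's actual evaluation script: the case format (`bucket`, `keywords`), the `min_hits` threshold, and the sample cases are assumptions for illustration, and the real schema of benchmark_results.json may differ.

```python
from typing import Iterable


def keyword_match(response: str, keywords: Iterable[str], min_hits: int = 1) -> bool:
    """Return True if at least `min_hits` keywords occur in the response (case-insensitive)."""
    text = response.lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits >= min_hits


def score(cases: list[dict], responses: list[str]) -> dict[str, tuple[int, int]]:
    """Aggregate (passed, total) per bucket, plus an 'overall' entry."""
    totals: dict[str, list[int]] = {}
    for case, response in zip(cases, responses):
        passed = keyword_match(response, case["keywords"])
        for bucket in (case["bucket"], "overall"):
            entry = totals.setdefault(bucket, [0, 0])
            entry[0] += int(passed)
            entry[1] += 1
    return {bucket: (p, t) for bucket, (p, t) in totals.items()}


# Hypothetical test cases and model outputs, for illustration only.
cases = [
    {"bucket": "crop_knowledge", "keywords": ["armyworm"]},
    {"bucket": "out_of_scope", "keywords": ["only help with crop"]},
]
responses = [
    "This looks like Fall Armyworm damage.",
    "The match score was 2-1.",
]
result = score(cases, responses)
```

Substring matching of this kind is cheap but coarse: it cannot tell a correct decline from an answer that starts correctly and then drifts, which is worth keeping in mind when reading the table.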
Known Limitations
- Out-of-scope detection fails most of the time in raw model output (0/4 on the benchmark). A regex pre-filter at the application layer is required before deployment to reliably handle off-topic questions (sports, prices, politics, human or animal health, etc.)
- Responses can ramble past the useful answer and drift off-topic toward the end
- At 135M parameters, the model has limited reasoning ability. It was trained narrowly on 9 crops and will not generalize to crops or diseases outside its training data
- Should be treated as an early-stage assistive tool, not a substitute for an agricultural extension officer
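The application-layer regex pre-filter mentioned in the first limitation could look roughly like the sketch below. The patterns and the decline message are hypothetical placeholders, not the filter or phrase used by this project. A keyword list like this will also over-block some legitimate questions (for example, one that mentions the cost of a treatment), so it should be tuned against real user queries.

```python
import re
from typing import Optional

# Hypothetical off-topic patterns covering sports, prices, politics,
# and human/animal health. Not the project's actual filter.
OUT_OF_SCOPE = re.compile(
    r"\b(football|soccer|election|president|politic\w*|"
    r"price|cost|fever|malaria|headache|"
    r"goats?|cows?|cattle|chickens?|poultry|dogs?)\b",
    re.IGNORECASE,
)

# Placeholder decline message (assumed wording).
DECLINE = "Sorry, I can only help with crop disease questions for the supported crops."


def prefilter(question: str) -> Optional[str]:
    """Return a canned decline if the question looks off-topic, else None."""
    if OUT_OF_SCOPE.search(question):
        return DECLINE
    return None


# Only call the model when the pre-filter lets the question through.
for q in ["who won the football match yesterday?", "my maize leaves have holes"]:
    reply = prefilter(q)
    print(q, "->", reply if reply is not None else "[send to model]")
```

Running the filter before generation means the model never sees the off-topic question, which sidesteps its tendency to begin a decline and then drift.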
Training
- Base: HuggingFaceTB/SmolLM2-135M-Instruct
- LoRA: r=16, alpha=32, dropout=0.05, targeting q/k/v/o projections (~1.84M trainable params)
- Data: ~95,000 quality-filtered examples (deduplicated, length-bounded, repetition-checked), sampled from a larger 365k+ synthetic Q&A dataset generated via Mistral API
- Hardware: Kaggle 2x T4
- Quantization: 4-bit NF4 with double quantization
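As a sanity check on the "~1.84M trainable params" figure, the count can be reproduced by hand. The sketch below assumes the architecture values from the published SmolLM2-135M config (hidden size 576, 30 layers, 9 attention heads, 3 key/value heads); these numbers are not stated in this card. Each LoRA adapter on a linear layer of shape (d_in, d_out) adds r * (d_in + d_out) parameters.

```python
# Assumed SmolLM2-135M architecture values (from its published config, not this card).
HIDDEN = 576
LAYERS = 30
HEADS = 9
KV_HEADS = 3
HEAD_DIM = HIDDEN // HEADS  # 64

RANK = 16  # LoRA r, as listed above


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# (input dim, output dim) of each targeted projection under grouped-query attention.
shapes = {
    "q_proj": (HIDDEN, HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "o_proj": (HEADS * HEAD_DIM, HIDDEN),
}

per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in shapes.values())
total = per_layer * LAYERS
print(f"{per_layer} per layer, {total} total")
```

Under these assumptions the total comes to 1,843,200, which agrees with the figure quoted in the LoRA line.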
Recommended Inference Settings
temperature=0.3, top_k=20, repetition_penalty=1.2
max_new_tokens=120, min_new_tokens=15
eos_token_id=tokenizer.convert_tokens_to_ids('<|im_end|>')
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

# device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained("rufatronics/farmbot-crop-assistant")
model = AutoModelForCausalLM.from_pretrained("rufatronics/farmbot-crop-assistant", device_map="auto")

# ChatML-style prompt: one user turn followed by an open assistant turn.
prompt = "<|im_start|>user\nmy maize leaves have holes<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs, max_new_tokens=120, min_new_tokens=15,
    temperature=0.3, top_k=20, do_sample=True, repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.convert_tokens_to_ids('<|im_end|>'),
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True))
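Two small helpers can make the usage snippet above easier to reuse. This is a sketch, not part of the released code: `build_prompt` wraps a user message in the same single-turn format as the `prompt` string above, and `trim_to_sentences` is a hypothetical post-processing step that drops a trailing unfinished sentence and caps the reply length, which may help with the rambling noted under Known Limitations. The default of three sentences is an arbitrary assumption.

```python
import re


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn chat format used in the usage example."""
    return (
        f"<|im_start|>user\n{user_message.strip()}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def trim_to_sentences(text: str, max_sentences: int = 3) -> str:
    """Keep at most `max_sentences` complete sentences; drop any unfinished tail."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    complete = [s for s in sentences if s and s[-1] in ".!?"]
    return " ".join(complete[:max_sentences])


prompt = build_prompt("  my maize leaves have holes ")
reply = trim_to_sentences(
    "Spray neem oil. Remove infected leaves. Also plant early. Then the", 2
)
print(repr(prompt))
print(reply)
```

Trimming after generation only hides the drift; it does not make the trailing content any more accurate, so a short sentence cap is the safer choice.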