FarmBot — SmolLM2-135M (INT4, LoRA Finetuned)

A LoRA finetune of SmolLM2-135M-Instruct for crop disease assistance, covering 9 crops common in West and Central Africa. Quantized to INT4 (~111 MB) for low-resource deployment. Built for the USAII Global AI Hackathon 2026.

Crops Covered

Cassava, Cocoa, Cowpea, Maize, Groundnut, Mango, Plantain, Rice, Tomato

Benchmark Results (15-question held-out test, keyword-match scoring)

Bucket	Score
Overall	7/15 (46.7%)
Crop knowledge	6/9 (67%)
Greetings	1/2 (50%)
Out-of-scope	0/4 (0%)

See benchmark_results.json for full per-question results.
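
The card does not spell out the scoring rule, so the following is only a minimal sketch of how keyword-match scoring per bucket could work. It assumes an answer passes if it contains at least one expected keyword, case-insensitively. The record layout, the field names (`bucket`, `answer`, `keywords`), and the sample records are hypothetical and are not taken from benchmark_results.json.

```python
from collections import defaultdict


def keyword_match(answer: str, keywords: list[str]) -> bool:
    """Pass if any expected keyword appears in the answer (case-insensitive)."""
    text = answer.lower()
    return any(kw.lower() in text for kw in keywords)


def score_by_bucket(results):
    """Return {bucket: (passed, total)} for an iterable of result records."""
    tally = defaultdict(lambda: [0, 0])  # bucket -> [passed, total]
    for record in results:
        counts = tally[record["bucket"]]
        counts[1] += 1
        if keyword_match(record["answer"], record["keywords"]):
            counts[0] += 1
    return {bucket: tuple(counts) for bucket, counts in tally.items()}


# Hypothetical records for illustration only.
sample = [
    {"bucket": "crop", "answer": "This looks like fall armyworm damage.",
     "keywords": ["armyworm"]},
    {"bucket": "out_of_scope", "answer": "Plant maize early in the season.",
     "keywords": ["only help with crop"]},
]
scores = score_by_bucket(sample)
print(scores)
```

Substring matching of this kind is cheap but coarse: it rewards an answer that mentions the right term even when the surrounding advice is wrong, which is worth keeping in mind when reading the scores above.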

This is not a polished production model. Crop-knowledge answers are generally accurate and on-topic: the model correctly identifies fall armyworm, cassava mosaic, black pod disease, bunchy top, rice blast, and cowpea aphids, and gives reasonable treatment advice. Out-of-scope detection is weak in raw model output: the model often starts the correct decline phrase but then drifts into unrelated crop advice instead of stopping. Greeting handling is inconsistent.

Known Limitations

Out-of-scope detection fails most of the time in raw model output (0/4 on the benchmark above). A regex pre-filter at the application layer is required before deployment to reliably handle off-topic questions (sports, prices, politics, human/animal health, etc.)
Responses can ramble past the useful answer and drift off-topic toward the end
135M parameters: limited reasoning and narrow training on 9 crops only; it will not generalize to crops or diseases outside its training data
Should be treated as an early-stage assistive tool, not a substitute for an agricultural extension officer
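
The regex pre-filter mentioned in the first limitation could look roughly like the sketch below. The pattern, the decline message, and the `prefilter` helper are all hypothetical. The card does not publish the actual filter, so treat this as a starting point that needs tuning against real user questions.

```python
import re
from typing import Optional

# Hypothetical off-topic terms covering sports, politics, prices, and health.
OUT_OF_SCOPE = re.compile(
    r"\b(football|soccer|election\w*|politic\w*|price\w*|"
    r"malaria|headache|vaccin\w*)\b",
    re.IGNORECASE,
)

# Hypothetical decline message; use whatever phrase the application prefers.
DECLINE = "Sorry, I can only help with crop disease questions."


def prefilter(question: str) -> Optional[str]:
    """Return a canned decline for off-topic questions, else None.

    None means the question should be passed on to the model.
    """
    if OUT_OF_SCOPE.search(question):
        return DECLINE
    return None


print(prefilter("Who won the football match?"))
print(prefilter("my maize leaves have holes"))
```

A keyword filter like this trades recall for simplicity: it will miss off-topic questions that use none of the listed words, and it can wrongly block a legitimate crop question that happens to mention one of them, so the term list should stay conservative.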

Training

Base: HuggingFaceTB/SmolLM2-135M-Instruct
LoRA: r=16, alpha=32, dropout=0.05, targeting q/k/v/o projections (~1.84M trainable params)
Data: ~95,000 quality-filtered examples (deduplicated, length-bounded, repetition-checked), sampled from a larger synthetic Q&A dataset of 365k+ examples generated via the Mistral API
Hardware: Kaggle 2x T4
Quantization: INT4 (NF4) with double quantization
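
As a sanity check on the ~1.84M trainable-parameter figure, the arithmetic can be reproduced by hand. A LoRA adapter on a linear layer adds rank × (in_features + out_features) parameters. The model dimensions below are assumptions based on the published SmolLM2-135M configuration (hidden size 576, 30 layers, 9 attention heads, 3 key/value heads) and should be checked against the model's config.json.

```python
# Assumed SmolLM2-135M dimensions; verify against the model's config.json.
HIDDEN = 576
LAYERS = 30
NUM_HEADS = 9
NUM_KV_HEADS = 3
HEAD_DIM = HIDDEN // NUM_HEADS      # 64
KV_DIM = NUM_KV_HEADS * HEAD_DIM    # 192, since k/v use grouped-query attention
RANK = 16                           # LoRA r from the training settings above


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """Parameters added by one adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = (
    lora_params(HIDDEN, HIDDEN)     # q_proj
    + lora_params(HIDDEN, KV_DIM)   # k_proj
    + lora_params(HIDDEN, KV_DIM)   # v_proj
    + lora_params(HIDDEN, HIDDEN)   # o_proj
)
total = per_layer * LAYERS
print(f"{total:,}")  # 1,843,200
```

Under these assumptions the total is 1,843,200, which agrees with the ~1.84M reported for the q/k/v/o targets.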

Recommended Inference Settings

temperature=0.3, top_k=20, repetition_penalty=1.2
max_new_tokens=120, min_new_tokens=15
eos_token_id=tokenizer.convert_tokens_to_ids('<|im_end|>')

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("rufatronics/farmbot-crop-assistant")
model = AutoModelForCausalLM.from_pretrained("rufatronics/farmbot-crop-assistant", device_map="auto")

# Build a single-turn prompt in the ChatML format, ending at the assistant turn
prompt = "<|im_start|>user\nmy maize leaves have holes<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate with the recommended settings; stop at the end-of-turn token
out = model.generate(
    **inputs, max_new_tokens=120, min_new_tokens=15,
    temperature=0.3, top_k=20, do_sample=True, repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.convert_tokens_to_ids('<|im_end|>'),
)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True))
