FarmBot: SmolLM2-135M (INT4, LoRA Finetuned)

A LoRA finetune of SmolLM2-135M-Instruct for crop disease assistance, covering 9 crops common in West and Central Africa. Quantized to INT4 (~111 MB) for low-resource deployment. Built for the USAII Global AI Hackathon 2026.

Crops Covered

Cassava, Cocoa, Cowpea, Maize, Groundnut, Mango, Plantain, Rice, Tomato

Benchmark Results (15-question held-out test, keyword-match scoring)

Bucket            Score
Overall           7/15 (46.7%)
Crop knowledge    6/9 (67%)
Greetings         1/2 (50%)
Out-of-scope      0/4 (0%)

See benchmark_results.json for full per-question results.
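
For readers unfamiliar with keyword-match scoring, the following is a minimal sketch of how such a check could work. The exact rule used for this benchmark is not documented here, so the "any keyword, case-insensitive" rule, the function names, and the sample answers are all assumptions made for illustration.

```python
from typing import Iterable


def keyword_match(answer: str, keywords: Iterable[str]) -> bool:
    """Return True if any expected keyword appears in the answer (case-insensitive)."""
    text = answer.lower()
    return any(kw.lower() in text for kw in keywords)


def score(results: list[tuple[str, list[str]]]) -> tuple[int, int, float]:
    """Count passes over (answer, keywords) pairs; returns (passed, total, percent)."""
    passed = sum(keyword_match(ans, kws) for ans, kws in results)
    total = len(results)
    return passed, total, round(100 * passed / total, 1)


# Hypothetical examples, NOT taken from benchmark_results.json.
examples = [
    ("This looks like fall armyworm damage. Handpick larvae early.", ["armyworm"]),
    ("Water your plants regularly.", ["mosaic", "whitefly"]),
]
print(score(examples))  # (1, 2, 50.0)
```

Keyword matching is cheap but coarse: an answer that mentions the right keyword and then drifts off-topic still passes, so the scores above are best read as an upper bound on answer quality.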

This is not a polished production model. Crop-knowledge answers are generally accurate and on-topic: the model correctly identifies fall armyworm, cassava mosaic, black pod disease, bunchy top, rice blast, and cowpea aphids, and gives reasonable treatment advice. Out-of-scope detection is weak in raw model output; the model often starts the correct decline phrase but then drifts into unrelated crop advice instead of stopping. Greetings handling is inconsistent.

Known Limitations

  • Out-of-scope detection fails most of the time in raw model output, so a regex pre-filter at the application layer is required before deployment to reliably handle off-topic questions (sports, prices, politics, human/animal health, etc.)
  • Responses can ramble past the useful answer and drift off-topic toward the end
  • 135M parameters: limited reasoning, trained narrowly on 9 crops only, and will not generalize to crops or diseases outside its training data
  • Should be treated as an early-stage assistive tool, not a substitute for an agricultural extension officer
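
The regex pre-filter mentioned in the first limitation could look roughly like the sketch below. The pattern list, the decline message, and the `prefilter` name are hypothetical; a real deployment would need patterns tuned to the questions users actually ask.

```python
import re
from typing import Optional

# Hypothetical off-topic patterns and decline message (assumptions, not the
# filter used by the author). Tune both against real user traffic.
OUT_OF_SCOPE = re.compile(
    r"\b(football|soccer|election\w*|politic\w*|president|price\w*|"
    r"malaria|fever|headache|goat\w*|cattle)\b",
    re.IGNORECASE,
)
DECLINE = "Sorry, I can only help with crop disease questions."


def prefilter(question: str) -> Optional[str]:
    """Return a canned decline for off-topic questions, or None to call the model."""
    if OUT_OF_SCOPE.search(question):
        return DECLINE
    return None


print(prefilter("Who won the football match yesterday?"))
print(prefilter("my maize leaves have holes"))  # None, so the model is called
```

Running the filter before generation means an off-topic question never reaches the model, which sidesteps the drift described above. The trade-off is that a keyword list can also block legitimate questions that happen to contain a listed word.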

Training

  • Base: HuggingFaceTB/SmolLM2-135M-Instruct
  • LoRA: r=16, alpha=32, dropout=0.05, targeting q/k/v/o projections (~1.84M trainable params)
  • Data: ~95,000 quality-filtered examples (deduplicated, length-bounded, repetition-checked), sampled from a larger synthetic Q&A dataset of 365k+ examples generated via the Mistral API
  • Hardware: Kaggle 2x T4
  • Quantization: INT4 (NF4) with double quantization
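
The ~1.84M trainable-parameter figure can be checked by hand. The sketch below assumes the architecture dimensions from the base model's published config (hidden size 576, 30 layers, 9 attention heads, 3 key/value heads) and assumes LoRA is applied without bias terms; each adapted projection adds r × (d_in + d_out) parameters.

```python
# Assumed SmolLM2-135M dimensions (from the base model's config, not from this card).
HIDDEN = 576                       # hidden_size
NUM_LAYERS = 30                    # num_hidden_layers
NUM_HEADS = 9                      # num_attention_heads
NUM_KV_HEADS = 3                   # num_key_value_heads (grouped-query attention)
HEAD_DIM = HIDDEN // NUM_HEADS     # 64
KV_DIM = NUM_KV_HEADS * HEAD_DIM   # 192
R = 16                             # LoRA rank from the list above


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) per adapted linear layer."""
    return r * (d_in + d_out)


per_layer = (
    lora_params(HIDDEN, HIDDEN, R)    # q_proj
    + lora_params(HIDDEN, KV_DIM, R)  # k_proj
    + lora_params(HIDDEN, KV_DIM, R)  # v_proj
    + lora_params(HIDDEN, HIDDEN, R)  # o_proj
)
total = per_layer * NUM_LAYERS
print(total)  # 1843200
```

Under these assumptions the total is 1,843,200, which agrees with the ~1.84M quoted above.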

Recommended Inference Settings

temperature=0.3, top_k=20, repetition_penalty=1.2
max_new_tokens=120, min_new_tokens=15
eos_token_id=tokenizer.convert_tokens_to_ids('<|im_end|>')

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hub (device_map="auto" needs the accelerate package)
tokenizer = AutoTokenizer.from_pretrained("rufatronics/farmbot-crop-assistant")
model = AutoModelForCausalLM.from_pretrained("rufatronics/farmbot-crop-assistant", device_map="auto")

# Single-turn prompt wrapped in <|im_start|> / <|im_end|> chat tags
prompt = "<|im_start|>user\nmy maize leaves have holes<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample with the recommended settings and stop at the <|im_end|> token
out = model.generate(
    **inputs, max_new_tokens=120, min_new_tokens=15,
    temperature=0.3, top_k=20, do_sample=True, repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.convert_tokens_to_ids('<|im_end|>'),
)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True))
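
To avoid hand-writing the chat tags for every question, the prompt string from the example above can be produced by a small helper. The `build_prompt` name is hypothetical, and the helper only reproduces the single-turn format shown above; it does not add a system message.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user turn in the chat tags used in the usage example."""
    return (
        "<|im_start|>user\n"
        f"{user_message.strip()}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_prompt("my maize leaves have holes"))
```

The returned string can be passed to the tokenizer in place of the hard-coded `prompt` variable.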