MWire Labs

Kren Vision

Kren Vision is a fine-tuned vision-language model for optical character recognition (OCR) of Northeast Indian languages. It is part of the Kren AI Stack by MWire Labs, focused on building foundational language technology for Northeast India's indigenous languages.

The model is built on an open-source vision-language model and fine-tuned with LoRA on 618k deduplicated synthetic OCR samples covering six Latin-script Northeast Indian languages.

Supported Languages

Language    Script
Mizo        Latin
Garo        Latin
Khasi       Latin
Kokborok    Latin
Nagamese    Latin
Nyishi      Latin

Performance

Evaluated on 500 held-out test samples:

Metric                        Score
Exact Match                   92.60%
CER (character error rate)    0.85%
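The card does not say how these two metrics were computed, so the following is only a minimal sketch of one common definition: exact match as the fraction of predictions identical to their reference, and CER as the total character-level edit distance divided by the total number of reference characters. The function names `levenshtein` and `evaluate` are illustrative, not part of any released evaluation script, and a per-sample averaged CER would give a different number.

```python
from typing import Dict, List


def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(refs: List[str], hyps: List[str]) -> Dict[str, float]:
    """Assumed definitions: corpus-level CER and strict string equality."""
    total_edits = sum(levenshtein(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    exact = sum(r == h for r, h in zip(refs, hyps))
    return {"exact_match": exact / len(refs), "cer": total_edits / total_chars}


if __name__ == "__main__":
    # Toy strings, not taken from the test set.
    print(evaluate(["kitten", "abc"], ["kitten", "abd"]))
```

Under this definition a single wrong character makes a sample fail exact match while barely moving CER, which is consistent with the gap between the two scores reported above.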

Usage

import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

# Load the processor (chat template, tokenizer, image preprocessing) and the model.
processor = AutoProcessor.from_pretrained("MWirelabs/kren-vision")
model = AutoModelForImageTextToText.from_pretrained(
    "MWirelabs/kren-vision",
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto"            # place the model on the available device(s)
)

messages = [
    {"role": "user", "content": [
        {"type": "image", "image": "your_image.jpg"},
        {"type": "text", "text": "OCR the text in this image."}
    ]}
]

# Apply the chat template; the processor loads the image and tokenizes the prompt.
inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True,
    return_dict=True, return_tensors="pt"
).to(model.device)

# Generate, then drop the prompt tokens so only the newly generated text is decoded.
generated_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
output = processor.batch_decode(trimmed, skip_special_tokens=True)
print(output[0])
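The prompt-trimming step in the snippet above is the least obvious part: `generate` returns the prompt tokens followed by the new tokens, so each output row is sliced at the length of its input row. The toy example below shows the same slicing on plain Python lists. The helper name `trim_prompt_tokens` and the token IDs are made up for illustration.

```python
from typing import List, Sequence


def trim_prompt_tokens(input_ids: Sequence[Sequence[int]],
                       generated_ids: Sequence[Sequence[int]]) -> List[List[int]]:
    """Keep only the tokens generated after each prompt (hypothetical helper)."""
    return [list(out[len(inp):]) for inp, out in zip(input_ids, generated_ids)]


if __name__ == "__main__":
    prompts = [[1, 2, 3], [7, 8]]            # made-up prompt token IDs
    outputs = [[1, 2, 3, 4, 5], [7, 8, 9]]   # prompt followed by new tokens
    print(trim_prompt_tokens(prompts, outputs))
```

Without this step the decoded string would also contain the chat-template text and the OCR instruction.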

Training

  • Data: 618k deduplicated synthetic OCR samples across 6 languages
  • Fine-tuning: LoRA (r=16, alpha=32) on vision and language projection layers
  • Hardware: NVIDIA RTX 6000 Ada (48GB)
  • Epochs: 2
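To give a sense of what r=16 and alpha=32 mean, here is a small arithmetic sketch. It assumes the standard LoRA formulation, in which a frozen weight W is augmented by a low-rank update scaled by alpha / r. The 2048 x 2048 layer shape is a hypothetical example, not a dimension taken from the model, and the function names are illustrative.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return alpha / r


def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    r, alpha = 16, 32            # values from the training summary above
    d_in = d_out = 2048          # hypothetical layer shape, for illustration only
    added = lora_added_params(d_in, d_out, r)
    full = d_in * d_out
    print(f"scaling = {lora_scaling(r, alpha)}")
    print(f"adapter params = {added} ({added / full:.2%} of the full layer)")
```

For that hypothetical layer the adapter trains 65,536 parameters against 4,194,304 in the frozen weight, which is why a fine-tune of this kind can fit on a single 48GB GPU.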

Citation

@misc{kren-vision-2026,
  title={Kren Vision: OCR for Northeast Indian Languages},
  author={MWire Labs},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/MWirelabs/kren-vision}
}

License

CC-BY-4.0 — MWire Labs, 2026

Model size: 2B params (Safetensors, tensor type BF16)