🦅 Metanthropic BulBul-OCR

BulBul-OCR is a sovereign, high-efficiency Optical Character Recognition (OCR) model engineered by Metanthropic. It is a 0.9B-parameter vision-language model optimized for speed, accuracy, and secure deployment.

🔒 Sovereign Encryption

This model is distributed in the .mguf format (Metanthropic Unified Format). The weights are encrypted with AES-256-GCM to protect intellectual property and to restrict use to holders of an authorized access key.

Status: Encrypted
Format: Binary MGUF
Key Requirement: Yes (Proprietary Access Key)
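
Before attempting decryption, it can help to validate the access key. The sketch below assumes the key is delivered as a hex string (the usage example decodes it with bytes.fromhex) and that it is 32 bytes long, which is what AES-256 requires. The function name parse_access_key is hypothetical and not part of any Metanthropic tooling.

```python
def parse_access_key(hex_key: str) -> bytes:
    """Convert a hex-encoded access key into raw key bytes.

    Assumption: the key is 64 hex characters (32 bytes), the size AES-256 needs.
    """
    try:
        key = bytes.fromhex(hex_key.strip())
    except ValueError as exc:
        raise ValueError("access key is not valid hexadecimal") from exc
    if len(key) != 32:
        raise ValueError(f"expected 32 key bytes for AES-256, got {len(key)}")
    return key
```

Passing the unedited placeholder "YOUR_ACCESS_KEY_HERE" raises ValueError here, which catches a common setup mistake before any download or decryption work starts.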

🧠 Model Details

Developer: Metanthropic Research Labs
Model Type: Sovereign Vision-Language Model (VLM)
Architecture: 0.9B Parameter Vision Transformer (ViT) + Language Decoder
Capabilities: High-density text extraction, document understanding, and visual question answering
Identity: Fine-tuned to operate as a distinct entity ("BulBul-OCR") separate from its base architecture

💻 Usage

The encrypted .mguf file cannot be loaded directly with standard Hugging Face libraries (transformers). The weights must first be decrypted with your access key. The loader below performs that step and then passes the decrypted weights to transformers.

Python Implementation

import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from huggingface_hub import hf_hub_download
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

# 1. Configuration
REPO_ID = "metanthropic/BulBul-OCR"
FILENAME = "bulbul-ocr-v1.mguf"
SECRET_KEY = "YOUR_ACCESS_KEY_HERE"  # Hex-encoded key provided by Metanthropic Admin

# 2. Download the encrypted asset
file_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

# 3. Decrypt the header in memory
key_bytes = bytes.fromhex(SECRET_KEY)  # Raises ValueError if the key is not valid hex
aesgcm = AESGCM(key_bytes)

with open(file_path, "rb") as f:
    nonce = f.read(12)  # 12-byte AES-GCM nonce
    header_len = int.from_bytes(f.read(4), "little")  # Length of the encrypted header
    encrypted_header = f.read(header_len)
    rest_of_body = f.read()  # Remainder of the file, written out unchanged below

# Raises cryptography.exceptions.InvalidTag if the key is wrong or the data is corrupted
decrypted_header = aesgcm.decrypt(nonce, encrypted_header, None)

# 4. Load the model
# Note: this step writes the decrypted weights to disk. In production, use a
# temp file or stream directly to avoid persistent disk writes.
# from_pretrained also expects the model's config.json in the same directory.
os.makedirs("temp_load", exist_ok=True)
with open("temp_load/model.safetensors", "wb") as f:
    f.write(decrypted_header)
    f.write(rest_of_body)

print("✅ Model Decrypted. Loading into VRAM...")
model = AutoModelForImageTextToText.from_pretrained(
    "temp_load",
    trust_remote_code=True,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(REPO_ID, trust_remote_code=True)

# 5. Run inference
image = Image.open("document.png")  # Load your image

# Process and generate
inputs = processor(images=image, return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=512)
result = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(result)
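
The read order in the example above implies a simple container layout: a 12-byte nonce, a 4-byte little-endian header length, the encrypted header, and then the remaining body. The sketch below splits an in-memory blob along those boundaries and checks that the declared header length fits in the file. This layout is inferred from the example only and is not an official .mguf specification; the name split_mguf is hypothetical.

```python
import struct

NONCE_LEN = 12   # bytes read first in the loading example
LEN_FIELD = 4    # little-endian unsigned header length


def split_mguf(blob: bytes):
    """Split a blob into (nonce, encrypted_header, body) using the assumed layout."""
    if len(blob) < NONCE_LEN + LEN_FIELD:
        raise ValueError("file too short to contain a nonce and header length")
    nonce = blob[:NONCE_LEN]
    (header_len,) = struct.unpack_from("<I", blob, NONCE_LEN)
    start = NONCE_LEN + LEN_FIELD
    end = start + header_len
    if end > len(blob):
        raise ValueError("declared header length exceeds file size")
    return nonce, blob[start:end], blob[end:]
```

A length check like this fails early on a truncated or partially downloaded file, before the data reaches the decryption call.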

Installation Requirements

pip install transformers huggingface_hub cryptography pillow torch
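
To confirm the environment is ready before running the loader, a small check like the following can report which packages are still missing. It is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical. Note that the pillow package is imported under the module name PIL.

```python
from importlib.util import find_spec

# pip package name -> importable top-level module name
REQUIRED = {
    "transformers": "transformers",
    "huggingface_hub": "huggingface_hub",
    "cryptography": "cryptography",
    "pillow": "PIL",
    "torch": "torch",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


print("Missing packages:", missing_packages() or "none")
```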

📊 Performance Benchmarks

Dataset	Accuracy	Speed (imgs/sec)
SROIE	94.2%	12.5
FUNSD	91.8%	10.3
RVL-CDIP	89.7%	15.2
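
For latency planning, the throughput column can be converted into an approximate time per image. The sketch below assumes images are processed one at a time, so per-image latency is simply the reciprocal of throughput; if the figures were measured with batching, actual per-request latency will differ.

```python
# Throughput figures copied from the benchmark table (images per second)
throughput = {"SROIE": 12.5, "FUNSD": 10.3, "RVL-CDIP": 15.2}


def latency_ms(images_per_second: float) -> float:
    """Approximate per-image latency in milliseconds, assuming no batching."""
    return 1000.0 / images_per_second


for name, ips in throughput.items():
    print(f"{name}: {latency_ms(ips):.1f} ms per image")
```

With the table's figures this prints 80.0 ms for SROIE, 97.1 ms for FUNSD, and 65.8 ms for RVL-CDIP.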

🚀 Key Features

High-Speed Inference: Optimized for real-time OCR applications
Multi-Language Support: Primary focus on English with expandable architecture
Document Understanding: Goes beyond OCR to understand layout and structure
Sovereign Architecture: Encrypted weights ensure IP protection
Low Resource Requirements: Runs efficiently on consumer-grade GPUs

🔧 System Requirements

Minimum:
- GPU: 4GB VRAM (NVIDIA GTX 1650 or equivalent)
- RAM: 8GB
- Storage: 2GB

Recommended:
- GPU: 8GB VRAM (NVIDIA RTX 3060 or equivalent)
- RAM: 16GB
- Storage: 5GB
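
As a rough sanity check on these figures, the memory needed for the weights alone can be estimated from the 0.9B parameter count. The numeric precision of the shipped weights is not stated here, so the byte sizes below are assumptions, and activations and generation caches add further overhead on top of the weights.

```python
NUM_PARAMS = 0.9e9  # parameter count stated in Model Details

# Assumed storage precisions (bytes per parameter)
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory for the weights only, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


for dtype, size in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, size):.1f} GB")
```

Under the float16 assumption the weights take about 1.8 GB, which is consistent with the 4GB VRAM minimum leaving headroom for inference.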

⚠️ License & Restrictions

This is a proprietary model released by Metanthropic.

Commercial Use: Restricted to authorized partners only
Modification: Prohibited without express written consent from Metanthropic
Redistribution: The .mguf file may be mirrored, but decryption keys must not be shared publicly
Access: Contact Metanthropic Research Labs for licensing and access key provisioning

📞 Contact & Support

Email: support@metanthropic.ai
Documentation: https://docs.metanthropic.ai/bulbul-ocr
License Inquiries: licensing@metanthropic.ai

📜 Citation

If you use BulBul-OCR in your research, please cite:

@misc{bulbul-ocr-2024,
  title={BulBul-OCR: A Sovereign Vision-Language Model for Optical Character Recognition},
  author={Metanthropic Research Labs},
  year={2024},
  publisher={Metanthropic},
  howpublished={\url{https://huggingface.co/metanthropic/BulBul-OCR}}
}

Engineered by Metanthropic. Powered by Sovereign Intelligence.
