🦅 Metanthropic BulBul-OCR

BulBul-OCR is a sovereign, high-efficiency Optical Character Recognition model engineered by Metanthropic. It is a 0.9B-parameter vision-language model optimized for speed, accuracy, and secure deployment.


🔒 Sovereign Encryption

This model is distributed in the .mguf format (Metanthropic Unified Format). The weights are encrypted with AES-GCM using a 256-bit key, to protect intellectual property and restrict use to authorized key holders.

  • Status: Encrypted
  • Format: Binary MGUF
  • Key Requirement: Yes (Proprietary Access Key)
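
AES with a 256-bit key needs exactly 32 bytes of key material. The usage code in this card passes the access key through bytes.fromhex, which suggests the key is issued as a 64-character hex string. As a minimal sketch under that assumption, a hypothetical helper (parse_access_key is an illustrative name, not part of any Metanthropic tooling) can validate the key before any decryption is attempted:

```python
def parse_access_key(hex_key: str) -> bytes:
    """Convert a hex-encoded access key to raw bytes and check its length.

    Hypothetical helper: assumes the key is issued as 64 hex characters,
    i.e. the 32 bytes that a 256-bit AES key requires.
    """
    try:
        key = bytes.fromhex(hex_key.strip())
    except ValueError as exc:
        raise ValueError("access key is not valid hexadecimal") from exc
    if len(key) != 32:
        raise ValueError(f"expected a 32-byte key, got {len(key)} bytes")
    return key


# Placeholder key for illustration only; a real key comes from Metanthropic.
demo_key = parse_access_key("00" * 32)
print(len(demo_key))
```

Failing early with a clear message is easier to debug than the authentication error AES-GCM raises when the key is wrong or truncated.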

🧠 Model Details

  • Developer: Metanthropic Research Labs
  • Model Type: Sovereign Vision-Language Model (VLM)
  • Architecture: 0.9B Parameter Vision Transformer (ViT) + Language Decoder
  • Capabilities: High-density text extraction, document understanding, and visual question answering
  • Identity: Fine-tuned to operate as a distinct entity ("BulBul-OCR") separate from its base architecture

💻 Usage

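This model cannot be loaded directly with standard Hugging Face libraries (transformers), because the .mguf file is encrypted. It requires the proprietary Metanthropic Loader procedure shown below, which decrypts the file with your access key and then hands the decrypted weights to transformers.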

Python Implementation

import os
from huggingface_hub import hf_hub_download
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from transformers import AutoModelForImageTextToText, AutoProcessor

# 1. Configuration
REPO_ID = "metanthropic/BulBul-OCR"
FILENAME = "bulbul-ocr-v1.mguf"
SECRET_KEY = "YOUR_ACCESS_KEY_HERE"  # Provided by Metanthropic Admin

# 2. Download Encrypted Asset
file_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

# 3. Secure Decryption
# The key is a hex string; a 256-bit AES key is 32 bytes, i.e. 64 hex characters.
key_bytes = bytes.fromhex(SECRET_KEY)
aesgcm = AESGCM(key_bytes)

with open(file_path, "rb") as f:
    nonce = f.read(12)
    header_len = int.from_bytes(f.read(4), 'little')
    encrypted_header = f.read(header_len)
    rest_of_body = f.read()

# Decrypt Header
decrypted_header = aesgcm.decrypt(nonce, encrypted_header, None)

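# 4. Load Model
# Note: this writes the decrypted weights to disk for simplicity. In production,
# use a temporary directory and delete it once the model is loaded.
os.makedirs("temp_load", exist_ok=True)
with open("temp_load/model.safetensors", "wb") as f:
    f.write(decrypted_header)
    f.write(rest_of_body)

print("✅ Model Decrypted. Loading into VRAM...")
# temp_load holds only the weights, so the model config is loaded from the repo
# (assumption: the repo hosts config.json alongside the .mguf file).
from transformers import AutoConfig
config = AutoConfig.from_pretrained(REPO_ID, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    "temp_load",
    config=config,
    trust_remote_code=True,
    device_map="auto"
)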
processor = AutoProcessor.from_pretrained(REPO_ID, trust_remote_code=True)

# 5. Run Inference
from PIL import Image

# Load your image
image = Image.open("document.png")

# Process and generate
inputs = processor(images=image, return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=512)
result = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(result)
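
The decryption step above implies a simple container layout: a 12-byte nonce, a 4-byte little-endian header length, the encrypted header, and then the remainder of the file. Note that only the header is passed to decrypt, while the body is written out unchanged. The following is a minimal sketch of that layout as a standalone parser with bounds checks. The names split_mguf and MgufParts are hypothetical, and the layout is inferred from the example code, not from a published format specification. Since AESGCM.decrypt expects the ciphertext with its authentication tag appended, the header length presumably includes the 16-byte GCM tag.

```python
import struct
from typing import NamedTuple

NONCE_LEN = 12  # bytes, matching f.read(12) in the usage code


class MgufParts(NamedTuple):
    nonce: bytes
    encrypted_header: bytes
    body: bytes


def split_mguf(blob: bytes) -> MgufParts:
    """Split a container into nonce, encrypted header, and remaining body.

    Assumed layout (inferred from the usage code): 12-byte nonce,
    4-byte little-endian unsigned header length, encrypted header, body.
    """
    if len(blob) < NONCE_LEN + 4:
        raise ValueError("file too short to contain nonce and header length")
    nonce = blob[:NONCE_LEN]
    (header_len,) = struct.unpack_from("<I", blob, NONCE_LEN)
    start = NONCE_LEN + 4
    end = start + header_len
    if end > len(blob):
        raise ValueError("declared header length exceeds file size")
    return MgufParts(nonce, blob[start:end], blob[end:])
```

Checking the declared length against the file size catches truncated downloads before the AES-GCM authentication step fails with a less specific error.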

Installation Requirements

pip install transformers huggingface_hub cryptography pillow torch
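
To confirm the installation before running the usage script, the packages can be looked up without importing them. This is a small sketch using only the standard library. The helper name missing_modules is illustrative, and the list uses import names, which differ from pip names in one case (pillow is imported as PIL).

```python
from importlib.util import find_spec

# Import names for the packages in the pip command (pillow imports as PIL).
REQUIRED_MODULES = ["transformers", "huggingface_hub", "cryptography", "PIL", "torch"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```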

📊 Performance Benchmarks

Dataset     Accuracy   Speed (imgs/sec)
SROIE       94.2%      12.5
FUNSD       91.8%      10.3
RVL-CDIP    89.7%      15.2
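
Throughput figures are often easier to compare against a latency budget when converted to time per image. The sketch below does that conversion for the numbers in the table. It assumes images are processed one at a time, which the card does not confirm, so with batching the real per-image latency may differ. The benchmark hardware is also not stated.

```python
def latency_ms(images_per_second: float) -> float:
    """Convert throughput (images/sec) to average time per image in ms."""
    return 1000.0 / images_per_second


# Throughput values taken from the benchmark table.
benchmarks = {"SROIE": 12.5, "FUNSD": 10.3, "RVL-CDIP": 15.2}

for name, ips in benchmarks.items():
    print(f"{name}: {latency_ms(ips):.1f} ms/image")
# SROIE: 80.0 ms/image
# FUNSD: 97.1 ms/image
# RVL-CDIP: 65.8 ms/image
```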

🚀 Key Features

  • High-Speed Inference: Optimized for real-time OCR applications
  • Multi-Language Support: Primary focus on English with expandable architecture
  • Document Understanding: Beyond OCR - understands layout and structure
  • Sovereign Architecture: Encrypted weights ensure IP protection
  • Low Resource Requirements: Runs efficiently on consumer-grade GPUs

🔧 System Requirements

  • Minimum:

    • GPU: 4GB VRAM (NVIDIA GTX 1650 or equivalent)
    • RAM: 8GB
    • Storage: 2GB
  • Recommended:

    • GPU: 8GB VRAM (NVIDIA RTX 3060 or equivalent)
    • RAM: 16GB
    • Storage: 5GB
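
The VRAM figures can be sanity-checked against the 0.9B parameter count. The sketch below is a rough estimate that counts weights only and ignores activations, the KV cache, and framework overhead. The precisions listed are assumptions, since the card does not state which precision the model ships in.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 0.9e9  # 0.9B parameters, from the model details

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
# float32: 3.6 GB
# float16: 1.8 GB
# int8: 0.9 GB
```

At half precision the weights take roughly 1.8 GB, which leaves headroom within the 4GB minimum for activations and image inputs.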

⚠️ License & Restrictions

This is a proprietary model released by Metanthropic.

  • Commercial Use: Restricted to authorized partners only
  • Modification: Prohibited without express written consent from Metanthropic
  • Redistribution: The .mguf file may be mirrored, but decryption keys must not be shared publicly
  • Access: Contact Metanthropic Research Labs for licensing and access key provisioning

📞 Contact & Support


📜 Citation

If you use BulBul-OCR in your research, please cite:

@misc{bulbul-ocr-2024,
  title={BulBul-OCR: A Sovereign Vision-Language Model for Optical Character Recognition},
  author={Metanthropic Research Labs},
  year={2024},
  publisher={Metanthropic},
  howpublished={\url{https://huggingface.co/metanthropic/BulBul-OCR}}
}

Engineered by Metanthropic. Powered by Sovereign Intelligence.
