
Frank Sommers PRO

fsommers


AI & ML interests

None yet

Recent Activity

upvoted an article 1 day ago

Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model

Article • 11 days ago

upvoted an article 10 days ago

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Article • Jul 9, 2025 • 780

liked a model 11 days ago

zai-org/GLM-OCR

Image-to-Text • Updated 6 days ago • 856k • 1.04k

upvoted a paper 16 days ago

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published 18 days ago • 60

upvoted 2 papers 23 days ago

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published 25 days ago • 37

Typhoon OCR: Open Vision-Language Model For Thai Document Extraction

Paper • 2601.14722 • Published 25 days ago • 15

upvoted a collection about 1 month ago

PP-OCRv5

Collection

PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese • 13 items • Updated Sep 15, 2025 • 52

upvoted a paper 2 months ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 256

upvoted an article 2 months ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • Dec 1, 2025 • 297

upvoted 2 papers 3 months ago

SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models

Paper • 2511.15605 • Published Nov 19, 2025 • 24

TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval

Paper • 2511.16528 • Published Nov 20, 2025 • 24

liked a model 3 months ago

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15, 2025 • 2.97M • 755

upvoted a collection 3 months ago

Qwen3-VL

Collection

37 items • Updated Dec 31, 2025 • 627

liked a model 3 months ago

moonshotai/Kimi-K2-Thinking

Text Generation • Updated 16 days ago • 338k • 1.67k

upvoted 3 papers 4 months ago

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28, 2025 • 101

PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16, 2025 • 116

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Paper • 2510.14979 • Published Oct 16, 2025 • 67

liked a model 4 months ago

Qwen/Qwen3-VL-8B-Thinking

Image-Text-to-Text • 9B • Updated Nov 26, 2025 • 171k • 185

upvoted 2 articles 4 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7, 2025 • 276

ModernVBERT: Towards Smaller Visual Document Retrievers

Article • Oct 3, 2025
