Tempo - a Vision-CAIR Collection

updated Apr 15

Official Tempo-6B collection: A query-aware framework solving the mismatch between massive video streams and bounded LLM context windows.

Vision-CAIR/Tempo-6B
Video-Text-to-Text • Updated Apr 15 • 124 • 2

Vision-CAIR/Tempo-6B-Stage2
Video-Text-to-Text • Updated Apr 15 • 5

Vision-CAIR/Tempo-6B-Stage1
Video-Text-to-Text • Updated Apr 15 • 14

Vision-CAIR/Tempo-6B-Stage0
Video-Text-to-Text • Updated Apr 15 • 8

Small Vision-Language Models are Smart Compressors for Long Video Understanding
Paper • 2604.08120 • Published Apr 9 • 20

Tempo
Smart Compressors for Long Video Understanding
Space • Running on Zero • 5