gg-hf

Team

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

akhaliq submitted a paper 5 days ago

FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation

akhaliq submitted a paper 5 days ago

Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow

bzhanggo updated a model 22 days ago

google/t5gemma-2-4b-4b

View all activity

pcuenq

posted an update 2 days ago

Post

2320

👉 What happened in AI in 2025? 👈

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1 — Learning to Reason
Deepseek not only releases a top-notch reasoning model, but shows how to train them and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2 — Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3 — "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in Math olympiads and hard benchmarks. OpenAI releases strong open source models and Google releases the much anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4 — Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯

Credits
🙏 NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫡 @reach-vb for the original idea, design and recipe

🙌 @ariG23498 and yours truly for compiling and verifying the 2025 edition

🥳 Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases! 🥂

1 reply

akhaliq

submitted 2 papers to Daily Papers 5 days ago

FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation

Paper • 2512.24724 • Published 7 days ago • 5

Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow

Paper • 2512.24766 • Published 7 days ago • 7

bzhanggo

updated 3 models 22 days ago

akhaliq

submitted a paper to Daily Papers 22 days ago

What matters for Representation Alignment: Global Information or Spatial Structure?

Paper • 2512.10794 • Published 27 days ago • 8

akhaliq

submitted a paper to Daily Papers 27 days ago

Towards a Science of Scaling Agent Systems

Paper • 2512.08296 • Published 30 days ago • 14

akhaliq

submitted a paper to Daily Papers 29 days ago

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Paper • 2512.07843 • Published Nov 24, 2025 • 21

lvwerra

authored a paper 3 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 36

akhaliq

authored a paper 3 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 36

bebechien

authored a paper 3 months ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 42

Marksherwood

authored a paper 3 months ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 42

osanseviero

authored a paper 3 months ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 42

siriuz42

updated 2 models 4 months ago

google/timesfm-2.5-200m-pytorch

Time Series Forecasting • Updated Oct 2, 2025 • 184

google/timesfm-2.5-200m-flax

Updated Sep 2, 2025 • 8

Xenova

posted an update 5 months ago

Post

14292

Okay this is insane... WebGPU-accelerated semantic video tracking, powered by DINOv3 and Transformers.js! 🤯
Demo (+ source code): webml-community/DINOv3-video-tracking

This will revolutionize AI-powered video editors... which can now run 100% locally in your browser, no server inference required (costs $0)! 😍

How does it work? 🤔
1️⃣ Generate and cache image features for each frame
2️⃣ Create a list of embeddings for selected patch(es)
3️⃣ Compute cosine similarity between each patch and the selected patch(es)
4️⃣ Highlight those whose score is above some threshold

... et voilà! 🥳

You can also make selections across frames to improve temporal consistency! This is super useful if the object changes its appearance slightly throughout the video.

Excited to see what the community builds with it!