In a Training Loop 🔄

11 9 83

Bal Narendra Sapa

bnsapa

AI & ML interests

None yet

Recent Activity

liked a model about 2 months ago

Lightricks/LTX-2.3

liked a model 2 months ago

sarvamai/sarvam-105b

upvoted a paper 3 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Organizations

None yet

upvoted a paper 3 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 449

upvoted a paper 10 months ago

Improved Baselines with Visual Instruction Tuning

Paper • 2310.03744 • Published Oct 5, 2023 • 39

upvoted a changelog 11 months ago

Hugging Face Changelog

Connect Your MCP Client to the Hugging Face Hub

Jun 6, 2025 • 114

upvoted an article 11 months ago

Article

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

toslali-ibm, mirinflim, qgallouedec, esnible, rganti, mudhakar • Jun 3, 2025 • 101

upvoted a collection 11 months ago

sarvam-m

Collection

Collection of all variations of the sarvam-m model • 3 items • Updated May 24, 2025 • 28

upvoted a collection about 1 year ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29, 2025 • 735

upvoted 3 articles almost 2 years ago

Article

A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes

ybelkada, timdettmers • Aug 17, 2022 • 132

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

philschmid, osanseviero, alvarobartt, lvwerra, dvilasuero, reach-vb, marcsun13, pcuenq • Jul 23, 2024 • 241
