khtsly

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 17 hours ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 12 days ago • 189

upvoted a paper about 18 hours ago

HRM-Text: Efficient Pretraining Beyond Scaling

Paper • 2605.20613 • Published 4 days ago • 19

upvoted a paper 1 day ago

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Paper • 2605.22791 • Published 3 days ago • 21

upvoted a paper 3 days ago

Generative Recursive Reasoning

Paper • 2605.19376 • Published 4 days ago • 25

upvoted a paper about 1 month ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 81

upvoted a paper about 2 months ago

MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

Paper • 2603.23516 • Published Mar 6 • 50

upvoted a paper 2 months ago

Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context

Paper • 2603.15653 • Published Mar 7 • 12

upvoted 2 collections 3 months ago

Qwen3.5-Abliterated-Opus-4.6-Distilled

Qwen3.5-Abliterated • 0 items • Updated 27 days ago • 1

Qwen3.5-Opus-4.6-Distilled

0 items • Updated 27 days ago • 2