Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum • arXiv:2510.00526 • Published Oct 1, 2025 • 10 upvotes
Dynamic Long-Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning • arXiv:2602.08382 • Published 6 days ago • 10 upvotes
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning • arXiv:2602.10560 • Published 4 days ago • 27 upvotes
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models • arXiv:2602.12036 • Published 2 days ago • 86 upvotes
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning • arXiv:2602.01058 • Published 14 days ago • 39 upvotes
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • arXiv:2601.22156 • Published 16 days ago • 11 upvotes
iFSQ: Improving FSQ for Image Generation with 1 Line of Code • arXiv:2601.17124 • Published 22 days ago • 32 upvotes
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation • arXiv:2601.21420 • Published 17 days ago • 42 upvotes
Scaling Embeddings Outperforms Scaling Experts in Language Models • arXiv:2601.21204 • Published 17 days ago • 99 upvotes
Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models • arXiv:2601.14152 • Published 25 days ago • 5 upvotes
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models • arXiv:2601.15165 • Published 24 days ago • 72 upvotes