DevQuasar/NovaSky-AI.Sky-T1-32B-Flash-GGUF Text Generation • 33B • Updated Feb 21, 2025 • 8 • 1
Post: New Research Alert: Making Language Models Smaller & Smarter! Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance. The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision to the transformer arena.
Key Findings:
• 77% parameter reduction
• Maintained model capabilities
• Improved generalization
Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
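For readers unfamiliar with the idea, below is a minimal PyTorch sketch of how a grouped pointwise (kernel-size-1) convolution can stand in for a dense linear projection and shrink the weight count. It assumes the general technique described in the post, not the exact architecture of the linked paper or repository; the class name `GroupedPointwiseLinear` and the group count are illustrative.

```python
import torch
import torch.nn as nn

class GroupedPointwiseLinear(nn.Module):
    """Illustrative stand-in for a dense linear layer using a grouped
    pointwise (kernel size 1) convolution. With `groups=g`, each slice of
    in_features/g channels maps only to out_features/g channels, so the
    weight matrix shrinks by roughly a factor of g."""

    def __init__(self, in_features: int, out_features: int, groups: int = 4):
        super().__init__()
        self.conv = nn.Conv1d(in_features, out_features, kernel_size=1, groups=groups)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, in_features) -> treat features as conv channels
        y = self.conv(x.transpose(1, 2))   # (batch, out_features, seq_len)
        return y.transpose(1, 2)           # (batch, seq_len, out_features)

dense = nn.Linear(1024, 1024)
grouped = GroupedPointwiseLinear(1024, 1024, groups=4)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(dense), count(grouped))  # ~1.05M vs ~0.26M parameters
```

With `groups=4` the weight tensor drops from 1024×1024 to 1024×256, about a 75% reduction for that layer; the actual savings and any channel-mixing tricks needed to keep accuracy are detailed in the linked report.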
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Paper • 2501.15570 • Published Jan 26, 2025 • 25
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer Paper • 2410.10812 • Published Oct 14, 2024 • 18
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 151