In a Training Loop 🔄

Stefan Schweter PRO

stefan-it

https://schweter.bayern

AI & ML interests

Flair Library 💕, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models, German Language Models, Bavarian NLP 🥨

Recent Activity

upvoted 2 papers 1 day ago

LoRA-Squeeze: Simple and Effective Post-Tuning and In-Tuning Compression of LoRA Modules

Paper • 2602.10993 • Published 2 days ago • 1

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

Paper • 2602.11149 • Published 2 days ago • 12

commented on a paper 1 day ago

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

Paper • 2602.11149 • Published 2 days ago • 12

liked a model 1 day ago

windprak/open_steuerllm

Text Generation • 28B • Updated 1 day ago • 21 • 1

upvoted a paper 1 day ago

SteuerLLM: Local specialized large language model for German tax law analysis

Paper • 2602.11081 • Published 2 days ago • 1

upvoted a collection 2 days ago

GLM-5

2 items • Updated 2 days ago • 19

updated a dataset 3 days ago

bavarian-nlp/barwiki-dumps

Updated 3 days ago • 51

upvoted a paper 4 days ago

Optimal Turkish Subword Strategies at Scale: Systematic Evaluation of Data, Vocabulary, Morphology Interplay

Paper • 2602.06942 • Published 7 days ago • 2

liked 2 datasets 8 days ago

utter-project/EuroBlocks-SFT-2512

Viewer • Updated 8 days ago • 1.09M • 225 • 6

fineinstructions/fineinstructions_nemotron

Viewer • Updated 15 days ago • 1.23B • 2.2k • 4

upvoted a collection 10 days ago

GLiNER- Linker

GLiNER-bi-Encoder models for entity linking with the GLiNKER framework • 3 items • Updated 10 days ago • 6

submitted a paper to Daily Papers 14 days ago

FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale

Paper • 2601.22146 • Published 15 days ago • 9

commented on a paper 14 days ago

FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale

Paper • 2601.22146 • Published 15 days ago • 9

liked a dataset 15 days ago

fineinstructions/finetemplates

Viewer • Updated 15 days ago • 18.6M • 243 • 2

upvoted a paper 15 days ago

FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale

Paper • 2601.22146 • Published 15 days ago • 9

liked a Space 19 days ago

OCR Dataset Generator

Generate synthetic OCR datasets for low-resource languages

upvoted a collection 22 days ago

GutenOCR

3 items • Updated 22 days ago • 6

upvoted 2 papers 23 days ago

Say Anything but This: When Tokenizer Betrays Reasoning in LLMs

Paper • 2601.14658 • Published 24 days ago • 1

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published 24 days ago • 37

liked a model 28 days ago

nvidia/Nemotron-Orchestrator-8B

Text Generation • Updated Dec 2, 2025 • 14.4k • 549