nyuuzyou
285 followers · 33 following
https://ducks.party/donate
AI & ML interests
None yet
Recent Activity
Posted an update about 20 hours ago
🌐 NNTP Discussion Archives - 387M Messages from Public Newsgroups - https://huggingface.co/datasets/nyuuzyou/nntp-text-387m

Here's something different from the code datasets: 20+ years of public discussion archives from NNTP newsgroups. Clean Parquet format, but this time it's conversations instead of code.

Key Stats:
- 386,629,949 messages from 159,345 newsgroups
- 191 GB compressed Parquet storage
- Spans 2002-2026
- Multilingual: English, German, French, Italian, Dutch, Polish, Russian, and others
- Email addresses redacted for privacy

The data is messy in the way real discussions are messy. Spam wasn't filtered out - you get the advertisements, the arguments, the off-topic threads, all of it. If you want sanitized text, this isn't it. If you want to see how people actually talked online before Discord and Reddit took over, here you go.

Processing kept it simple: convert everything to UTF-8, remove exact duplicates, strip binary attachments, redact emails. Legacy character encodings were a nightmare - had to handle Windows-1252, ISO-8859 variants, KOI8-R, Shift-JIS, GBK, and others just to get readable text.

At least it was fun to do, and I think the result turned out pretty well. I hope someone else will also be able to have fun or gain something useful from this project.
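The processing steps named in the post (UTF-8 conversion, exact-duplicate removal, email redaction) could look roughly like the Python sketch below. It is not the author's pipeline: the function names, the fallback order of encodings, the `<email>` placeholder, and the email regex are all assumptions made for illustration, and attachment stripping is left out entirely.

```python
import hashlib
import re
from typing import Iterable, Iterator, Optional

# Assumed fallback order. The post lists encodings it had to handle but does
# not say how the right one was chosen for each message.
FALLBACK_ENCODINGS = ["utf-8", "cp1252"]

# Simplified, hypothetical email pattern; real-world addresses are messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def decode_message(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw bytes to str, trying the declared charset first."""
    candidates = ([declared] if declared else []) + FALLBACK_ENCODINGS
    for enc in candidates:
        try:
            return raw.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Wrong charset or unknown charset name: try the next candidate.
            continue
    # Last resort: keep what is readable and mark the rest as U+FFFD.
    return raw.decode("utf-8", errors="replace")


def redact_emails(text: str, placeholder: str = "<email>") -> str:
    """Replace anything that looks like an email address."""
    return EMAIL_RE.sub(placeholder, text)


def dedupe_exact(messages: Iterable[str]) -> Iterator[str]:
    """Yield each distinct message once, keeping first-seen order."""
    seen = set()
    for text in messages:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        if digest not in seen:
            seen.add(digest)
            yield text
```

Storing SHA-256 digests instead of full message bodies keeps the memory used by the "seen" set bounded per message, which matters at hundreds of millions of rows. Encodings such as KOI8-R, Shift-JIS, or GBK are only handled here when the message declares them, since guessing between legacy charsets from bytes alone is unreliable.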
New activity 1 day ago in nyuuzyou/nntp-text-387m: [bot] Conversion to Parquet
Updated a dataset 2 days ago: nyuuzyou/nntp-text-387m
nyuuzyou's models (23)
nyuuzyou/TowerVision-9B-GGUF • Image-Text-to-Text • 9B • Updated 19 days ago • 861
nyuuzyou/TowerVision-2B-GGUF • Image-Text-to-Text • 3B • Updated Dec 2, 2025 • 408
nyuuzyou/EuroVLM-9B-Preview-GGUF • 9B • Updated Dec 2, 2025 • 78 • 1
nyuuzyou/EuroMoE-2.6B-A0.6B-Instruct-Preview-GGUF • 3B • Updated Dec 2, 2025 • 263 • 3
nyuuzyou/EuroLLM-22B-Preview-GGUF • 23B • Updated Dec 2, 2025 • 116
nyuuzyou/EuroLLM-22B-Instruct-Preview-GGUF • 23B • Updated Dec 2, 2025 • 29
nyuuzyou/EuroMoE-2.6B-A0.6B-Preview-GGUF • 3B • Updated Dec 2, 2025 • 99 • 1
nyuuzyou/Dhanishtha-2.0-preview-0725-GGUF • 15B • Updated Dec 2, 2025 • 53
nyuuzyou/EuroVLM-1.7B-Preview-GGUF • 2B • Updated Dec 2, 2025 • 61
nyuuzyou/SmolLM2-1.7B-Eagle-GGUF • Text Generation • 2B • Updated Dec 2, 2025 • 35
nyuuzyou/SmolLM2-360M-Eagle-GGUF • Text Generation • 0.4B • Updated Dec 2, 2025 • 21
nyuuzyou/SmolLM2-135M-Eagle-GGUF • Text Generation • 0.1B • Updated Dec 2, 2025 • 122 • 1
nyuuzyou/Orpheus-3B-ASMR • Text-to-Speech • 3B • Updated May 26, 2025 • 1 • 2
nyuuzyou/Orpheus-3B-ASMR-LoRA • Text-to-Speech • Updated May 26, 2025
nyuuzyou/AircraftFLUX-LoRA • Text-to-Image • Updated May 26, 2025 • 1 • 4
nyuuzyou/Planespotting-YOLO11 • Object Detection • Updated May 17, 2025 • 4 • 1
nyuuzyou/Qwen2.5-0.5B-Bluesky-Instruct • Text Generation • 0.5B • Updated Apr 28, 2025 • 13 • 3
nyuuzyou/Qwen2.5-0.5B-Bluesky • Text Generation • 0.5B • Updated Apr 27, 2025 • 4
nyuuzyou/SmolLM2-1.7B-Eagle • Text Generation • 2B • Updated Apr 18, 2025 • 1
nyuuzyou/SmolLM2-360M-Eagle • Text Generation • 0.4B • Updated Apr 18, 2025 • 2
nyuuzyou/SmolLM2-135M-Eagle • Text Generation • 0.1B • Updated Apr 18, 2025 • 1 • 3
nyuuzyou/stickers • Image Classification • Updated Aug 20, 2023 • 4
nyuuzyou/AnimeHeads • Object Detection • Updated Apr 16, 2023 • 9