AI & ML interests

Community of researchers interested in OpenBuddy Early Access. Please note that this is not an official group for OpenBuddy, and the members have no affiliation with the OpenBuddy team. Any independent researcher can apply to join by submitting the form available on our GitHub.

raincandy-u
posted an update 5 months ago
Introducing Rain-v2: Democratizing LLM training on gaming GPUs! ⚡

Following Rain-100M, we’re scaling up. Rain-v2 features a larger training dataset.

We’ve published a comprehensive blog covering the end-to-end journey, from raw data collection to rigorous evaluation and safety testing.

HF Repo: 🤗 raincandy-u/Rain-v2

Blog: 📚 https://angelkawaii.xyz/2026/01/29/rain-v2/

Special thanks to the open-source community and the SmolLM2 team for their foundational work! 🚀

HuggingFaceTB

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (2502.02737)
raincandy-u
posted an update 5 months ago
🤗 Just released Rain-100M, an experimental ~97M-parameter Qwen3-style language model trained from random initialization.

Repo: raincandy-u/Rain-100M

Data: HuggingFaceFW/fineweb-edu, ~3B tokens, English only

Tokenizer: custom 16k BPE, context length 4096

Architecture: 12 Transformer layers, hidden size 768, 12 heads, MLP 2048, SiLU, bf16


Rain-100M is a raw base model (not instruction-tuned or safety-aligned), aimed at small-scale research, debugging training pipelines, and CPU/edge experiments. If you run evaluations, finetunes, or visualizations with it, I would be very interested in your results!
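The architecture figures listed above can be sanity-checked against the ~97M parameter claim with a rough count. This is only a sketch, not the author's accounting: the exact vocabulary size (16,384), tied input/output embeddings, full multi-head attention with head dimension 64, and a gated three-matrix MLP are all assumptions, and `DecoderConfig` and `count_parameters` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DecoderConfig:
    vocab_size: int = 16_384     # assumption: "16k" read as 2**14
    hidden_size: int = 768       # from the post
    num_layers: int = 12         # from the post
    num_heads: int = 12          # from the post
    num_kv_heads: int = 12       # assumption: no grouped-query attention
    head_dim: int = 64           # assumption: hidden_size // num_heads
    mlp_size: int = 2048         # from the post
    tie_embeddings: bool = True  # assumption: output head shares the embedding


def count_parameters(cfg: DecoderConfig) -> int:
    """Count weights of a bias-free, pre-norm decoder-only Transformer."""
    q_proj = cfg.hidden_size * cfg.num_heads * cfg.head_dim
    kv_proj = 2 * cfg.hidden_size * cfg.num_kv_heads * cfg.head_dim
    o_proj = cfg.num_heads * cfg.head_dim * cfg.hidden_size
    mlp = 3 * cfg.hidden_size * cfg.mlp_size  # gate, up, and down projections
    norms = 2 * cfg.hidden_size               # two RMSNorm weights per layer
    per_layer = q_proj + kv_proj + o_proj + mlp + norms

    embedding = cfg.vocab_size * cfg.hidden_size
    lm_head = 0 if cfg.tie_embeddings else embedding
    final_norm = cfg.hidden_size
    return cfg.num_layers * per_layer + embedding + lm_head + final_norm


total = count_parameters(DecoderConfig())
print(f"{total / 1e6:.1f}M parameters")  # prints "97.5M parameters"
```

Under these assumptions the count lands close to the ~97M quoted in the post. The real figure depends on the actual vocabulary size and on small extras this sketch leaves out.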
AtAndDev
posted an update 11 months ago
Qwen 3 Coder is a personal attack to k2, and I love it.
It achieves near SOTA on LCB while not having reasoning.
Finally people are understanding that reasoning isnt necessary for high benches...

Qwen ftw!

DECENTRALIZE DECENTRALIZE DECENTRALIZE
AtAndDev
posted an update about 1 year ago
deepseek-ai/DeepSeek-R1-0528

This is the end
AtAndDev
posted an update about 1 year ago
Llama 4 is out...
AtAndDev
posted an update over 1 year ago
There seem to be multiple paid apps shared here that are based on models on hf, but some ppl sell their wrappers as "products" and promote them here. For a long time, hf was the best and only platform to do oss model stuff but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution to oss stuff. Please dont do this, or try finetuning the models you use...
Sorry for filling yall feed with this bs but yk...
AtAndDev
posted an update over 1 year ago
Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.
AtAndDev
posted an update over 1 year ago
@nroggendorff is that you sama?
AtAndDev
posted an update over 1 year ago
everywhere i go i see his face
AtAndDev
posted an update over 1 year ago
Deepseek gang on fire fr fr
AtAndDev
posted an update over 1 year ago
R1 is out! And with a lot of other R1 related models...
AtAndDev
posted an update over 1 year ago
@s3nh Hey man check your discord! Got some news.
Niansuh
posted an update almost 2 years ago
Niansuh
posted an update about 2 years ago
raincandy-u
posted an update about 2 years ago
🤗 I trained what is probably the smallest (~600K parameters) TinyStories model! It really can write grammatically correct stories!

raincandy-u/TinyStories-656K

Try this space based on this minuscule model!

https://huggingface.co/spaces/raincandy-u/Story-Teller

Edit: Moreover, the model weight size is only 1.31MB under bf16, and can be reduced to the 700KB level when using Q8_0 quantization U•ェ•*U

Edit: Now 1000K params chat model!

raincandy-u/TinyChat-1776K
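The size figures quoted in the first edit above (1.31MB in bf16, roughly 700KB in Q8_0) can be reproduced with simple arithmetic. This is a sketch under assumptions: the parameter count is read as exactly 656,000 from the "656K" in the model name, every tensor is assumed to be quantized, file metadata is ignored, and sizes use decimal units. The Q8_0 layout used here is the GGML one: blocks of 32 int8 weights plus one fp16 scale, so 34 bytes per block.

```python
PARAMS = 656_000  # assumption: taken from the "656K" in the model name

BF16_BYTES_PER_WEIGHT = 2

# GGML Q8_0: each block stores 32 int8 weights plus one fp16 scale.
Q8_0_BLOCK_WEIGHTS = 32
Q8_0_BLOCK_BYTES = 32 * 1 + 2


def bf16_size_bytes(n_params: int) -> int:
    """Raw weight size when every parameter is stored as bf16."""
    return n_params * BF16_BYTES_PER_WEIGHT


def q8_0_size_bytes(n_params: int) -> int:
    """Raw weight size when every parameter is stored in Q8_0 blocks."""
    blocks = -(-n_params // Q8_0_BLOCK_WEIGHTS)  # ceiling division
    return blocks * Q8_0_BLOCK_BYTES


bf16 = bf16_size_bytes(PARAMS)
q8 = q8_0_size_bytes(PARAMS)
print(f"bf16: {bf16 / 1e6:.2f} MB, Q8_0: {q8 / 1e3:.0f} KB")
# prints "bf16: 1.31 MB, Q8_0: 697 KB"
```

Q8_0 costs about 1.06 bytes per weight rather than exactly 1, because of the per-block scale. A real quantized file would differ slightly, since some tensors are often kept at higher precision and the file carries metadata.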
Niansuh
posted an update about 2 years ago
**Model Names:** gpt-4-turbo-preview, gpt-4-vision-preview, gpt-3.5-turbo-16k
**Searchable Models:** Creative, Balanced, Precise

Image creation will be available soon in NiansuhAI.
**Model Name:** DALL-E 3

https://huggingface.co/spaces/NiansuhAI/LLMs1
---
Niansuh
posted an update about 2 years ago
raincandy-u
posted an update about 2 years ago
First post, thanks HF! 🤗

Here is a Claude 3 Sonnet generated dataset using prompts from WildChat:

raincandy-u/claudy-chat-5k
ff670
updated a Space over 2 years ago