A real-time Streaming Data to RAG system that listens to live radio, transcribes it on-the-fly, and lets you query across TIME.
Not just "what was discussed" – but "what happened in the last 10 minutes on channel 0?" or "at 9 AM, what was the breaking news?" This is RAG that understands temporal context.
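As a rough illustration of what "querying across time" could mean in practice, the sketch below filters transcript chunks by channel and time window before any retrieval step runs. The `Chunk` type, the `in_window` and `last_minutes` helpers, and the sample data are all hypothetical; the post does not describe the system's actual storage or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Chunk:
    """One transcribed segment (hypothetical schema, not from the post)."""
    channel: int
    start: datetime
    text: str


def in_window(chunks, channel, start, end):
    """Return chunks on `channel` whose start time falls in [start, end)."""
    return [c for c in chunks if c.channel == channel and start <= c.start < end]


def last_minutes(chunks, channel, minutes, now):
    """Return chunks on `channel` from the `minutes` before `now`."""
    return in_window(chunks, channel, now - timedelta(minutes=minutes), now)


# Made-up sample transcript data for illustration only.
now = datetime(2025, 1, 1, 9, 30)
chunks = [
    Chunk(0, datetime(2025, 1, 1, 9, 0), "Breaking: bridge closed downtown"),
    Chunk(0, datetime(2025, 1, 1, 9, 22), "Traffic update on the ring road"),
    Chunk(1, datetime(2025, 1, 1, 9, 25), "Weather forecast for the weekend"),
    Chunk(0, datetime(2025, 1, 1, 9, 28), "Interview with the mayor"),
]

# "What happened in the last 10 minutes on channel 0?"
recent = last_minutes(chunks, channel=0, minutes=10, now=now)

# "At 9 AM, what was the breaking news?" (assumed here to mean a 5-minute window)
nine_am = in_window(
    chunks, 0, datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 9, 5)
)
```

In a full pipeline, the filtered chunks would presumably then be ranked by embedding similarity against the question, so the time filter narrows the candidates rather than replacing semantic retrieval.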
AetherMind_SRL: How I beat 7B models on MMLU with 184M params and a $300 GPU

I’m Sameer, a solo researcher from Iraq working on a single RTX 3050 8GB laptop. Today I’m releasing AetherMind_SRL – a 184M-parameter NLI model that was trained only on NLI tasks (SNLI, MNLI, ANLI, and a small clinical Alzheimer’s dataset). It was never fine-tuned on, or even shown, a single MMLU question during training.

Yet here are the zero-shot MMLU (57 subjects) results:

| Model | Params | MMLU Zero-Shot | Training Data |
|---|---|---|---|
| AetherMind_SRL (me) | 184M | 36.05 % | Only NLI (SNLI/MNLI/ANLI + ADNI) |
| DeBERTa-v3-base | 278M | ~30.8 % | General pre-training |
| BERT-large | 340M | 27–30 % | General pre-training |
| LLaMA-1 7B | 7B | 34–35 % | Massive text corpus |
| LLaMA-2 7B | 7B | ~45 % | Bigger + better data |
Yes – my 184M model beats every classic 300–400M model and the original 7-billion-parameter LLaMA-1, all while running at 300+ samples/sec on a $300 laptop GPU.

How did this happen?

I built a standardized self-improvement loop called AetherMind Self-Reflective Learning (SRL) v1.0:

1. Train normally on NLI
2. Let the model predict on hard adversarial data (ANLI)
3. Log every mistake + low-confidence case
4. Build a balanced “SMART” buffer (60% errors + 40% correct anchors)
5. Fine-tune with tiny LR and error-weighted loss
6. Repeat until stable

That’s it. No external knowledge, no MMLU data, no cluster. Just pure reasoning transfer from entailment/contradiction patterns → real-world knowledge.

Try it yourself (Python):

    from transformers import pipeline
    import torch
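The buffer-building step of the SRL loop described above could look roughly like the sketch below. This is not the author's code: the `build_smart_buffer` function, the record fields (`label`, `pred`, `confidence`), the 0.6 low-confidence threshold, and the choice to count low-confidence correct predictions in the "errors" share are all assumptions made for illustration. Only the 60/40 split comes from the post.

```python
import random


def build_smart_buffer(predictions, size, error_frac=0.6, low_conf=0.6, seed=0):
    """Sample a buffer of roughly `error_frac` hard cases and the rest correct anchors.

    `predictions` is a list of dicts with keys 'label', 'pred', 'confidence'
    (a hypothetical logging format, not taken from the post).
    """
    rng = random.Random(seed)
    # Hard cases: wrong predictions or low-confidence ones (assumed grouping).
    hard = [
        p for p in predictions
        if p["pred"] != p["label"] or p["confidence"] < low_conf
    ]
    # Anchors: confident, correct predictions that stabilise fine-tuning.
    anchors = [
        p for p in predictions
        if p["pred"] == p["label"] and p["confidence"] >= low_conf
    ]
    n_hard = min(len(hard), round(size * error_frac))
    n_anchor = min(len(anchors), size - n_hard)
    return rng.sample(hard, n_hard) + rng.sample(anchors, n_anchor)


# Made-up predictions: every even-indexed example is misclassified.
predictions = [
    {"label": 0, "pred": 1 if i % 2 == 0 else 0, "confidence": 0.9}
    for i in range(20)
]
buffer = build_smart_buffer(predictions, size=10)
n_errors = sum(p["pred"] != p["label"] for p in buffer)
```

The resulting buffer would then feed the low-learning-rate fine-tuning pass, with the error-weighted loss giving the hard cases more influence than the anchors.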
🤗 Sentence Transformers is joining Hugging Face! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face! Details:
Today, the Ubiquitous Knowledge Processing (UKP) Lab is transferring the project to Hugging Face. Sentence Transformers will remain a community-driven, open-source project, with the same open-source license (Apache 2.0) as before. Contributions from researchers, developers, and enthusiasts are welcome and encouraged. The project will continue to prioritize transparency, collaboration, and broad accessibility.
We see an increasing wish from companies to move from large LLM APIs to local models for better control and privacy, reflected in the library's growth: in just the last 30 days, Sentence Transformer models have been downloaded >270 million times, second only to transformers.
I would like to thank the UKP Lab, and especially Nils Reimers and Iryna Gurevych, both for their dedication to the project and for their trust in me, both now and two years ago. Back then, neither of you knew me well, yet you trusted me to take the project to new heights. That choice ended up being very valuable for the embedding & Information Retrieval community, and I think the choice to grant Hugging Face stewardship will be similarly successful.
I'm very excited about the future of the project, and for the world of embeddings and retrieval at large!
Takara takes 3rd place in the {tech:munich} AI hackathon with Fudeno!
A little over 2 weeks ago, @aldigobbler and I set out to create the largest multimodal SVG dataset ever created. We succeeded, and while I was in Munich, Germany, I took it one step further and built an entire app with it!
We fine-tuned Mistral Small, built a Next.js application, and blew some minds, taking 3rd place out of over 100 hackers. So cool!
👀 Multimodal
> Mistral AI released a 24B vision LM, both base and instruction FT versions, sota 🔥 (OS)
> with IBM we released SmolDocling, a sota 256M document parser with Apache 2.0 license (OS)
> SpatialLM is a new vision LM that outputs 3D bounding boxes, comes with 0.5B (QwenVL based) and 1B (Llama based) variants
> SkyWork released SkyWork-R1V-38B, new vision reasoning model (OS)

💬 LLMs
> NVIDIA released new Nemotron models in 49B and 8B with their post-training dataset
> LG released EXAONE, new reasoning models in 2.4B, 7.8B and 32B
> Dataset: Glaive AI released a new reasoning dataset of 22M+ examples
> Dataset: NVIDIA released new helpfulness dataset HelpSteer3
> Dataset: OpenManusRL is a new agent dataset based on ReAct framework (OS)
> Open-R1 team released OlympicCoder, new competitive coder model in 7B and 32B
> Dataset: GeneralThought-430K is a new reasoning dataset (OS)

🖼️ Image Generation/Computer Vision
> Roboflow released RF-DETR, new real-time sota object detector (OS) 🔥
> YOLOE is a new real-time zero-shot object detector with text and visual prompts 🥹
> Stability AI released Stable Virtual Camera, a new novel view synthesis model
> Tencent released Hunyuan3D-2mini, new small and fast 3D asset generation model
> ByteDance released InfiniteYou, new realistic photo generation model
> StarVector is a new 8B model that generates svg from images
> FlexWorld is a new model that expands 3D views (OS)

🎤 Audio
> Sesame released CSM-1B new speech generation model (OS)

🤖 Robotics
> NVIDIA released GR00T, new robotics model for generalized reasoning and skills, along with the dataset