"Darwin-27B-Opus: Surpassing the Foundation Model Without Training" FINAL-Bench • about 1 month ago • 13
Darwin V6: Diagnostic-Guided Evolutionary Model Merging FINAL-Bench • Apr 8 • 11
"The Child That Surpassed Both Parents Through MRI-Guided Evolutionary Merge" FINAL-Bench • Mar 31 • 14
Introducing WM Bench: A Benchmark for Cognitive Intelligence in World Models FINAL-Bench • Mar 29 • 13
MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning FINAL-Bench • Mar 9 • 16
Structural Problems in AI Benchmarking and the Case for a Unified Evaluation Framework FINAL-Bench • Mar 8 • 12
Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do FINAL-Bench • Mar 10 • 38