Hugging Face
Open to Work
Anthony Culver
Makemericalshappen
1 follower · 2 following
AI & ML interests
None yet
Recent Activity
replied to dystrio's post about 1 month ago
**Dystrio Sculpt — Dense, smaller drop-in replacements for Mistral 7B, Llama 3.1 8B, and Qwen 2.5 7B**

We built a structural compiler that produces smaller, dense models from existing checkpoints. No sparsity, no custom kernels, no new ops — output models load with standard `transformers`, work with vLLM, TGI, and llama.cpp, and stack with AWQ/GPTQ/GGUF.

Results (default tier, bf16, A100 80GB):

- [Mistral 7B Instruct v0.3 → sculpt-default](https://huggingface.co/dystrio/Mistral-7B-Instruct-v0.3-sculpt-default) — 11% smaller, PPL ratio 0.923 (quality improved), +10% prefill, -8% TTFT
- [Llama 3.1 8B Instruct → sculpt-default](https://huggingface.co/dystrio/Llama-3.1-8B-Instruct-sculpt-default) — 10% smaller, PPL ratio 1.064 (≈same), +8% prefill, -8% TTFT
- [Qwen 2.5 7B Instruct → sculpt-default](https://huggingface.co/dystrio/Qwen2.5-7B-Instruct-sculpt-default) — 9% smaller, PPL ratio 0.990 (quality improved), +7% prefill, -6% TTFT

PPL ratio = WikiText-103 perplexity relative to the original; below 1.0 means quality improved. More aggressive tiers are available: each model has 3-4 tiers trading quality for size, up to 30% smaller. Check the model cards for the full benchmark tables and tier comparisons.

All models: [huggingface.co/dystrio](https://huggingface.co/dystrio)
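The PPL ratio the post reports is a plain quotient of perplexities. A minimal sketch of that metric; the perplexity values used in the example are hypothetical, not taken from the model cards:

```python
def ppl_ratio(sculpt_ppl: float, original_ppl: float) -> float:
    """WikiText-103 perplexity of the sculpted model divided by the
    original model's perplexity. Below 1.0 means the sculpted model
    scores better (lower perplexity) than the checkpoint it replaces."""
    return sculpt_ppl / original_ppl


# Hypothetical perplexities, for illustration only:
ratio = ppl_ratio(5.54, 6.00)
print(f"{ratio:.3f}")  # prints 0.923 -> quality improved
```

A ratio of 1.064, as reported for the Llama 3.1 8B sculpt, means perplexity rose by about 6% relative to the original, which the post rounds to "≈same".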
replied to dystrio's post about 1 month ago
Organizations
Makemericalshappen
replied to dystrio's post about 1 month ago (this comment has been hidden)
replied to dystrio's post about 1 month ago (this comment has been hidden)