# LMMs Eval Documentation

Welcome to the documentation for `lmms-eval` - a unified evaluation framework for Large Multimodal Models! This framework enables consistent and reproducible evaluation of multimodal models across a wide range of tasks and modalities, including images, videos, and audio.

## Overview

`lmms-eval` provides:

- Standardized evaluation protocols for multimodal models
- Support for image, video, and audio tasks
- Easy integration of new models and tasks
- Reproducible benchmarking with shareable configurations

The majority of this documentation is adapted from [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/). A minimal example command is sketched at the end of this page.

## Table of Contents

* **[Commands Guide](commands.md)** - Learn about command line flags and options
* **[Model Guide](model_guide.md)** - How to add and integrate new models
* **[Task Guide](task_guide.md)** - Create custom evaluation tasks
* **[Current Tasks](current_tasks.md)** - List of all supported evaluation tasks
* **[Run Examples](run_examples.md)** - Example commands for running evaluations
* **[Caching](caching.md)** - Enable and reload results from the JSONL cache
* **[Version 0.3 Features](lmms-eval-0.3.md)** - Audio evaluation and new features
* **[Throughput Metrics](throughput_metrics.md)** - Understanding performance metrics

## Additional Resources

* For dataset formatting tools, see [lmms-eval tools](https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/main/tools)
* For the latest updates, visit our [GitHub repository](https://github.com/EvolvingLMMs-Lab/lmms-eval)
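
For orientation, the sketch below shows roughly what an evaluation run looks like from the command line. The model name, checkpoint, and task used here are illustrative placeholders, not recommendations; the [Commands Guide](commands.md) and [Run Examples](run_examples.md) pages document the actual flags and supported values.

```bash
# A minimal sketch of an evaluation run. The model ("llava"), its checkpoint,
# and the task ("mme") are placeholders - consult the Commands Guide and
# Current Tasks pages for the options available in your installation.
python3 -m lmms_eval \
    --model llava \
    --model_args pretrained="liuhaotian/llava-v1.5-7b" \
    --tasks mme \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/
```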