efficient-speech/lite-whisper-base

	---
	base_model: openai/whisper-base
	library_name: transformers
	license: apache-2.0
	pipeline_tag: automatic-speech-recognition
	tags:
	- audio
	- automatic-speech-recognition
	- whisper
	- hf-asr-leaderboard
	---

	Lite-Whisper is a version of OpenAI Whisper compressed with LiteASR, which shrinks the encoder through low-rank approximation. See our [GitHub repository](https://github.com/efeslab/LiteASR) and [paper](https://arxiv.org/abs/2502.20583) for details.
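
As a rough illustration of why a low-rank factorization saves parameters, the sketch below counts the weights of a dense linear layer against a rank-`r` factorization of the same layer. It is generic arithmetic, not LiteASR's actual procedure: the layer shape (512 by 2048) and the rank (128) are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def dense_params(d_in: int, d_out: int) -> int:
    """Number of weights in a dense linear layer (bias ignored)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Number of weights after factoring W (d_out x d_in) into
    B (d_out x rank) times A (rank x d_in)."""
    return rank * (d_in + d_out)


def break_even_rank(d_in: int, d_out: int) -> float:
    """Rank below which the factorized layer is smaller than the dense one."""
    return d_in * d_out / (d_in + d_out)


# Illustrative shape and rank (assumptions, not values taken from LiteASR).
d_in, d_out, rank = 512, 2048, 128

print("dense:     ", dense_params(d_in, d_out))
print("low-rank:  ", low_rank_params(d_in, d_out, rank))
print("break-even:", break_even_rank(d_in, d_out))
```

The factorization only pays off when the chosen rank is below the break-even value, so lower ranks give smaller layers at the cost of a coarser approximation.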

	## Benchmark Results

	The table below reports the average word error rate (WER, lower is better) on the [ESB datasets](https://huggingface.co/datasets/hf-audio/esb-datasets-test-only-sorted):

	| Model | Average WER (↓) | Encoder Size | Decoder Size |
	|-------|----------------|--------------|--------------|
	| [whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 22.01 | 7.63M | 29.55M |
	| [lite-whisper-tiny-acc](https://huggingface.co/efficient-speech/lite-whisper-tiny-acc) | 22.97 | 7.41M | 29.55M |
	| [lite-whisper-tiny](https://huggingface.co/efficient-speech/lite-whisper-tiny) | 23.95 | 7.00M | 29.55M |
	| [lite-whisper-tiny-fast](https://huggingface.co/efficient-speech/lite-whisper-tiny-fast) | 27.09 | 6.48M | 29.55M |
	|   |   |   |   |
	| [whisper-base](https://huggingface.co/openai/whisper-base) | 17.67 | 19.82M | 52.00M |
	| [lite-whisper-base-acc](https://huggingface.co/efficient-speech/lite-whisper-base-acc) | 19.07 | 18.64M | 52.00M |
	| [lite-whisper-base](https://huggingface.co/efficient-speech/lite-whisper-base) | 19.71 | 17.44M | 52.00M |
	| [lite-whisper-base-fast](https://huggingface.co/efficient-speech/lite-whisper-base-fast) | 23.05 | 16.07M | 52.00M |
	|   |   |   |   |
	| [whisper-small](https://huggingface.co/openai/whisper-small) | 15.89 | 87.00M | 153.58M |
	| [lite-whisper-small-acc](https://huggingface.co/efficient-speech/lite-whisper-small-acc) | 15.37 | 76.99M | 153.58M |
	| [lite-whisper-small](https://huggingface.co/efficient-speech/lite-whisper-small) | 14.96 | 70.16M | 153.58M |
	| [lite-whisper-small-fast](https://huggingface.co/efficient-speech/lite-whisper-small-fast) | 14.92 | 63.11M | 153.58M |
	|   |   |   |   |
	| [whisper-medium](https://huggingface.co/openai/whisper-medium) | 15.12 | 305.68M | 456.64M |
	| [lite-whisper-medium-acc](https://huggingface.co/efficient-speech/lite-whisper-medium-acc) | 13.46 | 269.93M | 456.64M |
	| [lite-whisper-medium](https://huggingface.co/efficient-speech/lite-whisper-medium) | 14.50 | 239.99M | 456.64M |
	| [lite-whisper-medium-fast](https://huggingface.co/efficient-speech/lite-whisper-medium-fast) | 14.52 | 215.31M | 456.64M |
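
To make the size/accuracy trade-off easier to read, the following sketch computes, for the base-sized rows of the table, how much smaller each encoder is than `whisper-base` and how many WER points it gives up. The numbers are copied from the table; the helper name `compare_to_baseline` is hypothetical and exists only for this illustration.

```python
# (model, average WER, encoder size in millions of parameters),
# copied from the base-sized rows of the benchmark table.
BASE_ROWS = [
    ("whisper-base", 17.67, 19.82),
    ("lite-whisper-base-acc", 19.07, 18.64),
    ("lite-whisper-base", 19.71, 17.44),
    ("lite-whisper-base-fast", 23.05, 16.07),
]


def compare_to_baseline(rows):
    """Map each model to (encoder reduction in %, WER change in points),
    both measured against the first row, which is treated as the baseline."""
    _, base_wer, base_enc = rows[0]
    result = {}
    for name, wer, enc in rows[1:]:
        reduction = (base_enc - enc) / base_enc * 100
        result[name] = (round(reduction, 1), round(wer - base_wer, 2))
    return result


for name, (reduction, delta) in compare_to_baseline(BASE_ROWS).items():
    print(f"{name}: encoder {reduction}% smaller, WER {delta:+.2f} points")
```

The same function can be applied to the tiny, small, or medium rows by passing a list whose first entry is the corresponding OpenAI baseline.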


	## Citation

	If you use LiteASR in your research, please cite the following paper:

	```bibtex
	@misc{kamahori2025liteasrefficientautomaticspeech,
	  title={LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation},
	  author={Keisuke Kamahori and Jungo Kasai and Noriyuki Kojima and Baris Kasikci},
	  year={2025},
	  eprint={2502.20583},
	  archivePrefix={arXiv},
	  primaryClass={cs.LG},
	  url={https://arxiv.org/abs/2502.20583},
	}
	```