Add pipeline tag

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -1,9 +1,9 @@
1
  ---
2
- license: other
3
- license_name: license-term-of-universal-audio-tokenizer
4
  language:
5
  - en
6
  - zh
 
 
7
  tags:
8
  - audio
9
  - audio-tokenizer
@@ -11,11 +11,12 @@ tags:
11
  - speech
12
  - sound
13
  - music
 
14
  ---
 
15
  # Universal Audio Tokenizer: Empowering Semantic Speech Tokenizers with General Audio Perception
16
 
17
- **Universal Audio Tokenizer** is a compact single-codebook audio tokenizer that unifies general audio perception and
18
- linguistic alignment for downstream Audio-LLMs.
19
 
20
  📄 [Paper](https://arxiv.org/abs/2605.31521) | 💻 [GitHub](https://github.com/Tencent/Universal_Audio_Tokenizer)
21
 
@@ -108,6 +109,7 @@ Also, you can directly run the inference code snippet below:
108
  ```python
109
  import os
110
  import torch
 
111
  from transformers import WhisperFeatureExtractor
112
  from src.model.modeling_whisper import WhisperVQEncoder
113
  from src.model.flow_inference import AudioDecoder
@@ -167,7 +169,7 @@ Our Universal Audio Tokenizer achieves high-quality speech reconstruction with a
167
 
168
  ### Superior Downstream Audio-LLM Performance
169
 
170
- When integrated with the Qwen2.5 LLM backbone, our Universal Audio Tokenizer yields superior performance on a wide range of downstream audio understanding benchmarks and controllable TTS synthesis tasks, demonstrating its effectiveness as a unified audio input/output interface for Audio-LLMs.
171
 
172
  #### Audio Understanding
173
 
@@ -208,4 +210,4 @@ If you find our code or model useful for your research, please cite:
208
 
209
  ## License
210
 
211
- This project is licensed under the [License Term of Universal_Audio_Tokenizer](LICENSE).
 
1
  ---
 
 
2
  language:
3
  - en
4
  - zh
5
+ license: other
6
+ license_name: license-term-of-universal-audio-tokenizer
7
  tags:
8
  - audio
9
  - audio-tokenizer
 
11
  - speech
12
  - sound
13
  - music
14
+ pipeline_tag: audio-to-audio
15
  ---
16
+
17
  # Universal Audio Tokenizer: Empowering Semantic Speech Tokenizers with General Audio Perception
18
 
19
+ **Universal Audio Tokenizer** (UniAudio-Token) is a compact single-codebook audio tokenizer that unifies general audio perception and linguistic alignment for downstream Audio-LLMs.
 
20
 
21
  📄 [Paper](https://arxiv.org/abs/2605.31521) | 💻 [GitHub](https://github.com/Tencent/Universal_Audio_Tokenizer)
22
 
 
109
  ```python
110
  import os
111
  import torch
112
+ from huggingface_hub import snapshot_download
113
  from transformers import WhisperFeatureExtractor
114
  from src.model.modeling_whisper import WhisperVQEncoder
115
  from src.model.flow_inference import AudioDecoder
 
169
 
170
  ### Superior Downstream Audio-LLM Performance
171
 
172
+ When integrated with the Qwen2.5 LLM backbone, our Universal Audio Tokenizer yields superior performance on a wide range of downstream audio understanding benchmarks and controllable TTS synthesis tasks.
173
 
174
  #### Audio Understanding
175
 
 
210
 
211
  ## License
212
 
213
+ This project is licensed under the [License Term of Universal_Audio_Tokenizer](LICENSE).