metadata
title: Stable Audio Open Small - 4 Variations
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.20.0
app_file: app.py
pinned: false
license: other
Stable Audio Open Small - 4 Variations
Generate up to 4 audio variations from a single text prompt using Stability AI's Stable Audio Open Small model.
Model Information
Model: stabilityai/stable-audio-open-small
- Type: Latent diffusion model (DiT) with autoencoder
- Sample Rate: 44.1 kHz
- Format: Stereo audio
- Max Duration: 11 seconds
- License: Stability AI Community License
Features
- 4 Variations: Generate 4 different audio variations from a single prompt
- Text-to-Audio: Simple text prompt interface
- Variable Duration: Control audio length (1-11 seconds)
- Fast Generation: Uses the pingpong sampler with only 8 diffusion steps
Setup
This model requires accepting the license agreement on Hugging Face. To use this Space:
- Accept the model license: Visit stabilityai/stable-audio-open-small and accept the license agreement
- Create an access token: Go to Settings > Access Tokens and create a token with "read" permissions
- Add token to Space: In your Space settings, go to "Variables and secrets" and add a new secret:
  - Name: HF_TOKEN
  - Value: Your access token
  - Make sure it's marked as private
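The token setup above can be checked from Python with a minimal sketch like the following. It assumes the Space exposes the secret as the `HF_TOKEN` environment variable and that `huggingface_hub` is installed; the helper names `get_hf_token` and `login_to_hub` are hypothetical and not taken from this Space's `app.py`.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN secret, or raise with a hint if it is missing."""
    # Accept an explicit mapping so the lookup is easy to exercise in isolation.
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Add it under 'Variables and secrets' "
            "in the Space settings."
        )
    return token


def login_to_hub(env=None):
    """Authenticate with the Hub so the gated model can be downloaded."""
    # Imported lazily so get_hf_token works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=get_hf_token(env))
```

Calling `login_to_hub()` once at startup, before the model is loaded, should be enough; failing early with a clear message is usually easier to debug than a download error for a gated repository.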
Usage
- Enter a text prompt describing the audio you want to generate
- Adjust the duration slider (1-11 seconds)
- Click "Generate" to create 4 variations
- Listen to and download your favorite variations
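For readers curious how such an interface might be wired up, here is a rough Gradio sketch of the steps above: one prompt box, one duration slider, and four audio outputs. It is an assumption about the layout, not a copy of this Space's `app.py`; `clamp_duration`, `build_demo`, and the `generate_fn` callback are hypothetical names.

```python
MIN_SECONDS, MAX_SECONDS = 1, 11  # duration range stated in this README


def clamp_duration(seconds):
    """Keep a requested duration inside the supported 1-11 second range."""
    return max(MIN_SECONDS, min(MAX_SECONDS, float(seconds)))


def build_demo(generate_fn):
    """Build a simple UI; generate_fn(prompt, seconds) must return 4 audio values."""
    # Imported lazily so clamp_duration can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=generate_fn,
        inputs=[
            gr.Textbox(label="Prompt"),
            gr.Slider(
                minimum=MIN_SECONDS,
                maximum=MAX_SECONDS,
                value=MAX_SECONDS,
                step=1,
                label="Duration (seconds)",
            ),
        ],
        # One player per variation, each with its own download control.
        outputs=[gr.Audio(label=f"Variation {i + 1}") for i in range(4)],
        submit_btn="Generate",
    )
```

A call such as `build_demo(my_generate).launch()` would start the app, where `my_generate` is whatever function produces the four clips.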
Example Prompts
- "128 BPM tech house drum loop"
- "Ocean waves crashing on beach"
- "Jazz piano melody"
- "Rainforest ambience with bird calls"
- "Electronic synth pad"
Model Limitations
- The model is not able to generate realistic vocals
- Trained with English descriptions - may not perform as well in other languages
- Better at generating sound effects and field recordings than music
- Performance varies across different music styles and cultures
- Prompt engineering may be required for best results
Technical Details
- Steps: 8 (optimized for speed)
- CFG Scale: 1.0
- Sampler: pingpong
- Batch Size: 4 (for generating variations)
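As an illustration, the settings above could be passed to `stable_audio_tools` roughly as follows. This is a sketch modeled on the library's documented inference usage, not the exact code in this Space; `build_generation_kwargs` and `generate_variations` are hypothetical helper names, and using one conditioning entry per batch item is an assumption about how the four variations are produced.

```python
def build_generation_kwargs(prompt, seconds_total, batch_size=4):
    """Collect the settings listed above into keyword arguments."""
    # One conditioning dict per batch item. The prompts are identical; the
    # variations are assumed to differ because each item starts from its own noise.
    conditioning = [
        {"prompt": prompt, "seconds_total": seconds_total}
        for _ in range(batch_size)
    ]
    return {
        "steps": 8,
        "cfg_scale": 1.0,
        "sampler_type": "pingpong",
        "batch_size": batch_size,
        "conditioning": conditioning,
    }


def generate_variations(prompt, seconds_total, device="cpu"):
    """Load the model and generate a batch of variations for one prompt."""
    # Heavy imports are deferred so the helper above stays dependency-free.
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-small")
    model = model.to(device)
    return generate_diffusion_cond(
        model,
        sample_size=model_config["sample_size"],
        device=device,
        **build_generation_kwargs(prompt, seconds_total),
    )
```

In a real app the model would be loaded once at startup rather than on every request; it is loaded inside the function here only to keep the sketch short.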
License
This Space uses the Stability AI Community License. For commercial use, please refer to stability.ai/license.
Model Card
For more information about the model, training data, and limitations, see the model card for stabilityai/stable-audio-open-small on Hugging Face.