Spaces:
Runtime error
Runtime error
File size: 5,795 Bytes
8c2765a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 |
# Music Flamingo Code Flow
```mermaid
flowchart TD
Start([App Starts]) --> Init[Initialize App]
Init --> LoadModel[Load Music Flamingo Model<br/>processor & model from MODEL_ID]
LoadModel --> SetupProxy{Check for<br/>SSH Proxy?}
SetupProxy -->|Yes| CreateTunnel[Create SSH Tunnel]
SetupProxy -->|No| Ready[App Ready]
CreateTunnel --> Ready
Ready --> UI[Gradio UI Loaded]
UI --> UserInput{User Input}
UserInput -->|Upload Audio| AudioFile[Audio File Path]
UserInput -->|YouTube URL| YouTubeURL[YouTube URL String]
UserInput -->|Load Button| LoadYouTube[Load YouTube Audio]
LoadYouTube --> DownloadYT[download_youtube_audio]
DownloadYT --> CheckCache{URL in<br/>Cache?}
CheckCache -->|Yes & Exists| ReturnCached[Return Cached File]
CheckCache -->|No| ValidateURL[Validate YouTube URL<br/>with Regex]
ValidateURL -->|Invalid| Error1[Return Error Message]
ValidateURL -->|Valid| YTDL[yt-dlp Download]
YTDL --> ExtractAudio[Extract Audio to MP3]
ExtractAudio --> CacheFile[Cache File Path]
CacheFile --> ReturnFile[Return File Path]
ReturnCached --> AudioFile
ReturnFile --> AudioFile
AudioFile --> UserPrompt[User Enters Prompt]
UserPrompt --> ClickGenerate[Click Generate Button]
ClickGenerate --> Infer[infer Function]
Infer --> DetermineSource{Audio Source?}
DetermineSource -->|File Upload| UseFile[Use audio_path]
DetermineSource -->|YouTube| DownloadIfNeeded[Download if not cached]
DownloadIfNeeded --> UseFile
UseFile --> CreateConversation[Create Conversation Format]
CreateConversation --> FormatInput["conversations = [<br/> [{<br/> 'role': 'user',<br/> 'content': [<br/> {'type': 'text', 'text': prompt},<br/> {'type': 'audio', 'path': file}<br/> ]<br/> }]<br/>]"]
FormatInput --> ApplyTemplate[processor.apply_chat_template]
ApplyTemplate --> Tokenize[Tokenize Input]
Tokenize --> MoveToDevice[Move to model.device]
MoveToDevice --> Generate[model.generate<br/>max_new_tokens=4096]
Generate --> Decode[processor.batch_decode]
Decode --> FormatOutput[Format Result with Status]
FormatOutput --> Display[Display in Gradio UI]
Error1 --> Display
style Start fill:#90EE90
style LoadModel fill:#FFD700
style Generate fill:#FF6B6B
style Display fill:#4ECDC4
style Error1 fill:#FF6B6B
```
## Detailed Function Flow
### 1. Initialization Flow
```mermaid
sequenceDiagram
participant App
participant Model
participant Proxy
App->>Proxy: Check SSH environment variables
alt Proxy Available
Proxy->>Proxy: Create SSH tunnel
Proxy->>App: PROXY_URL set
end
App->>Model: Load processor from MODEL_ID
App->>Model: Load model with device_map="auto"
Model->>App: Model ready
App->>App: Launch Gradio UI
```
### 2. YouTube Download Flow
```mermaid
flowchart LR
A[YouTube URL] --> B{Valid URL?}
B -->|No| C[Return Error]
B -->|Yes| D{Cached?}
D -->|Yes| E{File Exists?}
E -->|Yes| F[Return Cached]
E -->|No| G[Download]
D -->|No| G
G --> H[yt-dlp Download]
H --> I[Extract to MP3]
I --> J[Cache File]
J --> K[Return Path]
style C fill:#FF6B6B
style F fill:#90EE90
style K fill:#90EE90
```
### 3. Model Inference Flow
```mermaid
sequenceDiagram
participant User
participant UI
participant Download
participant Processor
participant Model
User->>UI: Upload audio or YouTube URL
UI->>Download: Get audio file path
Download->>UI: Return file path
User->>UI: Enter prompt
User->>UI: Click Generate
UI->>Processor: Create conversation format
Processor->>Processor: apply_chat_template()
Processor->>Processor: Tokenize input
Processor->>Model: Send batch to device
Model->>Model: Generate tokens (max 4096)
Model->>Processor: Return token IDs
Processor->>Processor: batch_decode()
Processor->>UI: Return text result
UI->>User: Display response
```
## Key Functions
### download_youtube_audio()
```mermaid
flowchart TD
Start[download_youtube_audio] --> Validate[Validate URL with Regex]
Validate -->|Invalid| ReturnError[Return None, Error]
Validate -->|Valid| CheckCache{URL in Cache?}
CheckCache -->|Yes| CheckFile{File Exists?}
CheckFile -->|Yes| ReturnCached[Return Cached Path]
CheckFile -->|No| Download[Download Audio]
CheckCache -->|No| Download
Download --> YTDL[yt-dlp with Options]
YTDL --> Extract[Extract to MP3]
Extract --> Cache[Store in Cache]
Cache --> ReturnPath[Return Path, Status]
style ReturnError fill:#FF6B6B
style ReturnCached fill:#90EE90
style ReturnPath fill:#90EE90
```
### infer()
```mermaid
flowchart TD
Start[infer Function] --> GetAudio{Get Audio}
GetAudio -->|File Upload| UseFile[Use audio_path]
GetAudio -->|YouTube| DownloadYT[Download YouTube]
DownloadYT -->|Success| UseFile
DownloadYT -->|Error| ReturnError[Return Error]
UseFile --> CreateConv[Create Conversation]
CreateConv --> ApplyTemplate[Apply Chat Template]
ApplyTemplate --> Generate[Model Generate]
Generate --> Decode[Decode Output]
Decode --> Format[Format Result]
Format --> Return[Return Text]
style ReturnError fill:#FF6B6B
style Return fill:#90EE90
```
## Data Flow
```mermaid
flowchart LR
A[User Input] --> B{Input Type}
B -->|Audio File| C[File Path]
B -->|YouTube URL| D[Download Function]
D --> C
C --> E[Conversation Format]
E --> F[Processor]
F --> G[Model]
G --> H[Generated Text]
H --> I[UI Display]
style A fill:#4ECDC4
style G fill:#FF6B6B
style I fill:#90EE90
```
|