Spaces:
Sleeping
Sleeping
File size: 1,358 Bytes
f4b6d22 980c187 e4e6a48 f4b6d22 980c187 f4b6d22 980c187 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
---
title: Urdu STT with GPT-OSS
emoji: ποΈ
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: High-quality Urdu STT with Faster-Whisper and LLM.
---
# ποΈ Faster Urdu ASR
This Space provides **state-of-the-art Urdu Automatic Speech Recognition (ASR)** built on [Faster-Whisper](https://github.com/guillaumekln/faster-whisper), fine-tuned for Urdu.
In addition to transcription, it offers **optional polishing with Groqβs `openai/gpt-oss-120b` LLM** to improve Urdu grammar, punctuation, and fluency.
## β¨ Features
- π€ **Audio input** via upload or direct microphone recording
- π Multiple output formats: plain text, `.srt`, `.vtt`, `.json`
- β‘ Built on **Faster-Whisper (CT2)** for efficient GPU/CPU inference
- π€ **Optional LLM polishing** with Groq API for natural, improved Urdu text
- π Works with environment variable `GROQ_API_KEY` or via UI input
## π Usage
1. Upload or record an Urdu audio file.
2. Choose output format (`text`, `srt`, `vtt`, `json`).
3. (Optional) Enable **LLM Polishing** to improve transcription quality.
- Provide a valid **`GROQ_API_KEY`** if not set in your environment.
- Adjust temperature and system prompt as needed.
4. Click **Transcribe** and view/download your results.
|