Multilingual Translator + Semantic Search (Enhanced)
This project is a smart multilingual translator web app that offers:
- Automatic language detection
- High-quality translation between Indian and foreign languages
- Semantic search to find similar Sanskrit-based concepts
- Optional BLEU score evaluation (with human reference)
- Downloadable report summarizing the output
- Input length handling to avoid translation errors
Developed using Hugging Face Transformers, Sentence Transformers, FAISS, and Gradio, and deployable to Hugging Face Spaces.
Input Limit Notice
Please enter at most 3 lines or 2000 characters.
- If the input is too long, the app shows an error and skips translation.
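The limit above could be enforced with a check along these lines. This is a minimal sketch, not the app's actual code: the function name `validate_input`, the constants, and the error messages are hypothetical, and it assumes that exceeding either limit (lines or characters) rejects the input.

```python
MAX_LINES = 3      # limit stated in the notice above
MAX_CHARS = 2000   # limit stated in the notice above


def validate_input(text: str) -> tuple[bool, str]:
    """Return (ok, message); message is empty when the input is accepted."""
    stripped = text.strip()
    if not stripped:
        return False, "Please enter some text to translate."
    if len(stripped) > MAX_CHARS:
        return False, (
            f"Input is too long: {len(stripped)} characters "
            f"(maximum {MAX_CHARS})."
        )
    line_count = len(stripped.splitlines())
    if line_count > MAX_LINES:
        return False, (
            f"Input is too long: {line_count} lines (maximum {MAX_LINES})."
        )
    return True, ""
```

A caller would run this check before language detection and skip translation whenever the first element of the result is `False`.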
Live Demo
Click here to try the app on Hugging Face Spaces
Features
| Feature | Description |
|---|---|
| Language Detection | Auto-identifies input language using xlm-roberta-base-language-detection |
| Translation | Uses Facebook's NLLB-200-distilled-600M model |
| Semantic Search | Finds similar Sanskrit concepts using Sentence Transformers + FAISS |
| BLEU Score | Optional evaluation metric (if human reference is provided) |
| Semantic Plot | Horizontal bar chart for top 3 semantic similarity scores |
| Download Report | Creates a .txt file (includes all outputs + BLEU score) |
| Error Handling | Graceful messages for empty or long input |
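To illustrate the Semantic Search and Semantic Plot rows, here is a dependency-free sketch of a top-3 similarity ranking. It is a stand-in, not the app's implementation: the app uses Sentence Transformers embeddings with a FAISS index, whereas this example uses plain Python lists and made-up 3-dimensional vectors. The concept names and values are hypothetical. An inner-product index such as FAISS `IndexFlatIP` over L2-normalized embeddings would produce the same cosine-based ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_matches(query_vec, concept_vecs, k=3):
    """Return the k (concept, score) pairs most similar to the query."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in concept_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values, for illustration only).
concepts = {
    "dharma": [0.9, 0.1, 0.0],
    "karma": [0.7, 0.7, 0.0],
    "yoga": [0.0, 1.0, 0.0],
    "moksha": [0.1, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

for name, score in top_k_matches(query, concepts):
    print(f"{name}: {score:.3f}")
```

The three returned scores are the kind of values the horizontal bar chart would display.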
Supported Languages
| Code | Language |
|---|---|
| eng_Latn | English |
| hin_Deva | Hindi |
| tam_Taml | Tamil |
| tel_Telu | Telugu |
| san_Deva | Sanskrit |
| fra_Latn | French |
| spa_Latn | Spanish |
| deu_Latn | German |
| jpn_Jpan | Japanese |
| zho_Hans | Chinese |
| arb_Arab | Arabic |
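The codes in this table are the language identifiers NLLB expects. The sketch below shows one way they might be wired into a translation call. The helper names `resolve_language_code` and `translate` are hypothetical, and the hub ID `facebook/nllb-200-distilled-600M` is assumed from the model name given in the Features table. The `transformers` import is deferred so the lookup helper works without the library installed.

```python
SUPPORTED_LANGUAGES = {
    "eng_Latn": "English",
    "hin_Deva": "Hindi",
    "tam_Taml": "Tamil",
    "tel_Telu": "Telugu",
    "san_Deva": "Sanskrit",
    "fra_Latn": "French",
    "spa_Latn": "Spanish",
    "deu_Latn": "German",
    "jpn_Jpan": "Japanese",
    "zho_Hans": "Chinese",
    "arb_Arab": "Arabic",
}


def resolve_language_code(name: str) -> str:
    """Map a language name (case-insensitive) to its NLLB code."""
    wanted = name.strip().lower()
    for code, language in SUPPORTED_LANGUAGES.items():
        if language.lower() == wanted:
            return code
    raise ValueError(f"Unsupported language: {name!r}")


def translate(text: str, src_code: str, tgt_code: str) -> str:
    """Translate text between two NLLB language codes (downloads the model)."""
    # Imported here so the helpers above do not require transformers.
    from transformers import pipeline

    translator = pipeline(
        "translation",
        model="facebook/nllb-200-distilled-600M",  # assumed hub ID
        src_lang=src_code,
        tgt_lang=tgt_code,
    )
    return translator(text)[0]["translation_text"]
```

For example, `translate("Good morning", "eng_Latn", resolve_language_code("Hindi"))` would request an English-to-Hindi translation. In a real app the pipeline would be created once and reused rather than rebuilt on every call.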
Downloadable Report
The app generates a .txt file containing:
- Detected source language
- Translated output
- Semantic matches (with similarity scores)
- BLEU score (if a human reference translation is given)
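A report with the fields listed above could be assembled roughly as follows. This is an illustrative sketch only: the function names, the default file name `translation_report.txt`, and the exact layout of the text are assumptions, not the app's actual format.

```python
def build_report(detected_language, translation, matches, bleu=None):
    """Build the report text; the BLEU line appears only when a score is given."""
    lines = [
        f"Detected source language: {detected_language}",
        f"Translated output: {translation}",
        "Semantic matches:",
    ]
    for concept, score in matches:
        lines.append(f"  - {concept} (similarity: {score:.3f})")
    if bleu is not None:
        lines.append(f"BLEU score: {bleu:.2f}")
    return "\n".join(lines) + "\n"


def save_report(report, path="translation_report.txt"):
    """Write the report as UTF-8 so non-Latin scripts are preserved."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(report)
    return path
```

Writing the file as UTF-8 matters here because the translated output may be in Devanagari, Tamil, Arabic, or other non-Latin scripts.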
Future Enhancements
- Speech-to-text input support
- Text-to-speech audio output
- OCR: Translate text from uploaded images
- Add more Indian languages and transliteration features
Author
Jeevitha Meenakshisundaram
M.Sc. Data Science, SASTRA University
License
This project is licensed under the MIT License.