microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition
Transformers
Safetensors
24 languages
phi4mm
text-generation
nlp
code
audio
speech-summarization
speech-translation
visual-question-answering
phi-4-multimodal
phi
phi-4-mini
custom_code
arxiv:2503.01743
arxiv:2407.13833
License: mit
Branch: main
Path: Phi-4-multimodal-instruct/examples/what_is_the_traffic_sign_in_the_image.wav
Last commit: "Add examples" by nguyenbh (bd4b39b), 10 months ago
Safe
741 kB
This file contains binary data. It cannot be displayed, but you can still download it.