Video-Text-to-Text
Transformers
Safetensors
internvl_chat
multimodal
video-understanding
temporal-localization
qwen
custom_code
Instructions to use UserJoseph/DisTime-1B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use UserJoseph/DisTime-1B with Transformers:
# Load model directly
from transformers import InternVLChatModelTime
model = InternVLChatModelTime.from_pretrained("UserJoseph/DisTime-1B", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
Add comprehensive model card for DisTime
#1
by nielsr HF Staff - opened
This PR significantly enhances the model card by:
- Adding the `pipeline_tag: video-text-to-text`, allowing the model to be discovered under relevant filters on the Hub.
- Specifying `library_name: transformers`, enabling the "How to use" widget for easier inference.
- Adding relevant tags such as `multimodal`, `video-understanding`, `temporal-localization`, and `qwen` for improved discoverability and context.
- Linking directly to the Hugging Face paper page: DisTime: Distribution-based Time Representation for Video Large Language Models.
- Providing a link to the official GitHub repository for code and further details.
- Including the full abstract and a clear `transformers`-based usage example for quick understanding and implementation.
- Adding the citation information and acknowledgements.
- Removing the unnecessary "File information" section.
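
For orientation, the metadata changes listed above can be sketched as the YAML front matter they would produce at the top of the model card. This is a minimal illustration, not the merged README itself: the field values are taken from the list above, while the helper name `render_front_matter`, the field order, and the absence of other fields (a license, for example) are assumptions.

```python
# Metadata fields named in the PR description. Nothing here is read from the Hub,
# and the real model card may contain additional fields.
metadata = {
    "pipeline_tag": "video-text-to-text",
    "library_name": "transformers",
    "tags": ["multimodal", "video-understanding", "temporal-localization", "qwen"],
}


def render_front_matter(fields):
    """Render a flat dict of strings and string lists as a YAML front-matter block.

    Hypothetical helper for illustration; it handles only the two value shapes
    used above and does no quoting or escaping.
    """
    lines = ["---"]
    for key, value in fields.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


front_matter = render_front_matter(metadata)
print(front_matter)
```

The Hub reads a block of this shape from the start of `README.md`; `pipeline_tag` and `tags` drive filtering, and `library_name` selects which usage snippet the page shows.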
UserJoseph changed pull request status to merged