This repository is publicly accessible, but you must accept the following conditions to access its files and content.

  1. License and Restrictions
    The dataset and model checkpoints (collectively, the "Resources") are provided by the Proact-VL Team. The Resources are intended solely for open academic research purposes; commercial use is strictly prohibited.
  2. No Redistribution
    Unauthorized redistribution is strictly prohibited. Users shall not re-upload, host, or distribute the Resources on any other public dataset platforms, mirror sites, or third-party cloud storage services. For sharing purposes, please direct users to the original project link: https://huggingface.co/collections/oaaoaa/aicompanion.
  3. Disclaimer
    The Resources are provided on an "as-is" basis. The Proact-VL Team makes no warranties, express or implied, regarding their completeness, accuracy, or fitness for a particular purpose. Users are solely responsible for ensuring their actions comply with applicable laws and ethical standards. The Proact-VL Team shall not be held liable for any direct or indirect consequences arising from the use of the Resources.
  4. Citation Requirement
    If you use the Resources in your research or academic publications, please cite this project to acknowledge the Proact-VL Team.


Usage

from proactvl.infer.multi_assistant_inference import MultiAssistantStreamInference

# config
ckpt_path = 'oaaoaa/proactvl_base_qwen3vl'
model_config = None
infer_config = {
    'max_kv_tokens': 16384,   # KV-cache token budget for streaming inference
    'assistant_num': 1,       # number of concurrent assistants
    'enable_tts': False,      # whether to synthesize speech output via the talker
    'state_threshold': 0.5,   # decision threshold compared against the per-second response score
}
generate_config = {
    'do_sample': True,
    'max_new_tokens': 12,
    'temperature': 0.7,
    'top_p': 0.9,
    'repetition_penalty': 1.15,
}
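# note: generate_config appears to be passed through to the underlying
# generation call; max_new_tokens=12 keeps each per-second utterance short,
# so increase it if you want longer commentary per chunk.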
talker_config = None
device_id = 0


# load model
stream_infer = MultiAssistantStreamInference(model_config, ckpt_path, infer_config, generate_config, talker_config, f'cuda:{device_id}')

# set system prompt 
system_prompt = ('You are a live commentator for a League of Legends (LoL) match. '
'Your role is to independently analyze and narrate the game, delivering insightful, engaging, and natural commentary just like a human expert.')
stream_infer.assistants[0].prime_system_prompt(system_prompt)
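# sketch (assumes 'assistant_num' > 1 in infer_config): each assistant in
# stream_infer.assistants can be primed with its own persona, e.g.
#   prompts = ['You are a play-by-play caster.', 'You are a tactics analyst.']
#   for i, p in enumerate(prompts):
#       stream_infer.assistants[i].prime_system_prompt(p)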

# set video path here
video_path = './asset/sample.mp4'
video_begin = 0
video_end = 30
duration = video_end - video_begin
stream_infer.register_video_reader(video_path, video_begin, video_end)

overall_cc = {}
for t in range(duration):
    current_second = video_begin + t
    history = ''     # no dialogue history in this minimal example
    user_query = ''  # no user query; the model decides on its own when to speak

    assistant_responses, _ = stream_infer.infer_one_chunk(current_second, history=history, user_query=user_query, previous_responses=None)
    if assistant_responses[0].active:
        commentary = assistant_responses[0].commentary.strip()
        overall_cc[current_second] = commentary
        print(f'[Sec: {current_second}({assistant_responses[0].score})]: {commentary}')
    else:
        print(f'[Sec: {current_second}({assistant_responses[0].score})]: <|SILENCE|>')

print('Final Commentary:', overall_cc)
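
Since overall_cc maps each second to an utterance, it converts naturally into subtitle or log formats. Below is a minimal sketch (not part of the Proact-VL API) that writes the collected commentary to JSON and to an SRT track, assuming each utterance is shown as a one-second cue:

import json

# persist the raw {second: commentary} mapping
with open('commentary.json', 'w') as f:
    json.dump(overall_cc, f, indent=2, ensure_ascii=False)

def to_srt(cc):
    # render each commented second as a one-second SRT cue
    def ts(sec):
        h, rem = divmod(sec, 3600)
        m, s = divmod(rem, 60)
        return f'{h:02d}:{m:02d}:{s:02d},000'
    lines = []
    for i, (sec, text) in enumerate(sorted(cc.items()), start=1):
        lines += [str(i), f'{ts(sec)} --> {ts(sec + 1)}', text, '']
    return '\n'.join(lines)

with open('commentary.srt', 'w') as f:
    f.write(to_srt(overall_cc))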