Loie PRO

Loie

https://loiesun.github.io/

AI & ML interests

computer vision, multi-modal representation learning and computational pathology

Recent Activity

published a model 1 day ago

Loie/SpotSound

published a dataset 1 day ago

Loie/SpotSound-Bench

submitted a paper 2 days ago

SpotSound: Enhancing Large Audio-Language Models with Fine-Grained Temporal Grounding

Organizations

published a model 1 day ago

Loie/SpotSound

Updated 3 days ago • 1

published a dataset 1 day ago

Loie/SpotSound-Bench

Updated 3 days ago • 5

submitted a paper to Daily Papers 2 days ago

SpotSound: Enhancing Large Audio-Language Models with Fine-Grained Temporal Grounding

Paper • 2604.13023 • Published 4 days ago

updated a model 3 days ago

Loie/SpotSound

Updated 3 days ago • 1

updated a dataset 3 days ago

Loie/SpotSound-Bench

Updated 3 days ago • 5

liked a model 3 months ago

openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.81M • 5.59k

liked a dataset 3 months ago

nvidia/LongAudio

Preview • Updated 12 days ago • 295 • 21

liked 2 datasets 11 months ago

m-a-p/OmniInstruct_v1

Viewer • Updated Mar 31, 2025 • 96.1k • 148 • 7

m-a-p/OmniBench

Viewer • Updated Jan 31, 2025 • 1.14k • 295 • 10

New activity in Loie/VGGSound 11 months ago

invalid compressed data

#8 opened over 1 year ago by ZIHANGLIU

liked a dataset 11 months ago

Loie/KEEP_dataset

Updated Mar 31, 2025 • 34 • 2

liked a Space 11 months ago

Open LMM Subjective Leaderboard

🌎

VLMEvalKit Subjective Benchmark Results

New activity in Loie/VGGSound 11 months ago

Audio Data not Included

#9 opened 11 months ago by zifuwan

upvoted a paper 12 months ago

Multi-Agent System for Comprehensive Soccer Understanding

Paper • 2505.03735 • Published May 6, 2025 • 25

updated a dataset about 1 year ago

Loie/KEEP_dataset

Updated Mar 31, 2025 • 34 • 2

liked a model about 1 year ago

Qwen/Qwen2.5-Omni-7B

Any-to-Any • Updated Apr 30, 2025 • 463k • 1.89k

upvoted a collection about 1 year ago

Qwen2.5-Omni

Collection

End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 6 items • Updated Mar 2 • 166

published a dataset about 1 year ago

Loie/KEEP_dataset

Updated Mar 31, 2025 • 34 • 2

updated a dataset about 1 year ago

Loie/Auto-ACD

Viewer • Updated Feb 6, 2025 • 1.92M • 162 • 24

liked a model over 1 year ago

OpenGVLab/InternVideo2-Stage2-6B-Audio

Updated Nov 27, 2024 • 2
