He Huang

steveheh

AI & ML interests

None yet

commented on Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR 8 days ago

Yes, it would generally work to add the end-of-utterance (EOU) token to the end of each transcript, but you need to make sure the fine-tuning data contains complete utterances/sentences; otherwise the EOU prediction will not be accurate. Also, to evaluate EOU performance you will need to run forced alignment on the fine-tuning data to get the start-of-utterance and end-of-utterance timestamps. Note that ASR WER will degrade if the fine-tuning dataset is small.

The fine-tuning scripts for EOU are still in a PR that is expected to be merged by early next month, but you can already use them at https://github.com/NVIDIA-NeMo/NeMo/pull/14740/files#diff-e0436d26c60ad81f641827fee4ba5785ba5dd79e67f488ab5b67c762767f6977
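
The transcript preparation described in the comment above could look roughly like the sketch below. It is not taken from the linked PR. It assumes a NeMo-style JSON-lines manifest with a `text` field per entry. The token string `<EOU>`, the helper names, and the "ends with terminal punctuation" check for a complete utterance are illustrative assumptions. Check the model's tokenizer for the real token, and replace the completeness check with something suited to your data (for example, if your transcripts are unpunctuated).

```python
import json

# Assumption: the end-of-utterance token string; verify it against the tokenizer.
EOU_TOKEN = "<EOU>"
# Assumption: a crude heuristic for "complete sentence"; replace as needed.
TERMINAL_PUNCT = (".", "?", "!")


def is_complete_utterance(text: str) -> bool:
    """Treat a transcript as complete if it ends with terminal punctuation."""
    return text.strip().endswith(TERMINAL_PUNCT)


def add_eou(entries, token=EOU_TOKEN):
    """Return new entries with the token appended; incomplete utterances are dropped."""
    out = []
    for entry in entries:
        text = entry["text"].strip()
        # Avoid appending the token twice if the manifest was already processed.
        if text.endswith(token):
            text = text[: -len(token)].rstrip()
        if not is_complete_utterance(text):
            continue  # truncated utterances would teach the model wrong EOU positions
        out.append({**entry, "text": f"{text} {token}"})
    return out


def convert_manifest(src_path, dst_path):
    """Read a JSON-lines manifest, append the token, and write the kept entries."""
    with open(src_path, encoding="utf-8") as src:
        entries = [json.loads(line) for line in src if line.strip()]
    kept = add_eou(entries)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for entry in kept:
            dst.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return len(entries), len(kept)
```

`convert_manifest` returns the number of entries read and kept. Comparing the two shows how much data the completeness filter discards, which matters given the note that WER degrades when the fine-tuning set is small.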

New activity in nvidia/parakeet_realtime_eou_120m-v1 about 2 months ago

Doesn't work

#4 opened 2 months ago by

updated a model 2 months ago

nvidia/parakeet_realtime_eou_120m-v1

Updated Dec 3, 2025 • 548 • 107

liked a model 2 months ago

nvidia/parakeet_realtime_eou_120m-v1

Updated Dec 3, 2025 • 548 • 107

New activity in nvidia/parakeet_realtime_eou_120m-v1 2 months ago

Is a Multilingual Model or one for French Planned?

#2 opened 2 months ago by

License

#1 opened 2 months ago by

published a model 3 months ago

nvidia/parakeet_realtime_eou_120m-v1

Updated Dec 3, 2025 • 548 • 107

New activity in nvidia/NVIDIA-Nemotron-Nano-9B-v2 5 months ago

Can we have more detailed instructions on installing dependencies?

#24 opened 5 months ago by

updated 3 models 11 months ago

nvidia/ssl_en_nest_xlarge_v1.0

Updated Feb 26, 2025 • 22 • 6

nvidia/ssl_en_nest_large_v1.0

Updated Feb 26, 2025 • 40 • 6

nvidia/stt_zh_conformer_transducer_large

Automatic Speech Recognition • Updated Feb 18, 2025 • 50 • 13

updated a Space over 1 year ago

Canary 1b

Transcribe and translate audio into text

New activity in nvidia/canary-1b almost 2 years ago

Update README.md

#22 opened almost 2 years ago by

Update README.md

#23 opened almost 2 years ago by

Update README.md

#15 opened almost 2 years ago by

liked a model almost 2 years ago

nvidia/canary-1b

Automatic Speech Recognition • Updated Dec 3, 2025 • 1.47k • 457

New activity in nvidia/canary-1b almost 2 years ago

Wrong Readme: "s2t_translation" instead of "ast" for translation

#18 opened almost 2 years ago by