Jargonic Sets New Standards for Japanese ASR

Explore Benchmarks

Target speaker extraction (TSE) is the task of isolating a specific speaker's voice from a mixture of overlapping speech and background audio. In this work, we explore a simple yet effective approach to TSE using flow matching.
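For readers unfamiliar with flow matching, the following is a minimal sketch of its most common form, assuming a straight-line path between a noise sample and the target signal. The text above does not specify which variant is used, so the path, the function names (`fm_training_pair`, `fm_loss`, `euler_sample`), and the array shapes are all illustrative assumptions, not the method of this work. In a TSE setting the velocity model would presumably also be conditioned on the mixture and on an enrollment utterance of the target speaker; that conditioning is omitted here.

```python
import numpy as np

def fm_training_pair(x0, x1, t):
    """Build one flow-matching training example (assumed linear path).

    x0: noise sample, shape (batch, dim)
    x1: target sample (e.g. clean target-speaker features), same shape
    t:  scalar or array of shape (batch,) with values in [0, 1]
    """
    # Reshape t so it broadcasts over the feature dimensions.
    t = np.asarray(t, dtype=x0.dtype).reshape(-1, *([1] * (x0.ndim - 1)))
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight line from x0 to x1
    v_target = x1 - x0              # constant velocity along that line
    return x_t, v_target

def fm_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocity."""
    return float(np.mean((v_pred - v_target) ** 2))

def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x

# Toy check with an oracle velocity field (hypothetical stand-in for a trained model).
rng = np.random.default_rng(0)
noise = rng.standard_normal((2, 8))
clean = rng.standard_normal((2, 8))
estimate = euler_sample(lambda x, t: clean - noise, noise, num_steps=10)
```

With the oracle velocity, `estimate` matches `clean` up to floating-point error. A trained network would replace the lambda, and its output would be regressed toward `v_target` using `fm_loss`.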