Neural Network-supported Dereverberation for Hearing Devices
This website contains supplementary material to the papers:
Listening examples
Full method comparison
Method | GMAC/s | RTF* | 129.wav | 1201.wav | 1207.wav | 1249.wav |
---|---|---|---|---|---|---|
T60 | | | 0.95 s | 0.71 s | 0.64 s | 0.92 s |
Clean | | | | | | |
Reverberant | | | | | | |
DNN-PF | 0.09 | 0.08 | | | | |
GaGNet [2] | 2.34 | 0.80 | | | | |
Oracle-PSD-WPE | 0.06 | 0.05 | | | | |
DNN-WPE [3] | 0.14 | 0.13 | | | | |
E2Ep-WPE [4] | 0.14 | 0.13 | | | | |
DNN-WPE+DNN-PF | 0.22 | 0.20 | | | | |
E2Ep-WPE+DNN-PF [1] | 0.22 | 0.20 | | | | |
* Real-Time Factor (RTF), defined as the ratio of the processing time to the utterance length, measured on an Intel(R) Core(TM) i7-9800X CPU.
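The real-time factor defined above can be measured along the following lines. This is a minimal sketch, not the measurement code used for the table: the function names `real_time_factor` and `placeholder_algorithm`, the 16 kHz sampling rate, and the one-second test signal are all assumptions made for illustration.

```python
import time


def real_time_factor(process, samples, sample_rate):
    """Return processing time divided by utterance length (both in seconds)."""
    utterance_seconds = len(samples) / sample_rate
    start = time.perf_counter()
    process(samples)
    elapsed_seconds = time.perf_counter() - start
    return elapsed_seconds / utterance_seconds


def placeholder_algorithm(samples):
    # Hypothetical stand-in for a dereverberation method; it only scales the signal.
    return [0.5 * s for s in samples]


# Assumed values for illustration: one second of silence at 16 kHz.
SAMPLE_RATE = 16000
utterance = [0.0] * SAMPLE_RATE

rtf = real_time_factor(placeholder_algorithm, utterance, SAMPLE_RATE)
print(f"RTF = {rtf:.4f}")
```

An RTF below 1 means the method processes audio faster than it is played back. The measured value depends on the hardware, which is why the footnote names the CPU.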
Video Demonstration
This video demonstrates our algorithm E2Ep-WPE+DNN-PF in a real-life, real-time scenario. The speaker is first static, then moves, and the algorithm still performs well in this dynamic setting. The total latency is 40 ms: 32 ms of algorithmic delay from the STFT synthesis window length, plus 8 ms of processing time, which fits within one STFT hop. Credits: Julius Richter.
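The latency budget stated above is a simple sum, sketched here for concreteness. The function names are hypothetical, and the 16 kHz sampling rate used to convert the window length to samples is an assumption, not a value given on this page.

```python
def total_latency_ms(synthesis_window_ms, processing_ms):
    """Total latency = algorithmic delay (synthesis window) + processing time."""
    return synthesis_window_ms + processing_ms


def ms_to_samples(duration_ms, sample_rate):
    """Convert a duration in milliseconds to a whole number of samples."""
    return duration_ms * sample_rate // 1000


SYNTHESIS_WINDOW_MS = 32  # algorithmic delay, from the text
PROCESSING_MS = 8         # processing time, from the text
ASSUMED_SAMPLE_RATE = 16000  # assumption for illustration only

latency_ms = total_latency_ms(SYNTHESIS_WINDOW_MS, PROCESSING_MS)
window_samples = ms_to_samples(SYNTHESIS_WINDOW_MS, ASSUMED_SAMPLE_RATE)

print(latency_ms)      # 40
print(window_samples)  # 512
```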
References
[1] Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann. Neural-Network Two-Stage Algorithm for Lightweight Dereverberation on Hearing Devices, arXiv preprint arXiv:2204.02978, 2022.
[2] Andong Li, Chengshi Zheng, Lu Zhang, Xiaodong Li. Glance and Gaze: A Collaborative Learning Framework for Single-channel Speech Enhancement, Applied Acoustics, Elsevier, 2022.
[3] Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach, Keisuke Kinoshita, Tomohiro Nakatani. Frame-online DNN-WPE Dereverberation, IWAENC, 2018.
[4] Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann. Customizable End-to-End Optimization of Neural Network-supported Online Dereverberation for Hearing Devices, ICASSP, 2022.