TY - JOUR
T1 - Multiple Sound Source Localization, Separation, and Reconstruction by Microphone Array: A DNN-Based Approach
AU - Chen, Long
AU - Chen, Guitong
AU - Huang, Lei
AU - Choy, Yat Sze
AU - Sun, Weize
N1 - Funding Information:
This work was supported, in part, by the National Natural Science Foundation of China (NSFC) under Grants 62101335, U1713217, and 61925108; in part by the Natural Science Foundation of Guangdong under Grant 2021A1515011706, and in part by the Foundation of Shenzhen under Grant JCYJ20190808122005605.
Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2022/4/1
Y1 - 2022/4/1
N2 - Synchronistical localization, separation, and reconstruction for multiple sound sources are usually necessary in various situations, such as in conference rooms, living rooms, and supermarkets. To improve the intelligibility of speech signals, the application of deep neural networks (DNNs) has achieved considerable success in the area of time-domain signal separation and reconstruction. In this paper, we propose a hybrid microphone array signal processing approach for the nearfield scenario that combines the beamforming technique and DNN. Using this method, the challenge of identifying both the sound source location and content can be overcome. Moreover, the use of a sequenced virtual sound field reconstruction process enables the proposed approach to be quite suitable for a sound field which contains a dominant, stronger sound source and masked, weaker sound sources. Using this strategy, all traceable, mainly sound, sources can be discovered by loops in a given sound field. The operational duration and accuracy of localization are further improved by substituting the broadband weighted multiple signal classification (BW-MUSIC) method for the conventional delayand-sum (DAS) beamforming algorithm. The effectiveness of the proposed method for localizing and reconstructing speech signals was validated by simulations and experiments with promising results. The localization results were accurate, while the similarity and correlation between the reconstructed and original signals was high.
AB - Synchronistical localization, separation, and reconstruction for multiple sound sources are usually necessary in various situations, such as in conference rooms, living rooms, and supermarkets. To improve the intelligibility of speech signals, the application of deep neural networks (DNNs) has achieved considerable success in the area of time-domain signal separation and reconstruction. In this paper, we propose a hybrid microphone array signal processing approach for the nearfield scenario that combines the beamforming technique and DNN. Using this method, the challenge of identifying both the sound source location and content can be overcome. Moreover, the use of a sequenced virtual sound field reconstruction process enables the proposed approach to be quite suitable for a sound field which contains a dominant, stronger sound source and masked, weaker sound sources. Using this strategy, all traceable, mainly sound, sources can be discovered by loops in a given sound field. The operational duration and accuracy of localization are further improved by substituting the broadband weighted multiple signal classification (BW-MUSIC) method for the conventional delayand-sum (DAS) beamforming algorithm. The effectiveness of the proposed method for localizing and reconstructing speech signals was validated by simulations and experiments with promising results. The localization results were accurate, while the similarity and correlation between the reconstructed and original signals was high.
KW - beamforming
KW - deep neural network
KW - sound sources localization
KW - sound sources separation and reconstruction
UR - http://www.scopus.com/inward/record.url?scp=85127777247&partnerID=8YFLogxK
U2 - 10.3390/app12073428
DO - 10.3390/app12073428
M3 - Journal article
AN - SCOPUS:85127777247
SN - 2076-3417
VL - 12
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 7
M1 - 3428
ER -