TY - GEN
T1 - DeepEar: Sound Localization with Binaural Microphones
AU - Yang, Qiang
AU - Zheng, Yuanqing
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Binaural microphones, referring to two microphones with artificial human-shaped ears, are pervasively used in humanoid robots and hearing aids improving sound quality. In many applications, it is crucial for such robots to interact with humans by finding the voice direction. However, sound source localization with binaural microphones remains challenging, especially in multi-source scenarios. Prior works utilize microphone arrays to deal with the multi-source localization problem. Extra arrays yet incur higher deployment costs and take up more space. However, human brains have evolved to locate multiple sound sources with only two ears. Inspired by this fact, we propose DeepEar, a binaural microphone-based localization system that can locate multiple sounds. To this end, we develop a neural network to mimic the acoustic signal processing pipeline of the human auditory system. Different from hand-crafted features used in prior works, DeepEar can automatically extract useful features for localization. More importantly, the trained neural networks can be extended and adapted to new environments with a minimum amount of extra training data. Experiment results show that DeepEar can substantially outperform the state-of-the-art deep learning approach, with a sound detection accuracy of 93.3% and an azimuth estimation error of 7.4 degrees in multisource scenarios.
AB - Binaural microphones, referring to two microphones with artificial human-shaped ears, are pervasively used in humanoid robots and hearing aids improving sound quality. In many applications, it is crucial for such robots to interact with humans by finding the voice direction. However, sound source localization with binaural microphones remains challenging, especially in multi-source scenarios. Prior works utilize microphone arrays to deal with the multi-source localization problem. Extra arrays yet incur higher deployment costs and take up more space. However, human brains have evolved to locate multiple sound sources with only two ears. Inspired by this fact, we propose DeepEar, a binaural microphone-based localization system that can locate multiple sounds. To this end, we develop a neural network to mimic the acoustic signal processing pipeline of the human auditory system. Different from hand-crafted features used in prior works, DeepEar can automatically extract useful features for localization. More importantly, the trained neural networks can be extended and adapted to new environments with a minimum amount of extra training data. Experiment results show that DeepEar can substantially outperform the state-of-the-art deep learning approach, with a sound detection accuracy of 93.3% and an azimuth estimation error of 7.4 degrees in multisource scenarios.
KW - Binaural localization
KW - Earable computing
KW - Multi-source localization
UR - http://www.scopus.com/inward/record.url?scp=85133293423&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM48880.2022.9796850
DO - 10.1109/INFOCOM48880.2022.9796850
M3 - Conference article published in proceeding or book
AN - SCOPUS:85133293423
T3 - Proceedings - IEEE INFOCOM
SP - 960
EP - 969
BT - INFOCOM 2022 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 41st IEEE Conference on Computer Communications, INFOCOM 2022
Y2 - 2 May 2022 through 5 May 2022
ER -