Uncertainty Estimation for Sound Source Localization with Deep Learning

Rendong Pi, Xiang Yu

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

While significant progress has been made in Sound Source Localization (SSL), the confidence and robustness of localization results remain low. Uncertainty analysis can alleviate this problem because it provides a measure of confidence in the SSL results. In this work, we propose a novel SSL framework that delivers state-of-the-art localization performance and also provides reliable uncertainty estimates. The framework uses a novel backbone architecture that integrates a multi-head self-attention module to capture spatial features. Our approach also incorporates subjective logic theory to associate the neural network's predictions with a Dirichlet distribution. This allows us to model the overall uncertainty by parameterizing the class probabilities of the sound source positions. To evaluate the proposed method comprehensively, we conducted extensive experiments on both simulated and real-world datasets. The results show that the proposed method improves SSL accuracy and enhances the neural network's reliability, and that it handles out-of-distribution samples effectively. The resulting sound source positions and uncertainty estimates can be used in downstream audio-related tasks, for example to improve the accuracy and reliability of sound event detection by incorporating uncertainty. This integration can help robots make more informed decisions by fusing information from multiple sources. Our code is available at https://github.com/Devin-Pi/uncertainty-estimation-for-ssl.
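
The link between network outputs, a Dirichlet distribution, and an overall uncertainty value can be illustrated with the standard subjective-logic mapping used in evidential deep learning. This is a minimal sketch and not the authors' implementation: the function name `dirichlet_opinion`, the example evidence values, and the choice of four candidate position classes are assumptions made for illustration. It assumes the network emits non-negative evidence e_k for each of K position classes, sets the Dirichlet parameters to alpha_k = e_k + 1, and derives belief masses b_k = e_k / S and uncertainty u = K / S, where S is the sum of the alpha_k.

```python
from typing import List, Sequence, Tuple


def dirichlet_opinion(evidence: Sequence[float]) -> Tuple[List[float], float, List[float]]:
    """Map non-negative per-class evidence to a subjective-logic opinion.

    Returns (belief masses, overall uncertainty, expected class probabilities).
    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if len(evidence) == 0:
        raise ValueError("evidence must contain at least one class")
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")

    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]       # Dirichlet concentration parameters
    strength = sum(alpha)                     # Dirichlet strength S

    belief = [e / strength for e in evidence]     # b_k = e_k / S
    uncertainty = num_classes / strength          # u = K / S
    expected_prob = [a / strength for a in alpha] # mean of the Dirichlet

    return belief, uncertainty, expected_prob


# Assumed example: strong evidence for one position class vs. no evidence at all.
confident = dirichlet_opinion([0.0, 18.0, 0.0, 0.0])
unsure = dirichlet_opinion([0.0, 0.0, 0.0, 0.0])
```

In this formulation the belief masses and the uncertainty always sum to one, so an input that yields no evidence for any class (as an out-of-distribution sample ideally would) receives the maximum uncertainty of 1 and a uniform expected class distribution.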

Original language: English
Journal: IEEE Transactions on Instrumentation and Measurement
Publication status: Accepted/In press - 2024

Keywords

  • attention mechanism
  • deep learning
  • moving sound source localization
  • subjective logic theory
  • uncertainty estimation

ASJC Scopus subject areas

  • Instrumentation
  • Electrical and Electronic Engineering
