Reed-Jones, J ORCID: 0000-0002-6398-1980
(2025)
Deep Binaural Direction of Arrival Estimation.
Doctoral thesis, Liverpool John Moores University.
Text: 2025reedjonesphd 1.pdf - Published Version. Available under License Creative Commons Attribution Non-commercial.
Abstract
The objective of binaural direction of arrival (DoA) estimation is to find the DoA of a sound source by measuring the sound field with a binaural array. The field increasingly applies deep learning to this task, particularly convolutional neural networks (CNNs) trained on relatively raw representations of the binaural audio. This work investigates the field, establishing common trends among different publications, particularly in data preparation, and scrutinising these trends for instances of collective wisdom that has emerged without empirical backing. Based on this, experimental evaluations are performed within a recurring testing framework to gain insight into the efficacy of different existing and novel techniques. These evaluations cover several topics: an analysis of the effect of acoustic conditions on the performance of binaural DoA estimation; a broad empirical study of binaural feature representations for use with CNNs; the proposal and comparison of convolutional recurrent neural network (CRNN) models for binaural DoA estimation; and an investigation into binaural DoA estimation in the mismatched anechoic condition, which refers to a mismatch in head-related transfer function (HRTF) measurements between training and testing datasets for an identical binaural array. The findings in this thesis lead to recommendations for more effectively using deep neural networks for binaural DoA estimation, while also demonstrating the limited ability of such systems to generalise to unseen binaural data when using simulated binaural datasets that are limited in scope.
Item Type: | Thesis (Doctoral) |
---|---|
Uncontrolled Keywords: | Binaural Audio; Deep Learning; Sound Source Localisation; Acoustic Signal Processing; Human Audition |
Subjects: | T Technology > TA Engineering (General). Civil engineering (General) |
Divisions: | Engineering |
Date of acceptance: | 9 June 2025 |
Date of first compliant Open Access: | 6 August 2025 |
Date Deposited: | 06 Aug 2025 11:47 |
Last Modified: | 06 Aug 2025 11:47 |
DOI or ID number: | 10.24377/LJMU.t.00026709 |
Supervisors: | Jones, K, Fergus, P and Ellis, D |
URI: | https://researchonline.ljmu.ac.uk/id/eprint/26709 |