Humans have an extraordinary capability to analyze a complex auditory scene. We can, for instance, easily recognize and understand a speaker despite background noise such as other voices around us. The human ability to resolve such a complex auditory scene greatly exceeds that of modern speech-recognition technology, which can follow a single speaker in a quiet environment but not in background noise. Humans are not alone in their ability to parse the auditory environment, however: many animals are comparably adept at discerning communication signals.
An impressive body of research over the past few decades has elucidated the biophysical mechanisms whereby the inner ear encodes sound stimulation into neural signals as well as some of the principles by which these neural signals are subsequently processed in the auditory brainstem and cerebral cortex. Nevertheless, we still lack an understanding of how a complex auditory scene is decomposed into its individual, natural signals such as speech. Progress on this issue requires a conjunction of biophysical and neurobiological studies of the auditory system and information-theoretical analyses of the complex sound signals that the auditory system detects and processes.
This program aims to enable progress on this issue by bringing together researchers who study the biophysics and neurobiology of hearing and those who investigate the information theory of complex auditory signals. We expect that combining these two perspectives will foster novel collaborations among program participants and yield significant progress in the neurobiology of hearing and oral communication as well as in speech-recognition technology.
Key topics that will be addressed are:
Biophysics and active sensing in the inner ear
Psychoacoustics and human auditory modeling
Mathematical structure of natural sounds, auditory cognition, and processing in central brain areas
Speech comprehension and language development in children
Mathematical structure of speech and speech-recognition technology
Compressed sensing and sparse recovery modeling
Deep neural networks for speech recognition
Oral communication in songbirds and primates