Acoustic Echo Suppression in the Spectral Domain – A low-complexity solution suitable for Voice over IP with high sound quality and home PC-equipment as terminals
Opponent: Ricardo Santos
In real-time communication using microphones and loudspeakers, some of the sound from the loudspeaker leaks into the microphone and is transmitted back to where the utterance was first spoken. This is perceived as a very disturbing echo and reduces the ability to communicate. Traditional acoustic echo cancellers/suppressors are computationally complex. As users become more and more accustomed to high sound quality, there is a need for a solution with lower computational complexity. Furthermore, there is a need for a low-complexity delay estimator that can handle sudden changes in system latency. Such changes can occur, for instance, when cheap soundcards are used or when playback/recording is deprioritized by the operating system.
In this master’s thesis, commissioned by Ericsson AB in Kista, an alternative to traditional acoustic echo cancellers/suppressors has been developed in order to reduce complexity. The delay is tracked using the cross-correlation between the loudspeaker signal and the microphone signal. To reduce complexity, these signals are downsampled. The acoustic echo suppressor performs its calculations in the spectral domain on blocks of 20 ms, instead of sample by sample in the time domain. To reduce complexity even further, the calculations are done over perceptually critical bands. The algorithms have been tested on a PC with a common soundcard and Microsoft Windows 2000 as the operating system.
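The delay-tracking idea described above can be illustrated with a minimal sketch. It is not the implementation from the thesis: the function names, the block-averaging downsampler, the decimation factor of 8, and the synthetic test signals are all assumptions chosen for illustration. The sketch downsamples both signals and then picks the lag that maximizes their cross-correlation.

```python
import random


def decimate(signal, factor):
    """Crude downsampling (assumed method): average each block of `factor` samples."""
    n = len(signal) // factor
    return [sum(signal[i * factor:(i + 1) * factor]) / factor for i in range(n)]


def estimate_delay(loudspeaker, microphone, max_lag, factor=8):
    """Return the lag, in original samples, that maximizes the
    cross-correlation of the downsampled signals.

    The resolution of the estimate is limited to `factor` samples.
    """
    x = decimate(loudspeaker, factor)
    y = decimate(microphone, factor)
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag // factor + 1):
        # Correlate the loudspeaker signal with the microphone signal shifted by `lag`.
        n = min(len(x), len(y) - lag)
        score = sum(x[i] * y[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * factor


# Synthetic example: the "echo" is the loudspeaker signal delayed by
# 240 samples (30 ms at an assumed 8 kHz sampling rate) and attenuated.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
true_delay = 240
near_end = [0.0] * true_delay + [0.5 * s for s in far_end[:-true_delay]]

estimated = estimate_delay(far_end, near_end, max_lag=800)
print("estimated delay in samples:", estimated)
```

Working on downsampled signals cuts the number of multiplications per lag and the number of candidate lags by the decimation factor each, at the cost of a coarser estimate. A coarse estimate may be acceptable if the suppressor tolerates some delay error.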
The results show that both the delay estimator and the acoustic echo suppressor work, but that they are sensitive to parameter settings and need some additional intelligence to work perfectly. The advantages are that the delay estimate only has to be within 10 ms of the correct delay for the acoustic echo suppressor to work satisfactorily, and that the acoustic echo suppressor can adapt its filter for some parts of the spectrum even during double-talk.