Voice quality in PCS and cellular networks: Eliminating the Echo
Improvements in digital mobile communications technology can actually aggravate problems, such as echo, that go unnoticed in analog systems. The use of echo cancelers is one way to improve voice quality.
Wireless customers want their phones to sound as good as their landline phones. To meet this demand for “toll-quality” sound, service providers must increase voice quality throughout their systems. Changing from analog to digital modulation dramatically improved voice quality by reducing the noise associated with analog air interfaces. Digital systems, however, introduced circuit delays that made acoustic echo seem more prominent.
Some network providers have taken steps to resolve acoustic echo. Diligence is necessary because additional mobile network improvements that reduce noise even more will unmask lower levels of acoustic echo.
Echo Sources
In a typical wireless-to-wireline phone call in a PCS or digital cellular system (that is, from a wireless phone to a PSTN phone), two types of echo exist. Hybrid echo on the public switched telephone network (PSTN) end of the phone call is caused by the electronic reflection resulting from the four-wire to two-wire impedance mismatch. (See Figure 1 on page 26.) For echo to be noticeable, the human ear must detect some delay between the source signal (in this case, the spoken word) and the echo signal. In typical local-loop applications, this echo is not noticeable because the delay is so short that the human ear does not separate the original speech from the echo. Typical long-distance applications induce delay primarily through propagation, and thus they require hybrid echo cancelers for correct operation.
In a PCS network, however, propagation is a secondary issue because processing delay is always introduced into the path through the network. The resulting hybrid echo is reduced or eliminated in the mobile network, typically at the mobile switching center (MSC), according to GSM specifications that require cancelation “looking toward” the PSTN.
The International Telecommunication Union’s (ITU) Recommendation ITU-T G.168 (“Digital Network Echo Cancelers”) specifies the performance of echo cancelers and the test conditions for verifying performance in the PSTN. In wireless standards work, the GSM standards, as noted above, deal explicitly with hybrid echo cancelation.
Acoustic echo
The other previously mentioned echo phenomenon, acoustic echo, has become apparent in wireless networks.
Acoustic echo is defined as the coupling of received voice transmission between the earpiece and mouthpiece of a portable handset or the speaker and microphone of a hands-free mobile phone. When acoustic echo occurs, it is the PSTN user who is discomforted. (For simplicity, this discussion will refer to a wireless user and a PSTN user, even though the general case could include two wireless users.) Acoustic echo is a much more complex signal than hybrid echo. The simple case of a hands-free mobile phone illustrates the problem most clearly. The received signal is emitted from the speaker and reflects from multiple surfaces inside an automobile. The reflections return the signal, at various time delays and amplitudes, into the microphone, and over the phone connection to the PSTN user’s ear. In addition, the tail circuit is non-linear because of the speech compression. Because these reflected signals are typically delayed 180ms or more, the PSTN user hears a perceptible echo.
Portable handsets couple the PSTN user’s voice between the earpiece and mouthpiece, as shown in Figure 2 on page 28. This coupling occurs directly (increasingly so as handsets get smaller), via reflections off the user, or via reflections from the environment. It is more prominent when the wireless user increases the handset volume to compensate for high background noise or for PSTN users with soft voices.
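The hands-free case described above can be sketched as a sum of delayed, attenuated copies of the received signal. This is only an illustration: the function name, the reflection delays and the gains are hypothetical values chosen for the example, not measurements, and the sketch ignores the speech-codec non-linearity and the network delay.

```python
SAMPLE_RATE_HZ = 8000  # narrowband telephony sampling rate


def acoustic_echo(received, reflections, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return the echo that reaches the microphone.

    received    -- samples played from the speaker
    reflections -- (delay_ms, gain) pairs, one per reflecting path
                   (hypothetical values, for illustration only)
    """
    echo = [0.0] * len(received)
    for delay_ms, gain in reflections:
        delay = round(delay_ms * sample_rate_hz / 1000)
        # Each path adds a delayed, scaled copy of the received signal.
        for n in range(delay, len(received)):
            echo[n] += gain * received[n - delay]
    return echo


# A single click (unit impulse) played from the speaker.
click = [1.0] + [0.0] * 159
# Hypothetical in-car reflection paths: (delay in ms, linear gain).
paths = [(2.0, 0.30), (5.0, 0.15), (12.5, 0.05)]

echo = acoustic_echo(click, paths)
arrivals = [(n, v) for n, v in enumerate(echo) if v != 0.0]
```

Here `arrivals` lists the sample index and amplitude at which each reflection of the click reaches the microphone, which is the multi-delay, multi-amplitude behavior that makes acoustic echo harder to model than a single hybrid reflection.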
Handset specification
GSM specifiers did not ignore acoustic echo when developing standards. As with any echo, the closer to the source it is dealt with, the more effective the solution. For acoustic echo, the source is the handset. Two aspects of handset specifications deal with echo loss performance characteristics. One is the handset performance, and the other is testing specifications. The weighted terminal coupling loss (TCLw) value of 46dB, derived from ITU’s Recommendation ITU-T G.131 (“Control of Talker Echo”), is the specified performance characteristic for handset acoustic echo return loss (AERL). This level, as evidenced by continuing discussions among the European Telecommunications Standards Institute’s (ETSI) special mobile groups, is not universally accepted as being correct or indicative of operational conditions across the voice bandwidth.
GSM Phase I testing methods are flawed, in that sinusoidal tone testing is allowed to be run through the speech coder/decoder (codec) to verify performance. The full-rate codecs cause sinusoidal spreading, which results in lower power when measured at discrete frequencies. Testing that bypasses the codec, as allowed by the specifications, is also flawed, because it excludes an element that degrades the GSM system. Both test approaches result in equipment that shows artificially high measured AERL levels yet allows acoustic echo to leak through in service.
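To make the 46dB figure concrete, the sketch below computes a plain broadband echo loss from the power of a stimulus and the power of the signal coupled back. It is a simplification: TCLw is a frequency-weighted measure, and no weighting is applied here. The stimulus is seeded random noise rather than a P.50 artificial voice, and the coupling factor of 0.004 is a hypothetical value.

```python
import math
import random

TCLW_TARGET_DB = 46.0  # value quoted in the text


def echo_loss_db(sent, returned):
    """Unweighted echo loss: 10*log10(sent power / returned power)."""
    p_sent = sum(s * s for s in sent) / len(sent)
    p_returned = sum(r * r for r in returned) / len(returned)
    return 10.0 * math.log10(p_sent / p_returned)


rng = random.Random(0)
stimulus = [rng.uniform(-1.0, 1.0) for _ in range(800)]

# Hypothetical handset coupling: the amplitude comes back scaled by 0.004.
COUPLING = 0.004
coupled = [COUPLING * s for s in stimulus]

loss = echo_loss_db(stimulus, coupled)
meets_target = loss >= TCLW_TARGET_DB
```

An amplitude ratio of 0.004 corresponds to 20·log10(1/0.004), or roughly 48dB, so this hypothetical handset would just clear the 46dB target. Doubling the coupling (for example, at a higher volume setting) would cost about 6dB and drop it below the target.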
Phase II specifications are intended to improve this performance via an artificial voice test stimulus as specified by ITU’s Recommendation ITU-T P.50 (“Artificial Voices”). In conjunction with a proposed head-and-torso simulator (HATS) that incorporates free-air transmission, these test methods would attempt to more closely represent actual user conditions. There is still resistance to this approach, so the main improvement in testing in Phase II is expected to be the more indicative stimuli.
The effect of improved coders
Attempts to increase GSM system voice quality have resulted in extensive study and proposed alternate coding techniques. The enhanced full-rate (EFR) coder concept was put forward to improve the quality without imposing a penalty on bandwidth. The U.S.1 codec, adopted by ETSI, provides improved performance over the full-rate coder while using the same full-rate channel, thus making deployment easier. The comparative quality is shown on a mean-opinion score (MOS) scale in Figure 3 on page 30. The graph shows statistically significant performance improvement from error-free conditions (EP0) through carrier-to-interference (C/I) conditions as low as 7dB (EP2).
So where’s the problem?
Improved quality from the processing within the coding stage diminishes the masking of acoustic echo produced by the mobile handset. EFR coding, for instance, improves the channel’s performance but allows the PSTN user to detect acoustic echo that was previously masked. Mobile users are also prone to increase their volume levels to overcome local background noise, which exacerbates the acoustic echo problem by increasing the likelihood of coupling.
Echo cancelation
To solve this problem, many carriers have limited the receive level at the mobile handset. This decreases the acoustic coupling, but the subscribers (who are paying the bills) cannot hear the PSTN-side user. Another solution is to use echo suppressors in the switching equipment. Many manufacturers integrate suppressors into their systems, but the performance of these devices is limited, particularly with double-talk. (See “Echo canceling vs. echo suppression” on page 32.) The ideal solution to acoustic echo is the use of high-quality echo cancelers.
To be effective in a wireless environment, echo cancelers need several critical features. They must cancel both acoustic and hybrid echo effectively and bidirectionally. They must also be able to tune the tail circuit delay offset. Finally, they must be compact and support future audio-enhancing signal processing capabilities.
Effective acoustic echo canceling in digital wireless networks requires a different approach from conventional hybrid echo canceling because of the non-linear tail circuit. The implementation must account for these non-linearities and must remain stable upon convergence to provide seamless canceling. When low-performance handsets are present, effective canceling in this mode requires the significant processing capacity of a network-based canceler.
Traditional echo cancelers are limited to 64ms tail lengths, largely due to cost. Processing delays in the tail circuit (between the MSC and the handset) of a mobile system will result in tail lengths approaching 300ms. To cancel effectively, a 64ms canceler needs to anticipate the non-linearity associated with this additional delay and cancel the echo where the reflections are most prominent. Initial convergence accounting for this tail circuit property will result in a more stable system. Photo 1 on page 24 shows voice prints of echo without and with cancelation.
Architecturally, ideal use of space in the MSC for an acoustic canceler would result from the use of a canceler that looks both into the wireless end and out at the PSTN end, in a single canceler package. Photo 2 on page 32 shows one such canceler. The alternative (independent cancelers that are wired back-to-back) requires additional money, space and power. Because of the unpredictability of the handset’s AERL capabilities and the variability of the caller environment, it is impractical to place a canceler in the circuit only when conditions require it.
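The idea of placing a short canceler window where the reflections actually arrive can be sketched with a normalized least-mean-squares (NLMS) adaptive filter, a common textbook technique that is swapped in here for illustration and is not necessarily what any commercial canceler uses. The sketch is scaled down (a 16-tap window and a 40-sample flat delay instead of 64ms and 300ms), the echo path is linear, so it does not model the codec non-linearity discussed above, and all names and values are hypothetical.

```python
import math
import random


def nlms_cancel(far, mic, taps, offset, mu=0.5, eps=1e-6):
    """Subtract an NLMS echo estimate from `mic`.

    The `taps`-long filter window starts `offset` samples back in the
    far-end signal `far`. Returns the residual after cancelation.
    """
    w = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Far-end samples covered by the window at time n.
        x = [far[n - offset - k] if n - offset - k >= 0 else 0.0
             for k in range(taps)]
        estimate = sum(wk * xk for wk, xk in zip(w, x))
        err = mic[n] - estimate
        # Normalized update: step size scaled by the window energy.
        step = mu * err / (sum(xk * xk for xk in x) + eps)
        w = [wk + step * xk for wk, xk in zip(w, x)]
        residual.append(err)
    return residual


def erle_db(mic, residual, last=1000):
    """Echo return loss enhancement over the final `last` samples."""
    p_mic = sum(v * v for v in mic[-last:])
    p_res = sum(v * v for v in residual[-last:])
    return 10.0 * math.log10(p_mic / p_res)


rng = random.Random(1)
far = [rng.uniform(-1.0, 1.0) for _ in range(4000)]

BULK_DELAY = 40               # hypothetical flat tail-circuit delay, in samples
PATH = [0.5, -0.3, 0.2, 0.1]  # hypothetical short echo response after the delay

# Microphone signal: delayed echo plus a little near-end background noise.
mic = [sum(h * far[n - BULK_DELAY - k]
           for k, h in enumerate(PATH) if n - BULK_DELAY - k >= 0)
       + rng.gauss(0.0, 0.001)
       for n in range(len(far))]

erle_no_offset = erle_db(mic, nlms_cancel(far, mic, taps=16, offset=0))
erle_with_offset = erle_db(mic, nlms_cancel(far, mic, taps=16, offset=BULK_DELAY))
```

With `offset=0` the 16-tap window covers only delays 0 to 15 and never sees the echo, so `erle_no_offset` stays near 0dB. With the window moved to the flat delay, the same 16 taps cover the echo response and `erle_with_offset` is far higher. This is the sense in which a canceler with a limited tail length can still work on a long tail circuit, provided the delay offset is tuned.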
Conclusion
Voice quality problems, particularly those attributable to handsets, reflect poorly on overall network quality. The ideal solution, stricter governance of handsets, is impractical from both an economic and a technical perspective. The handset marketplace demands small size, high fidelity and low prices. Implementation of sophisticated processing to solve acoustic echo problems at this level is not likely because it would adversely affect handset size, weight and cost. From a technical point of view, specifying and testing handsets under many and varied environmental conditions, with realistic test conditions and stimuli, has been resisted to date.
It falls to network providers to implement solutions that deliver better voice quality to their customers. State-of-the-art acoustic echo cancelers, or bidirectional cancelers that handle both hybrid and acoustic echo, can provide that support today to wireless networks worldwide.
References
“Analysis of Noise Sources when Measuring Echo Loss of GSM-Mobiles with the GSM-Test System,” Special Mobile Group 7, European Telecommunications Standards Institute, Sophia Antipolis, France, TDoc SMG7 448/97.
“Artificial Voices,” ITU-T Recommendation P.50, International Telecommunication Union, Geneva, 1993.
“Control of Talker Echo,” ITU-T Recommendation G.131, International Telecommunication Union, Geneva, 1996.
“Digital Network Echo Cancelers,” ITU-T Recommendation G.168, International Telecommunication Union, Geneva, 1997.
Goetz, I., “Opportunities for Improving the Speech Quality of Digital Cellular Systems,” DMR VII Conference, October 1996.
“Liaison Statement-Problem with Echo Loss Testing,” Special Mobile Group 7, European Telecommunications Standards Institute, Sophia Antipolis, France, TDoc SMG7 553/97.
Mehrotra, A., “GSM System Engineering,” Artech House, Boston, 1997.
“Problem with Echo Loss Testing,” Special Mobile Group 11, European Telecommunications Standards Institute, Sophia Antipolis, France, TDoc SMG11 30/98.