Viewpoint: How ATM Operators Can Pick The Best 'Voice' for Machines

At some point over the next 12 to 18 months the Justice Department will publish additional Americans with Disabilities Act regulations mandating new automated teller machine capabilities to meet the needs of the visually impaired.

One of the most significant of the new requirements proposed, and one that is certain to be implemented, will specify that "Machines shall provide visual and audible instructions for operation."

While a number of banking companies have already begun to install talking ATMs, there are probably still fewer than 1,000 in operation in the United States. Most of them cannot speak the full slate of transactions the ATM is designed to perform, nor can they speak balance and receipt information. Such limitations will virtually guarantee that these ATMs will not satisfy the coming mandate.

There are currently two technologies capable of satisfying the proposed ADA requirements. The first, used in limited form by the talking ATMs already installed, uses the WAV file technology familiar to many home PC users. The second, just now coming to market, is text-to-speech synthesis technology (SST). Today financial institutions face a critical choice in determining which of these technologies to embrace.

Essentially, WAV files are digitized recordings of real human voices. They must be recorded in advance by a human performer and are relatively inflexible. To speak a simple amount, an ATM using this technology must string together individual files for each component of that amount.

For example, to inform a visually impaired customer that his or her balance is $1,342.22, the ATM must first invoke a WAV file for "one," then for "thousand," then for "three," then for "hundred," and so on down to "cents."
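The chaining described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clip file names (such as "one.wav"), the function names, and the "dollars" and "and" clips are hypothetical and do not come from any particular ATM vendor's software. The sketch handles balances below $1 million and ignores singular forms such as "one dollar."

```python
# Hypothetical clip vocabulary: one prerecorded WAV file per word.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def words_below_1000(n):
    """Return the spoken words for 0 < n < 1000 (empty list for 0)."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
    if n > 0:
        words.append(ONES[n])
    return words


def clips_for_amount(total_cents):
    """List the WAV clips, in playback order, for an amount given in cents.

    Assumes the amount is non-negative and below $1,000,000.
    """
    dollars, cents = divmod(total_cents, 100)
    thousands, rest = divmod(dollars, 1000)
    words = []
    if thousands:
        words += words_below_1000(thousands) + ["thousand"]
    if rest or not thousands:
        words += words_below_1000(rest) or ["zero"]
    words += ["dollars", "and"]
    words += words_below_1000(cents) or ["zero"]
    words.append("cents")
    return [word + ".wav" for word in words]


if __name__ == "__main__":
    # $1,342.22 expressed in cents
    print(clips_for_amount(134222))
```

Even this simple balance requires 11 separate clips to be played back to back, and every word in the vocabulary must exist as a recording before the machine can say it.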

While automated devices can perform this operation very quickly, the reliance on WAV file technology requires that the ATM or its host maintain a complete set of recordings for each possible number, each transaction option, and each transaction screen shown on the terminal. If the operators wish to present languages other than English, then they must provide a complete, alternate set of files in the desired language.

The most serious limitation of this technology is that the machine cannot speak anything that the ATM operator has not previously anticipated and recorded. It cannot, for instance, accommodate dynamic information such as customer-generated account names. If a customer wants to label his savings account "Jamaica Vacation Trust Fund," then the ATM operator must either suppress the option or schedule a new recording session.
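The same limitation can be shown with a short Python sketch. The recorded vocabulary and the function name below are hypothetical, chosen only to illustrate the point: any word without a matching recording simply cannot be spoken.

```python
# Hypothetical set of words for which the operator has recorded WAV clips.
RECORDED_WORDS = {"your", "checking", "savings", "account", "balance", "is"}


def missing_clips(phrase, recorded=RECORDED_WORDS):
    """Return the words in a phrase that have no prerecorded clip."""
    return [word for word in phrase.lower().split() if word not in recorded]


if __name__ == "__main__":
    # A standard prompt is fully covered by the recorded vocabulary.
    print(missing_clips("Your savings account balance is"))
    # A customer-chosen account name is not.
    print(missing_clips("Jamaica Vacation Trust Fund"))
```

Under these assumptions, the standard prompt yields no missing words, while every word of the customer's label is unrecorded.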

SST-enabled ATMs, by contrast, can turn any text that an embedded PC or host computer can generate into audible speech, either through software routines or dedicated chips incorporated into the machine's hardware. They can thus speak all the transactional and balance information that is presented visually on the screen or on printed receipts and coupons, as well as any dynamic information that the customer or bank may generate.

Because the technology does not require recordings of human voices, it eliminates the costs of "talent" and professional recording sessions, and the machines can operate in multiple languages.

The sole disadvantage of SST compared with WAV files is that the SST-generated voice sounds robotic. However, this does not affect the function of the machines, and market research indicates that the clarity of an ATM's voice is vastly more important to visually impaired customers than human-like timbre.

The advantages of SST technology will become most readily apparent as ATM operators seek to upgrade their machines by adding such capabilities as choice of language, Web-enabled sales and marketing, or on-screen advertising and interactive marketing.

With WAV file technology, ATM operators would be obliged to record new messages for every new option and feature, and then upload these files to all their individual machines. More flexible SST-based devices will be far easier and more economical to upgrade and maintain.

The deployment of talking ATMs can help banks generate more business with many other customer segments besides the blind. Voice technology will encourage ATM use among customers with dyslexia and related forms of learning disability, the illiterate and barely literate (a low-profile but surprisingly numerous group), and visitors or immigrants who may speak English but do not read it.

The new technology might even attract the elderly, a market segment known for preferring tellers to low-cost ATMs.

While the provisions of the coming ADA regulations are still under development, their technological implications are clear. Whatever the final details, the coming regulations will mandate significant new voice capabilities for ATMs. Smart bankers will take steps today to turn these requirements to their advantage.

Mr. Jackson, the chief technology officer for Triton Systems Inc. of Long Beach, Miss., has been working with representatives from the National Federation of the Blind to design accessible off-premises ATM technology.
