Audio System For Technical Readings

T.V. Raman, DEC Cambridge Research Laboratories

Seminar on People, Computers, and Design
Stanford University, October 28, 1994


The advent of electronic documents makes information available in more than its visual form: electronic information can now be display-independent. We describe a computing system, AsTeR, that audio-formats electronic documents to produce audio documents. AsTeR can speak both literary texts and highly technical documents (presently in (La)TeX) that contain complex mathematics.

Visual communication is characterized by the eye's ability to actively access parts of a two-dimensional display. The reader is active, while the display is passive. These roles are reversed by the temporal nature of oral communication: information flows actively past a passive listener. This prohibits multiple views: it is impossible to first obtain a high-level view and then "look" at details. These shortcomings become severe when presenting complex mathematics orally.

Audio formatting, which renders information structure in a manner attuned to an auditory display, overcomes these problems. AsTeR is interactive, and the ability to browse information structure and obtain multiple views enables active listening.

T.V. Raman was born and raised in Pune, India. He was partially sighted (sufficient to be able to read and write) until he was 14. Raman received his B.A. in Mathematics at Nowrosjee Wadia College in Pune and his Masters in Math and Computer Science at the Indian Institute of Technology, Bombay. For his final-year project, he developed CONGRATS, a program that allowed the user to visualize curves by listening to them.

Many of the ideas on audio formatting mathematics come from his experiences in having math read to him, in dictating math exams and having them written by a writer, and in listening to RFB (Recording for the Blind) books on tape.

Raman was introduced to computing in 1987 with an introductory course on programming in Fortran 77. He did his computing with someone behind him to read the display. He joined the PhD program in Applied Math at Cornell in Fall 1989 and did his PhD research on the topic of this talk. He is now doing research at DEC CRL on speech-based interfaces.


