Persistent and skimmable voice-based information

One limitation of voice-based interfaces is that they are not skimmable and the information does not persist over time. A voice utterance is ephemeral, and listening to a recording again is time-consuming. How do we make auditory display more skimmable and persistent?

Some motivating examples:

  • Imagine reading a paragraph in a book vs. hearing it in an audiobook. How long would it take to get the gist by reviewing it again?
  • Skimming the structure of program code on a screen vs. through screen-reader software.
  • Seeing a picture from a distance gives an overview at a glance. Can we have something equivalent for a piece of music?


  • A background auditory display that sonifies where the listener is in a paragraph (building anticipation of its structure)
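As a minimal sketch of the sonification idea above: one simple (hypothetical) mapping is to tie the listener's position within a paragraph to a tone's pitch, so that rising pitch signals approaching the paragraph's end. The function name, frequency range, and linear mapping here are all illustrative assumptions, not a description of an existing system.

```python
def position_to_pitch(char_index, paragraph_length, low_hz=220.0, high_hz=440.0):
    """Linearly map a character offset within a paragraph to a frequency.

    Assumed design: pitch rises from low_hz at the start to high_hz at the
    end, giving the listener a continuous cue of structural progress.
    """
    if paragraph_length <= 1:
        return low_hz
    fraction = char_index / (paragraph_length - 1)
    return low_hz + fraction * (high_hz - low_hz)

paragraph = "How do we make auditory display more skimmable and persistent?"
# Sample the cue at the start, middle, and end of the paragraph.
pitches = [position_to_pitch(i, len(paragraph))
           for i in (0, len(paragraph) // 2, len(paragraph) - 1)]
```

In a real system the frequency (or another parameter such as timbre or panning) would drive a background synthesizer; discrete cues at sentence or section boundaries could layer on top of this continuous mapping.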
