CS547 Human-Computer Interaction Seminar (Seminar on People, Computers, and Design)

Fridays 12:30-1:50 · Gates B01 · Open to the public
Douglas W. Oard
University of Maryland
Searching Spoken Word Collections
February 28, 2003

Spoken word collections promise access to unique and compelling content, and most of the needed technology to realize that promise is now in place. Decreasing storage costs, increasing network capacity, and easy availability of software to exchange digital audio make possible physical access to spoken word collections at a previously unimaginable scale. Effective support for intellectual access -- the problem of finding what you are looking for -- is much more challenging, however. In this talk I will review the work that has been done on this problem at the Text Retrieval Conferences and the Topic Detection and Tracking evaluations, and I will present some results from a user study comparing present manual and automated approaches to indexing spoken word collections. I will then describe a unique resource, a collection of 116,000 hours of oral history interviews recorded in 32 languages in 67 countries, and explain how we are leveraging an unprecedented manual indexing effort to develop the ability to index similar materials automatically.



Doug Oard is an Associate Professor at the University of Maryland, College Park, with a joint appointment in the College of Information Studies and the Institute for Advanced Computer Studies. He is presently on sabbatical with the Natural Language Group at the University of Southern California's Information Sciences Institute in Marina del Rey.

He holds a Ph.D. in Electrical Engineering from the University of Maryland, and his research interests center on the use of emerging technologies to support information seeking by end users. Dr. Oard's recent work has focused on cross-language information retrieval, retrieval from audio, data mining from text, and the exchange of ratings by networked users.