Okan Kolak on Mining Quotations for Book Search

2012-12-05

Association for Computing Machinery (ACM)
Scanning books, magazines, and newspapers has become a widespread activity because people believe that much of the world's information still resides off-line. In general, after these works are scanned they are indexed for search and processed to add links.

In this talk we describe a new approach to automatically adding links by mining repeated passages. Our technique connects elements that are semantically rich, so the resulting links express strong relations. Moreover, link targets point within a work rather than to the entire work, which facilitates navigation.

Our system has been run on a digital library of over 1 million books, has been used by thousands of people, and has generated the world's largest collection of quotations. We will also present a follow-on project based on the theory that authors copy passages from book to book because these quotations capture an idea particularly well: Jefferson on liberty, Stanton on women's rights, and Gibson on cyberpunk.

Our Key Ideas prototype provides an interaction model in which readers fluidly explore the library by viewing popular quotations on a particular key term and following links to quotations on related key terms.
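
To make the idea of mining repeated passages concrete, here is a minimal sketch, not the system described in the talk. It assumes a simple technique (indexing overlapping word n-grams, often called shingles) and uses hypothetical names (`shingles`, `find_repeated_passages`) and toy texts invented for illustration. Each hit records a position inside a book, echoing the point that link targets point within a work rather than to the whole work.

```python
import re
from collections import defaultdict


def shingles(text, n=5):
    """Yield (word_offset, n-gram) pairs over the lowercased words of text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    for i in range(len(words) - n + 1):
        yield i, " ".join(words[i:i + n])


def find_repeated_passages(books, n=5):
    """Return n-grams that occur in more than one book.

    The result maps each shared n-gram to a list of
    (book_id, word_offset) pairs, i.e. locations inside each work.
    """
    index = defaultdict(list)
    for book_id, text in books.items():
        for pos, gram in shingles(text, n):
            index[gram].append((book_id, pos))
    return {
        gram: hits
        for gram, hits in index.items()
        if len({book_id for book_id, _ in hits}) > 1
    }


# Toy corpus (invented for this example, not real library data).
books = {
    "book_a": "He said that the quick brown fox jumps over the lazy dog every day.",
    "book_b": "Recall the old line: the quick brown fox jumps over the lazy dog.",
    "book_c": "Nothing in this volume repeats anything else.",
}

links = find_repeated_passages(books, n=5)
for gram, hits in links.items():
    print(gram, "->", hits)
```

A real system at the scale of a million books would likely need hashing, tolerance for OCR errors, and merging of adjacent n-grams into full passages; none of that is attempted here.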