In this work, we introduce a new framework for confidentiality-preserving rank-ordered search and retrieval over large document collections. The proposed framework not only protects document and query confidentiality against an outside intruder, but also prevents an untrusted data center from learning information about the query and the document collection. We present practical techniques that integrate relevance scoring methods with cryptographic techniques, such as order-preserving encryption, to protect data collections and indices while still allowing documents to be rank-ordered efficiently and accurately in response to a query. Experimental results on the W3C collection show that these techniques achieve search accuracy comparable to that of conventional search systems designed for non-encrypted data. The proposed methods thus bring together advanced information retrieval and secure search capabilities for a wide range of applications, including managing data in government and business operations, enabling scholarly study of sensitive data, and facilitating the document discovery process in litigation.
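To illustrate the general idea of ranking over order-preserving ciphertexts, here is a minimal sketch. It is not the method presented in the talk: the function names, the key, the document identifiers, and the term-frequency scores are all hypothetical, and the keyed mapping is a toy that preserves order but offers no real security. It only shows that a server can sort documents by encrypted scores without ever seeing the plaintext values.

```python
import hashlib
import hmac


def toy_ope_encrypt(key: bytes, score: int) -> int:
    """Toy order-preserving mapping (illustrative assumption, not secure).

    The ciphertext is the sum of keyed, strictly positive gaps for every
    integer from 0 up to `score`, so a larger score always yields a larger
    ciphertext.
    """
    if score < 0:
        raise ValueError("score must be non-negative")
    total = 0
    for i in range(score + 1):
        digest = hmac.new(key, i.to_bytes(4, "big"), hashlib.sha256).digest()
        total += 1 + int.from_bytes(digest[:2], "big")  # gap is at least 1
    return total


def rank_on_server(encrypted_index: dict) -> list:
    """Sort document ids by ciphertext, highest first, using no plaintext."""
    return sorted(encrypted_index, key=encrypted_index.get, reverse=True)


# Hypothetical client-side data: per-document relevance scores for one term.
key = b"hypothetical-secret-key"
term_frequencies = {"doc_a": 3, "doc_b": 7, "doc_c": 1}

# The client encrypts the scores before handing the index to the server.
encrypted_index = {
    doc: toy_ope_encrypt(key, tf) for doc, tf in term_frequencies.items()
}

ranking = rank_on_server(encrypted_index)
print(ranking)  # ['doc_b', 'doc_a', 'doc_c']
```

Because the mapping is strictly increasing, the ranking computed on ciphertexts matches the ranking that the plaintext scores would give.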
Ashwin Swaminathan received the B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Madras, India in 2003. He is currently pursuing the Ph.D. degree in signal processing and communications at the Department of Electrical and Computer Engineering at the University of Maryland, College Park.
His research interests include information security and multimedia forensics. In 2005, his paper on multimedia security won the Student Paper Contest at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05). More details can be found on his website at www.ece.umd.edu/~ashwins.
This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.