
Information Retrieval

The semi-discrete decomposition, developed with Shmuel Peleg for image compression, has proved quite useful in latent semantic indexing, a method of document retrieval [C17] [J48].
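As a rough illustration of the idea, a semi-discrete decomposition approximates a matrix A by a sum of k terms d_i x_i y_i^T in which every entry of x_i and y_i is -1, 0, or 1. The Python/NumPy sketch below uses a simplified greedy alternating heuristic. The names `sdd` and `_best_ternary`, the choice of starting vector, and the iteration cap are assumptions made for illustration; this is not the code from [C17] or [J48].

```python
import numpy as np

def _best_ternary(s):
    """Return x with entries in {-1, 0, 1} maximizing (x.s)^2 / (x.x)."""
    order = np.argsort(-np.abs(s))            # indices by decreasing |s_i|
    csum = np.cumsum(np.abs(s)[order])        # sum of the j largest |s_i|
    scores = csum**2 / np.arange(1, len(s) + 1)
    j = int(np.argmax(scores)) + 1            # best number of nonzeros
    x = np.zeros(len(s))
    x[order[:j]] = np.sign(s[order[:j]])
    return x

def sdd(A, k, inner_iters=20):
    """Greedy k-term approximation A ~ sum_i d[i] * outer(X[:, i], Y[:, i])."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    X, Y, d = np.zeros((m, k)), np.zeros((n, k)), np.zeros(k)
    R = A.copy()                              # current residual
    for t in range(k):
        if not R.any():                       # residual is exactly zero
            break
        # Assumed starting guess: unit vector at the largest residual column.
        y = np.zeros(n)
        y[np.argmax(np.sum(R**2, axis=0))] = 1.0
        for _ in range(inner_iters):
            x = _best_ternary(R @ y)          # best x for the current y
            y_new = _best_ternary(R.T @ x)    # best y for that x
            if np.array_equal(y_new, y):
                break
            y = y_new
        # Optimal scalar weight for the chosen pair (x, y).
        d[t] = (x @ R @ y) / ((x @ x) * (y @ y))
        X[:, t], Y[:, t] = x, y
        R -= d[t] * np.outer(x, y)
    return X, d, Y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((8, 6))
    X, d, Y = sdd(A, k=4)
    approx = (X * d) @ Y.T
    print("relative error:", np.linalg.norm(A - approx) / np.linalg.norm(A))
```

Because the factors X and Y take only three values, they can be stored far more compactly than the floating-point factors of a truncated singular value decomposition of the same rank.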

Methods for document summarization based on hidden Markov models and matrix decompositions are studied in [J62], and we demonstrated their success in summarizing medical documents in [C26]. Our methods have performed well in the DUC (Document Understanding Conference) and TREC competitions [C20],[C21],[C22],[C25],[C27],[C28],[C29],[C32]. More recently, they performed as well as human summarizers in an evaluation on summarizing multi-lingual document sets [C30]. This result shows that our summarizer is quite good, but also that the evaluation metrics are quite primitive [C35],[C39]. Further information about our summarization work is available in [C31],[C33],[C34].

A full retrieval system that processes a query, clusters the resulting documents, and creates summaries of each cluster is presented in [J82] and available at http://stiefel.cs.umd.edu:8080/qcs/

[C17]
Tamara G. Kolda and Dianne P. O'Leary, ``Latent Semantic Indexing via a Semi-Discrete Matrix Decomposition," in The Mathematics of Information Coding, Extraction and Distribution, George Cybenko, Dianne P. O'Leary, and Jorma Rissanen, eds., IMA Volumes in Math. and Its Applics., Springer-Verlag, New York, 1999, 73-80.
[C20]
J. M. Conroy, J. D. Schlesinger, D. P. O'Leary, and M. E. Okurowski, ``Using HMM and Logistic Regression to Generate Extract Summaries for DUC," DUC 01 Conference Proceedings, 2001. http://duc.nist.gov/
[C21]
Lynn Carlson, John M. Conroy, Daniel Marcu, Dianne P. O'Leary, Mary Ellen Okurowski, Anthony Taylor, and William Wong, ``An Empirical Study of the Relation between Abstracts, Extracts, and the Discourse Structure of Texts," DUC 01 Conference Proceedings, 2001. http://duc.nist.gov/
[C22]
J. D. Schlesinger, M. E. Okurowski, J. M. Conroy, D. P. O'Leary, A. Taylor, J. Hobbs, H. T. Wilson, ``Understanding Machine Performance in the Context of Human Performance for Multi-document Summarization," DUC 02 Conference Proceedings, 2002. http://duc.nist.gov/
[C25]
Daniel M. Dunlavy, John M. Conroy, Judith D. Schlesinger, Jade Goldstein, Sarah A. Goodman, Mary Ellen Okurowski, Dianne P. O'Leary, and Hans van Halteren, ``Performance of a Three-Stage System for Multi-Document Summarization," DUC 03 Conference Proceedings, 2003.
[C26]
Daniel M. Dunlavy, John M. Conroy, Timothy J. O'Leary, and Dianne P. O'Leary, ``Clustering and Summarizing Medline Abstracts", BISTI 2003 Symposium on Digital Biology: The Emerging Paradigm, National Institutes of Health Biomedical Information Science and Technology Initiative (BISTI), 2003.
[C27]
John M. Conroy, Judith D. Schlesinger, Jade Goldstein, and Dianne P. O'Leary, ``Left-Brain/Right-Brain Multi-Document Summarization," DUC 04 Conference Proceedings, 2004.
[C28]
John M. Conroy, Judith D. Schlesinger, Dianne P. O'Leary, and Jade Goldstein, ``Back to Basics: CLASSY 2006," DUC 06 Conference Proceedings, 2006. http://duc.nist.gov/
[C29]
David M. Zajic, Bonnie Dorr, J. Lin, Dianne P. O'Leary, John M. Conroy, and Judith D. Schlesinger, ``Sentence Trimming and Selection: Mixing and Matching," DUC 06 Conference Proceedings, 2006. http://duc.nist.gov/
[C30]
John M. Conroy, Dianne P. O'Leary, and Judith D. Schlesinger, ``CLASSY Arabic and English Multi-Document Summarization", in Multi-Lingual Summarization Evaluation 2006.
http://www.isi.edu/~cyl/MTSE2006/MSE2006/papers/index.html
[C31]
John M. Conroy, Judith D. Schlesinger and Dianne P. O'Leary, ``Topic-Focused Multi-Document Summarization Using an Approximate Oracle Score", Proceedings of the ACL'06/COLING'06, 2006.
[C32]
John M. Conroy, Judith D. Schlesinger, and Dianne P. O'Leary, ``CLASSY 2007 at DUC 2007," Document Understanding Conference DUC 2007, HLT-NAACL, Rochester, NY, April 26, 2007.
[C33]
Nitin Madnani, Rebecca Passonneau, Necip Fazil Ayan, John M. Conroy, Bonnie J. Dorr, Judith L. Klavans, Dianne P. O'Leary, and Judith D. Schlesinger, ``Measuring Variability in Sentence Ordering for News Summarization," 11th European Workshop on Natural Language Generation (ENLG07) Schloss Dagstuhl, Germany, June 17-20, 2007.
[C34]
Judith D. Schlesinger, Dianne P. O'Leary, and John M. Conroy, ``Arabic/English Multi-document Summarization with CLASSY - The Past and the Future," CICLing Conference on Intelligent Text Processing and Computational Linguistics, Haifa, Israel, February 17-23, 2008. In Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science Volume 4919, Springer Berlin, (2008) 568-581. http://dx.doi.org/10.1007/978-3-540-78135-6_49
[C35]
John M. Conroy, Judith D. Schlesinger, and Dianne P. O'Leary, ``CLASSY 2009: Summarization and Metrics," TAC 2009 Workshop Proceedings, NIST, November 16-17, 2009. http://www.nist.gov/tac/publications/2009/participant.papers/CLASSY.proceedings.pdf
[C37]
John M. Conroy, Judith D. Schlesinger, Peter A. Rankel, and Dianne P. O'Leary, ``Guiding CLASSY toward More Responsive Summaries," TAC 2010 Workshop Proceedings, NIST, November 15-16, 2010. http://www.nist.gov/tac/2010/workshop/tem
[C39]
Peter Rankel, John M. Conroy, Eric V. Slud, and Dianne P. O'Leary, ``Ranking Human and Machine Summarization Systems," Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 27–31, 2011. Association for Computational Linguistics (ACL). http://aclweb.org/anthology-new/D/D11/D11-1043.pdf
[J48]
Tamara G. Kolda and Dianne P. O'Leary, ``A Semi-Discrete Matrix Decomposition for Latent Semantic Indexing in Information Retrieval," ACM Transactions on Information Systems, 16 (1998) 322-346.
[J62]
Judith D. Schlesinger, John M. Conroy, Mary Ellen Okurowski, and Dianne P. O'Leary, ``Machine and Human Performance for Single- and Multi-Document Summarization," IEEE Intelligent Systems (special issue on Natural Language Processing) 18(1), 2003, 46-54.
[J82]
Daniel M. Dunlavy, Dianne P. O'Leary, John M. Conroy, and Judith D. Schlesinger, ``QCS: A System for Querying, Clustering, and Summarizing Documents," Information Processing and Management, 43:6 (2007), pp. 1588-1605. DOI:10.1016/j.ipm.2007.01.003


Dianne O'Leary 2012-02-06