Action Science Explorer
Tools for Rapid Understanding of Scientific Literature

Latest News

January 2012. Our paper on Action Science Explorer was accepted by JASIST, the Journal of the American Society for Information Science and Technology. See the Publications section below for more details.

December 2011. Action Science Explorer was featured in an NSF Discoveries report, "A New Visualization Method Makes Research More Organized and Efficient" (pdf).

July 2010. The iOpener Workbench has been renamed Action Science Explorer (ASE).

Description

The goal of the iOpener project is to generate readily consumable surveys of different scientific domains and topics, targeted to different audiences and levels. We've created an infrastructure for automatic summarization of research domains that links bibliometric lexical link mining, summarization techniques, and visualization tools. Part of this is the Action Science Explorer (ASE), a new tool that presents the academic literature for a field using many different modalities: lists of articles, their full texts, automatic text summaries, and visualizations of the structure of the citation network.

Action Science Explorer is partially an integration of two powerful existing tools: the SocialAction network analysis tool and the JabRef reference manager. SocialAction provides us with powerful network analysis capabilities including force-directed citation network visualization, ranking and filtering papers by statistical measures, scatterplots of paper attributes and statistics, categorical and numerical range coloring, and automatic cluster detection. Using visualizations of the citation network we can easily find unexpected trends, clusters, gaps, and outliers. Additionally, visualizations can immediately reveal invalid data that is easily missed in tabular views.
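
As a rough illustration of ranking and filtering papers by a statistical measure, the sketch below ranks papers by in-degree (the number of incoming citations) and keeps those above a threshold. It is not SocialAction code; the edge list, paper IDs, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical citation edges: (citing_paper, cited_paper).
citations = [
    ("P3", "P1"), ("P4", "P1"), ("P5", "P1"),
    ("P4", "P2"), ("P5", "P2"),
    ("P5", "P3"),
]


def rank_by_in_degree(edges):
    """Return (paper, citation_count) pairs, most cited first."""
    counts = Counter(cited for _, cited in edges)
    # Papers that cite others but are never cited get a count of 0.
    for citing, _ in edges:
        counts.setdefault(citing, 0)
    # Sort by descending count, breaking ties by paper ID.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def filter_min_citations(ranked, minimum):
    """Keep only the papers cited at least `minimum` times."""
    return [paper for paper, count in ranked if count >= minimum]


ranked = rank_by_in_degree(citations)
key_papers = filter_min_citations(ranked, 2)
print(ranked)
print(key_papers)
```

In-degree is only one possible measure; the same ranking and filtering steps would apply to any per-paper statistic.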

JabRef supplies all the features one would expect from a reference manager, including searching using simple regular expressions, automatic and manual grouping of papers, DOI and URL links, PDF full text with annotations, abstracts, user generated reviews and text annotations, and many ways of exporting. It integrates with Microsoft Word, OpenOffice.org, and LaTeX/BibTeX, which allows quick adding of citations to discovered articles when writing survey papers.

These tools are linked together to form multiple coordinated views of the data. Clicking on a node in the citation network selects it and its corresponding paper in the reference manager, displaying its abstract, review, and other data associated with it. Moreover, when clusters of nodes are selected their papers are floated to the top of the reference manager for easy perusal. The inverse is true as well, with any paper, group, or search term selected in the reference manager highlighting the corresponding nodes in the network.
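
One common way to build coordinated views like these is a shared selection model that notifies every view when the selection changes. The sketch below assumes that design; it is not taken from the ASE source, and all class and method names are hypothetical.

```python
class SelectionModel:
    """Holds the current selection and notifies subscribed views."""

    def __init__(self):
        self._selected = frozenset()
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def select(self, paper_ids):
        self._selected = frozenset(paper_ids)
        for listener in self._listeners:
            listener(self._selected)


class NetworkView:
    """Stand-in for the citation network: highlights selected nodes."""

    def __init__(self, model):
        self.highlighted = frozenset()
        model.subscribe(self.on_selection)

    def on_selection(self, selected):
        self.highlighted = selected


class ReferenceListView:
    """Stand-in for the reference manager: floats selected papers to the top."""

    def __init__(self, model, paper_ids):
        self.order = list(paper_ids)
        model.subscribe(self.on_selection)

    def on_selection(self, selected):
        # A stable sort keeps the existing relative order within each group.
        self.order.sort(key=lambda paper_id: paper_id not in selected)


model = SelectionModel()
network = NetworkView(model)
references = ReferenceListView(model, ["A", "B", "C", "D"])

# Selecting a cluster in any view goes through the shared model.
model.select({"C", "D"})
print(sorted(network.highlighted))
print(references.order)
```

Because every view reads from the same model, a selection made in the reference manager would update the network in the same way.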

There are other coordinated views that provide the user with other aspects of the field. When any node or cluster is selected, the In-Cite Text window displays the text of all incoming citations to the paper(s), i.e. the whole sentences from the citing papers that include the citation to the selected paper(s). These are displayed in a hyperlinked list that allows the user to select any one of them to show their surrounding context in the Out-Cite Text window. This window shows the full text of the paper citing one of the selected papers, with highlighting showing the selected citation sentence as well as any other sentences that include hyperlinked citations to other papers. The last view is the summary window, which can contain various multi-document summaries of a selected cluster. Using automatic summarization techniques, we can summarize all of the incoming citations to papers within that cluster, hopefully providing key insights into that research community.
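
The idea behind the In-Cite Text view, collecting the whole sentences that cite a selected paper, can be sketched as follows. This is a minimal illustration that assumes bracketed numeric citation markers and a naive sentence splitter; the sample texts, marker format, and function names are hypothetical, and real citation extraction needs far more robust parsing.

```python
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def citing_sentences(full_texts, marker):
    """Return (citing_paper_id, sentence) pairs whose sentence contains marker."""
    hits = []
    for paper_id, text in full_texts.items():
        for sentence in split_sentences(text):
            if marker in sentence:
                hits.append((paper_id, sentence))
    return hits


# Hypothetical full texts in which "[12]" refers to the selected paper.
full_texts = {
    "P7": "We build on graph ranking [12]. Our corpus is new.",
    "P9": "Unlike [12], we use no supervision. Results improve over [8].",
}

in_cite = citing_sentences(full_texts, "[12]")
for paper_id, sentence in in_cite:
    print(paper_id, "|", sentence)
```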

Action Science Explorer integrates these many components in order to provide a tool that supports rapid understanding of scientific literature. Users can analyze the network of citations between papers, identify key papers and research clusters, automatically summarize them, dig into the full text of articles to extract context, make annotations, write reviews, and finally export their findings in many document authoring formats. We hope this infrastructure will enable users to generate readily consumable surveys of scientific fields.

Data & Summarization

As part of the iOpener project we developed the ACL Anthology Network (AAN) reference dataset, which includes more than 16,000 papers, each identified by a unique ACL ID, together with their full texts, abstracts, and citation information. It also includes other valuable metadata such as author affiliations, citation and collaboration networks, and various centrality measures. Moreover, we automatically extracted all sentences in the collection that cite a particular target paper in order to create a citation context for it. We used this dataset to evaluate the effectiveness of ASE through our user studies, as well as to quantitatively test our text summarization approaches.

We developed several techniques to summarize scientific literature, including C-LexRank, a novel graph-based method for the automatic creation of citation-based summaries of the target papers. C-LexRank is built on top of the Clairlib library and uses network community detection to identify the main contributions of the target papers and then produces a summary that highlights these contributions. Using numerical analysis and evaluations, we showed that the surveys generated using the citation network model of scientific literature have higher quality than summaries that only use abstracts and those generated using other state-of-the-art summarization systems.
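
The general idea of clustering citing sentences and then drawing summary sentences from each cluster can be sketched as follows. This is a deliberately simplified stand-in, not C-LexRank itself and not the Clairlib API: it assumes bag-of-words cosine similarity, uses connected components in place of community detection, and uses node degree in place of LexRank centrality. The sentences, threshold, and function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_graph(sentences, threshold):
    """Link sentences whose cosine similarity reaches the threshold."""
    vectors = [bag_of_words(s) for s in sentences]
    graph = {i: set() for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def connected_components(graph):
    """Assumed stand-in for community detection."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def summarize(sentences, length, threshold=0.3):
    graph = similarity_graph(sentences, threshold)
    clusters = connected_components(graph)
    clusters.sort(key=lambda c: (-len(c), c[0]))  # largest cluster first
    # Within a cluster, prefer the most connected sentence (degree, not LexRank).
    queues = [sorted(c, key=lambda i: (-len(graph[i]), i)) for c in clusters]
    chosen = []
    while len(chosen) < length and any(queues):
        for queue in queues:  # round-robin so each cluster contributes
            if queue and len(chosen) < length:
                chosen.append(queue.pop(0))
    return [sentences[i] for i in chosen]


# Hypothetical citing sentences about one target paper.
sentences = [
    "The paper introduces a graph ranking method for summarization.",
    "A graph ranking method for summarization is introduced.",
    "They release a large annotated corpus.",
    "Graph ranking for summarization was proposed.",
    "The annotated corpus they release is large.",
]
summary = summarize(sentences, 2)
print(summary)
```

Taking sentences from the clusters in turn is what lets a short summary cover more than one contribution, here the ranking method and the corpus.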

Video Demonstration

Below is a video demonstration of ASE in action.

Participants

Software

ASE is a research prototype meant to provide inspiration for developers: it demonstrates the value of integrating reference management, statistics, citation context extraction, natural language summarization for single and multiple documents, filters to interactively select key papers, and network visualization to see citation patterns and identify clusters. ASE currently requires substantial data processing for many of the views, as much of the data required is not available from publisher databases.

ASE is available only to our collaborators. Those who wish to explore publication and other databases should review the currently available commercial, open source, and research tools listed below.

This work has been partially supported by the National Science Foundation grant "iOPENER: A Flexible Framework to Support Rapid Learning in Unfamiliar Research Domains", jointly awarded to University of Maryland and University of Michigan as IIS 0705832.

Publications

Books and Other One-Time Publications
Dunne, C., Shneiderman, B., Gove, R., Klavans, J. & Dorr, B. (2012), "Rapid understanding of scientific paper collections: integrating statistics, text analytics, and visualization", JASIST: Journal of the American Society for Information Science and Technology.
Abstract: Keeping up with rapidly growing research fields, especially when there are multiple interdisciplinary sources, requires substantial effort for researchers, program managers, or venture capital investors. Current theories and tools are directed at finding a paper or website, not gaining an understanding of the key papers, authors, controversies, and hypotheses. This report presents an effort to integrate statistics, text analysis, and visualization in a multiple coordinated window environment that supports exploration. Our prototype system, Action Science Explorer (ASE), provides an environment for demonstrating principles of coordination and conducting iterative usability tests of them with interested and knowledgeable users. We developed an understanding of the value of reference management, statistics, citation context extraction, natural language summarization for single and multiple documents, filters to interactively select key papers, and network visualization to see citation patterns and identify clusters. The three-phase usability study guided our revisions to ASE and led us to improve the testing methods.
BibTeX:
@article{Dunne12Rapidunderstandingscientific,
  author = {Cody Dunne and Ben Shneiderman and Robert Gove and Judith Klavans and Bonnie Dorr},
  title = {Rapid understanding of scientific paper collections: integrating statistics, text analytics, and visualization},
  journal = {JASIST: Journal of the American Society for Information Science and Technology},
  year = {2012},
  url = {http://www.cs.umd.edu/localphp/hcil/tech-reports-search.php?number=2011-16}
}
Gove, R., Dunne, C., Shneiderman, B., Klavans, J. & Dorr, B. (2011), "Evaluating visual and statistical exploration of scientific literature networks", In VL/HCC '11: Proc. 2011 IEEE Symposium on Visual Languages and Human-Centric Computing.
Abstract: Action Science Explorer (ASE) is a tool designed to support users in rapidly generating readily consumable summaries of academic literature. It uses citation network visualization, ranking and filtering papers by network statistics, and automatic clustering and summarization techniques. We describe how early formative evaluations of ASE led to a mature system evaluation, consisting of an in-depth empirical evaluation with four domain experts. The evaluation tasks were of two types: predefined tasks to test system performance in common scenarios, and user-defined tasks to test the system's usefulness for custom exploration goals. The primary contribution of this paper is a validation of the ASE design and recommendations to provide: easy-to-understand metrics for ranking and filtering documents, user control over which document sets to explore, and overviews of the document set in coordinated views along with details-on-demand of specific papers. We contribute a taxonomy of features for literature search and exploration tools and describe exploration goals identified by our participants.
BibTeX:
@inproceedings{Gove11Evaluatingvisualand,
  author = {Robert Gove and Cody Dunne and Ben Shneiderman and Judith Klavans and Bonnie Dorr},
  title = {Evaluating visual and statistical exploration of scientific literature networks},
  booktitle = {VL/HCC '11: Proc. 2011 IEEE Symposium on Visual Languages and Human-Centric Computing},
  year = {2011},
  url = {http://www.cs.umd.edu/localphp/hcil/tech-reports-search.php?number=2011-02}
}
Gove, R. (2011), "Understanding scientific literature networks: case study evaluations of integrating visualizations and statistics". School: University of Maryland, Department of Computer Science.
Abstract: Investigators frequently need to quickly learn new research domains in order to advance their research. This thesis presents five contributions to understanding how software tools help researchers explore scientific literature networks. First, this thesis summarizes capabilities in existing bibliography tools, which reveals patterns of capabilities by system type. Next, six participants in two user studies evaluate Action Science Explorer (ASE), which is designed to create surveys of scientific literature and integrates visualizations and statistics. Users found document-level statistics and attribute rankings to be convenient when beginning literature exploration. The user studies also identify users' questions when exploring academic literature, which include examining the evolution of a field, identifying author relationships, and searching for review papers. The evaluations reveal some shortcomings of ASE, and this thesis outlines improvements to ASE and lists user requirements for bibliographic exploration. Finally, I recommend strategies for evaluating bibliographic exploration tools based on experiences evaluating ASE.
BibTeX:
@mastersthesis{Gove11Understandingscientificliterature,
  author = {Robert Gove},
  title = {Understanding scientific literature networks: case study evaluations of integrating visualizations and statistics},
  school = {University of Maryland, Department of Computer Science},
  year = {2011},
  url = {http://hdl.handle.net/1903/11764}
}
Dunne, C., Shneiderman, B., Dorr, B. & Klavans, J. (2010), "iOpener Workbench: tools for rapid understanding of scientific literature", In Proc. 27th Annual Human-Computer Interaction Lab Symposium. College Park, MD. May 2010.
BibTeX:
@inproceedings{Dunne10iOpenerWorkbench_tools,
  author = {Cody Dunne and Ben Shneiderman and Bonnie Dorr and Judith Klavans},
  title = {iOpener Workbench: tools for rapid understanding of scientific literature},
  booktitle = {Proc. 27th Annual Human-Computer Interaction Lab Symposium},
  year = {2010},
  url = {http://www.cs.umd.edu/hcil/about/events/symposium2010}
}
Posters
Dunne, C. (2011), "Interactive data visualization for rapid understanding of scientific literature", Poster at VAC '11: Visual Analytics Consortium Meeting. May, 2011.
Abstract: We developed Action Science Explorer (ASE), a tool designed to support users in rapidly generating easily consumable summaries of academic literature. ASE uses bibliometric lexical link mining to create a citation network for a field and context for each citation, automatic clustering and multi-document summarization techniques to extract key points, and potent network analysis and visualization tools to aid in the exploration task. These techniques provide several coordinated views of the underlying data.
BibTeX:
@misc{Dunne11Interactivedatavisualization,
  author = {Cody Dunne},
  title = {Interactive data visualization for rapid understanding of scientific literature},
  howpublished = {Poster at VAC '11: Visual Analytics Consortium Meeting},
  year = {2011},
  url = {http://vacommunity.org/VAC+Consortium+2011+Meeting}
}
Gove, R. (2011), "Action Science Explorer", Poster at 28th Annual Human-Computer Interaction Lab Symposium. May, 2011.
BibTeX:
@misc{Gove11ActionScienceExplorer,
  author = {Robert Gove},
  title = {Action Science Explorer},
  howpublished = {Poster at 28th Annual Human-Computer Interaction Lab Symposium},
  year = {2011},
  url = {http://www.cs.umd.edu/hcil/soh}
}
Presentations
Dunne, C. (2011), "Visual analytic tools for monitoring and understanding the emergence and evolution of innovations in science & technology", Talk at OECD-KNOWINNO workshop on measuring the use and impact of knowledge exchange mechanisms. November, 2011.
Abstract: The internet and other ICTs have had an important role in promoting the use of datamining tools for assembling, interlinking and analysing information from diverse sources. In this session we will explore how advanced data analytics tools can be used for identifying and measuring knowledge flows between different parties and to what extent they can complement more traditional data sources such as patents, publications and surveys.
BibTeX:
@misc{Dunne11Visualanalytictools,
  author = {Cody Dunne},
  title = {Visual analytic tools for monitoring and understanding the emergence and evolution of innovations in science \& technology},
  howpublished = {Talk at OECD-KNOWINNO workshop on measuring the use and impact of knowledge exchange mechanisms},
  year = {2011},
  url = {http://www.oecd.org/sti/knowledge}
}
Dunne, C. (2011), "What researchers want", Talk at STM 3rd Master Class on Developing Leadership and Innovation. November, 2011.
BibTeX:
@misc{Dunne11Whatresearcherswant,
  author = {Cody Dunne},
  title = {What researchers want},
  howpublished = {Talk at STM 3rd Master Class on Developing Leadership and Innovation},
  year = {2011},
  url = {http://www.stm-assoc.org/events/3rd-master-class-usa-2011/}
}
Dunne, C. (2011), "Action Science Explorer: interactive data visualization for rapid understanding of scientific literature", Talk at STM Annual Spring Conference. April, 2011.
Abstract: We developed Action Science Explorer (ASE), a tool designed to support users in rapidly generating easily consumable summaries of academic literature. ASE uses bibliometric lexical link mining to create a citation network for a field and context for each citation, automatic clustering and multi-document summarization techniques to extract key points, and potent network analysis and visualization tools to aid in the exploration task. These techniques provide several coordinated views of the underlying data.
BibTeX:
@misc{Dunne11ActionScienceExplorer_a,
  author = {Cody Dunne},
  title = {Action Science Explorer: interactive data visualization for rapid understanding of scientific literature},
  howpublished = {Talk at STM Annual Spring Conference},
  year = {2011},
  url = {http://www.stm-assoc.org/events/stm-annual-spring-conference-2011/}
}
Shneiderman, B. (2011), "Information visualization: A transformative technology for ACS", Talk at American Chemical Society. April, 2011.
BibTeX:
@misc{Shneiderman11ACS,
  author = {Ben Shneiderman},
  title = {Information visualization: A transformative technology for ACS},
  howpublished = {Talk at American Chemical Society},
  year = {2011}
}
Shneiderman, B. (2011), "Information visualization for knowledge discovery", Talk at Emory University Goizueta, College of Business Seminar. March, 2011.
BibTeX:
@misc{Shneiderman11Emory,
  author = {Ben Shneiderman},
  title = {Information visualization for knowledge discovery},
  howpublished = {Talk at Emory University Goizueta, College of Business Seminar},
  year = {2011}
}
Shneiderman, B. (2011), "Social discovery in an information abundant world", Talk at National Federation for Advanced Information Systems: Miles Conrad Award Lecture. February, 2011.
BibTeX:
@misc{Shneiderman11NFAIS,
  author = {Ben Shneiderman},
  title = {Social discovery in an information abundant world},
  howpublished = {Talk at National Federation for Advanced Information Systems: Miles Conrad Award Lecture},
  year = {2011}
}
Shneiderman, B. (2011), "Information visualization for knowledge discovery", Talk at University of North Carolina--Charlotte, Dept of Computer Science. April, 2011.
BibTeX:
@misc{Shneiderman11UNC,
  author = {Ben Shneiderman},
  title = {Information visualization for knowledge discovery},
  howpublished = {Talk at University of North Carolina--Charlotte, Dept of Computer Science},
  year = {2011},
  url = {http://cci.uncc.edu/?q=events/distinguished-lecture-series-professor-ben-shneiderman}
}
Shneiderman, B. (2011), "Success stories in visual analytics", Panel at VAC '11: Visual Analytics Consortium Meeting. May, 2011.
BibTeX:
@misc{Shneiderman11VAC,
  author = {Ben Shneiderman},
  title = {Success stories in visual analytics},
  howpublished = {Panel at VAC '11: Visual Analytics Consortium Meeting},
  year = {2011},
  url = {http://vacommunity.org/VAC+Consortium+2011+Meeting}
}
Shneiderman, B. (2011), "Information visualization for knowledge discovery", Talk at Yale University, Dept of Computer Science. April, 2011.
BibTeX:
@misc{Shneiderman11Yale,
  author = {Ben Shneiderman},
  title = {Information visualization for knowledge discovery},
  howpublished = {Talk at Yale University, Dept of Computer Science},
  year = {2011},
  url = {http://www.cs.yale.edu/calendars/shneiderman.html}
}
Shneiderman, B. (2011), "Information visualization for knowledge discovery", Talk at Georgia Tech Graphics, Visualization & Usability Brown Bag Lunch Seminar. March, 2011.
BibTeX:
@misc{Shniederman11GaTech,
  author = {Ben Shneiderman},
  title = {Information visualization for knowledge discovery},
  howpublished = {Talk at Georgia Tech Graphics, Visualization \& Usability Brown Bag Lunch Seminar},
  year = {2011},
  url = {http://www.gvu.gatech.edu/node/4785}
}
Shneiderman, B. (2010), "Information visualization for knowledge discovery", Talk at IBM Research Center. October, 2010.
BibTeX:
@misc{Shneiderman10IBM,
  author = {Ben Shneiderman},
  title = {Information visualization for knowledge discovery},
  howpublished = {Talk at IBM Research Center},
  year = {2010}
}
Shneiderman, B. (2010), "Information visualization for knowledge discovery", Talk at Wellesley College, Dept of Computer Science. October, 2010.
BibTeX:
@misc{Shneiderman10Wellesley,
  author = {Ben Shneiderman},
  title = {Information visualization for knowledge discovery},
  howpublished = {Talk at Wellesley College, Dept of Computer Science},
  year = {2010}
}
Shneiderman, B. (2010), "Distinguished lecture in computational science: Information visualization for knowledge discovery", Talk at Harvard University, School of Engineering and Applied Sciences. October, 2010.
BibTeX:
@misc{Shniederman10Harvard,
  author = {Ben Shneiderman},
  title = {Distinguished lecture in computational science: Information visualization for knowledge discovery},
  howpublished = {Talk at Harvard University, School of Engineering and Applied Sciences},
  year = {2010},
  url = {http://www.seas.harvard.edu/news-events/calendars/computer_science/distinguished-lecture-in-computational-science-ben-shneiderman-university-of-maryland}
}

Related Public Tools

Other NSF Grant Publications

Journal Papers
Madnani, N. & Dorr, B.J. (2010), "Generating phrasal and sentential paraphrases: A survey of data-driven methods", CL: Computational Linguistics. Vol. 36(3), pp. 341-387.
Abstract: The task of paraphrasing is inherently familiar to speakers of all languages. Moreover, the task of automatically generating or extracting semantic equivalences for the various units of language (words, phrases, and sentences) is an important part of natural language processing (NLP) and is being increasingly employed to improve the performance of several NLP applications. In this article, we attempt to conduct a comprehensive and application-independent survey of data-driven phrasal and sentential paraphrase generation methods, while also conveying an appreciation for the importance and potential use of paraphrases in the field of NLP research. Recent work done in manual and automatic construction of paraphrase corpora is also examined. We also discuss the strategies used for evaluating paraphrase generation techniques and briefly explore some future trends in paraphrase generation.
BibTeX:
@article{Madnani10Generatingphrasaland,
  author = {Nitin Madnani and Bonnie~J. Dorr},
  title = {Generating phrasal and sentential paraphrases: A survey of data-driven methods},
  journal = {CL: Computational Linguistics},
  year = {2010},
  volume = {36},
  number = {3},
  pages = {341--387},
  url = {http://www.mitpressjournals.org/doi/abs/10.1162/coli_a_00002},
  doi = {http://dx.doi.org/10.1162/coli_a_00002}
}
Aris, A., Shneiderman, B., Qazvinian, V. & Radev, D. (2009), "Visual overviews for discovering key papers and influences across research fronts", JASIST: Journal of the American Society for Information Science and Technology. Vol. 60(11), pp. 2219-2228.
Abstract: Gaining a rapid overview of an emerging scientific topic, sometimes called research fronts, is an increasingly common task due to the growing amount of interdisciplinary collaboration. Visual overviews that show temporal patterns of paper publication and citation links among papers can help researchers and analysts to see the rate of growth of topics, identify key papers, and understand influences across subdisciplines. This article applies a novel network-visualization tool based on meaningful layouts of nodes to present research fronts and show citation links that indicate influences across research fronts. To demonstrate the value of two-dimensional layouts with multiple regions and user control of link visibility, we conducted a design-oriented, preliminary case study with 6 domain experts over a 4-month period. The main benefits were being able (a) to easily identify key papers and see the increasing number of papers within a research front, and (b) to quickly see the strength and direction of influence across related research fronts.
BibTeX:
@article{Aris09Visualoverviewsdiscovering,
  author = {Aleks Aris and Ben Shneiderman and Vahed Qazvinian and Dragomir Radev},
  title = {Visual overviews for discovering key papers and influences across research fronts},
  journal = {JASIST: Journal of the American Society for Information Science and Technology},
  year = {2009},
  volume = {60},
  number = {11},
  pages = {2219--2228},
  url = {http://www3.interscience.wiley.com/journal/122499081/abstract},
  doi = {http://dx.doi.org/10.1002/asi.21160}
}
Radev, D.R., Joseph, M.T., Gibson, B. & Muthukrishnan, P. (2009), "A bibliometric and network analysis of the field of computational linguistics", JASIST: Journal of the American Society for Information Science and Technology. John Wiley & Sons.
Abstract: The ACL Anthology is a large collection of research papers in computational linguistics. Citation data was obtained using text extraction from a collection of PDF files with significant manual post-processing performed to clean up the results. Manual annotation of the references was then performed to complete the citation network. We analyzed the networks of paper citations, author citations, and author collaborations in an attempt to identify the most central papers and authors. Also, we propose an improved method for comparing different measures of impact based on correlation. The analysis includes general network statistics, PageRank, metrics across publication years and venues, impact factor and h-index, as well as other measures.
BibTeX:
@article{Radev09bibliometricandnetwork,
  author = {Dragomir R. Radev and Mark Thomas Joseph and Bryan Gibson and Pradeep Muthukrishnan},
  title = {A bibliometric and network analysis of the field of computational linguistics},
  journal = {JASIST: Journal of the American Society for Information Science and Technology},
  publisher = {John Wiley \& Sons},
  year = {2009},
  note = {To appear.},
  url = {http://clair.si.umich.edu/~radev/papers/biblio.pdf}
}
Elkiss, A., Shen, S., Fader, A., Erkan, G., States, D. & Radev, D.R. (2008), "Blind men and elephants: What do citation summaries tell us about a research article?", JASIST: Journal of the American Society for Information Science and Technology. Vol. 59(1), pp. 51-62. Wiley Subscription Services, Inc., A Wiley Company.
Abstract: The old Asian legend about the blind men and the elephant comes to mind when looking at how different authors of scientific papers describe a piece of related prior work. It turns out that different citations to the same paper often focus on different aspects of that paper and that neither provides a full description of its full set of contributions. In this article, we will describe our investigation of this phenomenon. We studied citation summaries in the context of research papers in the biomedical domain. A citation summary is the set of citing sentences for a given article and can be used as a surrogate for the actual article in a variety of scenarios. It contains information that was deemed by peers to be important. Our study shows that citation summaries overlap to some extent with the abstracts of the papers and that they also differ from them in that they focus on different aspects of these papers than do the abstracts. In addition to this, co-cited articles (which are pairs of articles cited by another article) tend to be similar. We show results based on a lexical similarity metric called cohesion to justify our claims.
BibTeX:
@article{Elkiss08Blindmenand,
  author = {Aaron Elkiss and Siwei Shen and Anthony Fader and Güneş Erkan and David States and Dragomir~R. Radev},
  title = {Blind men and elephants: What do citation summaries tell us about a research article?},
  journal = {JASIST: Journal of the American Society for Information Science and Technology},
  publisher = {Wiley Subscription Services, Inc., A Wiley Company},
  year = {2008},
  volume = {59},
  number = {1},
  pages = {51--62},
  url = {http://onlinelibrary.wiley.com/doi/10.1002/asi.20707/abstract},
  doi = {http://dx.doi.org/10.1002/asi.20707}
}
Books and Other One-Time Publications
Abu-Jbara, A. & Radev, D. (2011), "Coherent citation-based summarization of scientific papers", In HLT '11: Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. June 2011., pp. 500-509. Association for Computational Linguistics.
Abstract: In citation-based summarization, text written by several researchers is leveraged to identify the important aspects of a target paper. Previous work on this problem focused almost exclusively on its extraction aspect (i.e. selecting a representative set of citation sentences that highlight the contribution of the target paper). Meanwhile, the fluency of the produced summaries has been mostly ignored. For example, diversity, readability, cohesion, and ordering of the sentences included in the summary have not been thoroughly considered. This resulted in noisy and confusing summaries. In this work, we present an approach for producing readable and cohesive citation-based summaries. Our experiments show that the proposed approach outperforms several baselines in terms of both extraction quality and fluency.
BibTeX:
@inproceedings{Abu-Jbara11Coherentcitation-basedsummarization,
  author = {Amjad Abu-Jbara and Dragomir Radev},
  title = {Coherent citation-based summarization of scientific papers},
  booktitle = {HLT '11: Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  publisher = {Association for Computational Linguistics},
  year = {2011},
  pages = {500--509},
  url = {http://www.aclweb.org/anthology/P11-1051}
}
Hu, Y., Boyd-Graber, J. & Satinoff, B. (2011), "Interactive topic modeling", In HLT '11: Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. June 2011., pp. 248-257. Association for Computational Linguistics.
Abstract: Topic models have been used extensively as a tool for corpus exploration, and a cottage industry has developed to tweak topic models to better encode human intuitions or to better model data. However, creating such extensions requires expertise in machine learning unavailable to potential end-users of topic modeling software. In this work, we develop a framework for allowing users to iteratively refine the topics discovered by models such as latent Dirichlet allocation (LDA) by adding constraints that enforce that sets of words must appear together in the same topic. We incorporate these constraints interactively by selectively removing elements in the state of a Markov Chain used for inference; we investigate a variety of methods for incorporating this information and demonstrate that these interactively added constraints improve topic usefulness for simulated and actual user sessions.
BibTeX:
@inproceedings{Hu11Interactivetopicmodeling,
  author = {Yuening Hu and Jordan Boyd-Graber and Brianna Satinoff},
  title = {Interactive topic modeling},
  booktitle = {HLT '11: Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  publisher = {Association for Computational Linguistics},
  year = {2011},
  pages = {248--257},
  url = {http://www.aclweb.org/anthology/P11-1026}
}
Muthukrishnan, P., Radev, D. & Mei, Q. (2011), "Simultaneous similarity learning and feature-weight learning for document clustering", In HLT-TextGraphs '11: Proc. TextGraphs-6 workshop on Graph-based Methods for Natural Language Processing. Portland, OR. June 2011., pp. 42-50. Association for Computational Linguistics.
Abstract: A key problem in document classification and clustering is learning the similarity between documents. Traditional approaches include estimating similarity between feature vectors of documents where the vectors are computed using TF-IDF in the bag-of-words model. However, these approaches do not work well when either similar documents do not use the same vocabulary or the feature vectors are not estimated correctly. In this paper, we represent documents and keywords using multiple layers of connected graphs. We pose the problem of simultaneously learning similarity between documents and keyword weights as an edge-weight regularization problem over the different layers of graphs. Unlike most feature weight learning algorithms, we propose an unsupervised algorithm in the proposed framework to simultaneously optimize similarity and the keyword weights. We extrinsically evaluate the performance of the proposed similarity measure on two different tasks, clustering and classification. The proposed similarity measure outperforms the similarity measure proposed by (Muthukrishnan et al., 2010), a state-of-the-art classification algorithm (Zhou and Burges, 2007) and three different baselines on a variety of standard, large data sets.
BibTeX:
@inproceedings{Muthukrishnan11Simultaneoussimilaritylearning,
  author = {Pradeep Muthukrishnan and Dragomir Radev and Qiaozhu Mei},
  title = {Simultaneous similarity learning and feature-weight learning for document clustering},
  booktitle = {HLT-TextGraphs '11: Proc. TextGraphs-6 workshop on Graph-based Methods for Natural Language Processing},
  publisher = {Association for Computational Linguistics},
  year = {2011},
  pages = {42--50},
  url = {http://www.aclweb.org/anthology-new/W/W11/W11-1107.pdf}
}
Qazvinian, V. & Radev, D.R. (2011), "Exploiting phase transition in similarity networks for clustering", In AAAI '11: Proc. 25th Conference on Artificial Intelligence. August 2011.
Abstract: In this paper, we model the pair-wise similarities of a set of documents as a weighted network with a single cutoff parameter. Such a network can be thought of as an ensemble of unweighted graphs, each consisting of edges with weights greater than the cutoff value. We look at this network ensemble as a complex system with a temperature parameter, and refer to it as a Latent Network. Our experiments on a number of datasets from two different domains show that certain properties of latent networks like clustering coefficient, average shortest path, and connected components exhibit patterns that are significantly divergent from randomized networks. We explain that these patterns reflect the network phase transition as well as the existence of a community structure in document collections. Using numerical analysis, we show that we can use the aforementioned network properties to predict the clustering Normalized Mutual Information (NMI) with high correlation (> 0.9). Finally, we show that our clustering method significantly outperforms other baseline methods (NMI > 0.5).
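Illustration (not from the paper): the abstract above describes thresholding a weighted similarity network at a cutoff to obtain a family of unweighted graphs. The following is a minimal sketch of that idea, assuming a toy set of pairwise similarities; the function names, the toy weights, and the choice of connected-component count as the tracked property are all assumptions made for illustration, not the authors' code.

```python
def threshold_graph(weights, cutoff):
    """Keep only the pairs whose similarity is strictly greater than the cutoff."""
    return {pair for pair, w in weights.items() if w > cutoff}


def count_components(nodes, edges):
    """Count connected components with a small union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def component_profile(nodes, weights, cutoffs):
    """For each cutoff, report how many components the thresholded graph has."""
    return [(c, count_components(nodes, threshold_graph(weights, c)))
            for c in cutoffs]


# Hypothetical toy data: four documents and their pairwise similarities.
NODES = ["d1", "d2", "d3", "d4"]
WEIGHTS = {
    ("d1", "d2"): 0.9, ("d1", "d3"): 0.4, ("d2", "d3"): 0.5,
    ("d3", "d4"): 0.2, ("d1", "d4"): 0.1, ("d2", "d4"): 0.1,
}

if __name__ == "__main__":
    for cutoff, n in component_profile(NODES, WEIGHTS, [0.0, 0.3, 0.6, 0.95]):
        print(f"cutoff={cutoff:.2f} components={n}")
```

Raising the cutoff removes edges, so the graph breaks into more components; the paper studies how properties of this kind change across the ensemble.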
BibTeX:
@inproceedings{Qazvinian11Exploitingphasetransition,
  author = {Vahed Qazvinian and Dragomir~R. Radev},
  title = {Exploiting phase transition in similarity networks for clustering},
  booktitle = {AAAI '11: Proc. 25th Conference on Artificial Intelligence},
  year = {2011},
  url = {http://www-personal.umich.edu/~vahed/papers/latent.pdf}
}
Qazvinian, V. & Radev, D.R. (2011), "Learning from human collective behavior to introduce diversity in summary generation", In HLT '11: Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. June 2011., pp. 1098-1108. Association for Computational Linguistics.
Abstract: We analyze collective discourse, a collective human behavior in content generation, and show that it exhibits diversity, a property of general collective systems. Using extensive analysis, we propose a novel paradigm for designing summary generation systems that reflect the diversity of perspectives seen in real-life collective summarization. We analyze 50 sets of summaries written by humans about the same story or artifact and investigate the diversity of perspectives across these summaries. We show how different summaries use various phrasal information units (i.e., nuggets) to express the same atomic semantic units, called factoids. Finally, we present a ranker that employs distributional similarities to build a network of words, and captures the diversity of perspectives by detecting communities in this network. Our experiments show how our system outperforms a wide range of other document ranking systems that leverage diversity.
BibTeX:
@inproceedings{Qazvinian11Learningfromhuman,
  author = {Qazvinian, Vahed and Radev, Dragomir R.},
  title = {Learning from human collective behavior to introduce diversity in summary generation},
  booktitle = {HLT '11: Proc. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  publisher = {Association for Computational Linguistics},
  year = {2011},
  pages = {1098--1108},
  url = {http://www.aclweb.org/anthology-new/P/P11/P11-1110.pdf}
}
Satinoff, B. & Boyd-Graber, J. (2011), "Trivial classification: What features do humans use for classification?", In Workshop on Crowdsourcing Technologies for Language and Cognition Studies.
BibTeX:
@inproceedings{Satinoff11Trivialclassification_What,
  author = {Brianna Satinoff and Jordan Boyd-Graber},
  title = {Trivial classification: What features do humans use for classification?},
  booktitle = {Workshop on Crowdsourcing Technologies for Language and Cognition Studies},
  year = {2011},
  url = {http://www.crowdscientist.com/wp-content/uploads/2011/06/Boyd-Graber.pdf}
}
Whidby, M., Zajic, D. & Dorr, B.J. (2011), "Citation handling for improved summarization of scientific documents". University of Maryland, Technical Report LAMP-TR-157, 2011.
Abstract: In this paper we present the first steps toward improving summarization of scientific documents through citation analysis and parsing. Prior work (Mohammad et al., 2009) argues that citation texts (sentences that cite other papers) play a crucial role in automatic summarization of a topical area, but did not take into account the noise introduced by the citations themselves. We demonstrate that it is possible to improve summarization output through careful handling of these citations. We base our experiments on the application of an improved trimming approach to summarization of citation texts extracted from Question-Answering and Dependency-Parsing documents. We demonstrate that confidence scores from the Stanford NLP Parser (Klein and Manning, 2003) are significantly improved, and that Trimmer (Zajic et al., 2007), a sentence-compression tool, is able to generate higher-quality candidates. Our summarization output is currently used as part of a larger system, Action Science Explorer (ASE) (Gove, 2011).
BibTeX:
@techreport{Whidby11Citationhandlingimproved,
  author = {Michael Whidby and David Zajic and Bonnie~J. Dorr},
  title = {Citation handling for improved summarization of scientific documents},
  year = {2011},
  number = {LAMP-TR-157},
  url = {http://hdl.handle.net/1903/11822}
}
Lin, J., Madnani, N. & Dorr, B. (2010), "Putting the user in the loop: Interactive maximal marginal relevance for query-focused summarization", In HLT '10: Proc. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. June 2010., pp. 305-308. Association for Computational Linguistics.
Abstract: This work represents an initial attempt to move beyond "single-shot" summarization to interactive summarization. We present an extension to the classic Maximal Marginal Relevance (MMR) algorithm that places a user "in the loop" to assist in candidate selection. Experiments in the complex interactive Question Answering (ciQA) task at TREC 2007 show that interactively-constructed responses are significantly higher in quality than automatically-generated ones. This novel algorithm provides a starting point for future work on interactive summarization.
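Illustration (not from the paper): the abstract above builds on the classic Maximal Marginal Relevance (MMR) algorithm. The sketch below shows only the classic, fully automatic greedy selection, not the interactive extension the paper proposes. The Jaccard token-overlap similarity, the `lam` trade-off parameter name, and the example sentences are assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Token-overlap similarity between two strings (an assumed stand-in)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def mmr_select(query, candidates, k, lam=0.7, sim=jaccard):
    """Greedily pick k candidates, trading relevance against redundancy.

    Each step scores a candidate c as
        lam * sim(c, query) - (1 - lam) * max(sim(c, s) for s already selected)
    and takes the highest-scoring one.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(c):
            redundancy = max((sim(c, s) for s in selected), default=0.0)
            return lam * sim(c, query) - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    query = "citation summarization"
    sentences = [
        "citation summarization of papers",
        "citation summarization of articles",
        "summarization with graphs",
    ]
    print(mmr_select(query, sentences, k=2, lam=0.5))
```

With `lam=1.0` the selection reduces to ranking by relevance alone; lower values penalize candidates that overlap with sentences already chosen.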
BibTeX:
@inproceedings{Lin10Puttinguserin,
  author = {Jimmy Lin and Nitin Madnani and Bonnie Dorr},
  title = {Putting the user in the loop: Interactive maximal marginal relevance for query-focused summarization},
  booktitle = {HLT '10: Proc. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  publisher = {Association for Computational Linguistics},
  year = {2010},
  pages = {305--308},
  url = {http://www.aclweb.org/anthology-new/N/N10/N10-1041.pdf}
}
Qazvinian, V., Radev, D.R. & Özgür, A. (2010), "Citation summarization through keyphrase extraction", In COLING '10: Proc. 23rd International Conference on Computational Linguistics. August 2010., pp. 895-903.
Abstract: This paper presents an approach to summarize single scientific papers, by extracting its contributions from the set of citation sentences written in other papers. Our methodology is based on extracting significant keyphrases from the set of citation sentences and using these keyphrases to build the summary. Comparisons show how this methodology excels at the task of single paper summarization, and how it out-performs other multi-document summarization methods.
BibTeX:
@inproceedings{Qazvinian10CitationSummarizationThrough,
  author = {Vahed Qazvinian and Dragomir~R. Radev and Arzucan Özgür},
  title = {Citation summarization through keyphrase extraction},
  booktitle = {COLING '10: Proc. 23rd International Conference on Computational Linguistics},
  year = {2010},
  pages = {895--903},
  url = {http://aclweb.org/anthology/C/C10/C10-1101.pdf}
}
Qazvinian, V. & Radev, D.R. (2010), "Identifying non-explicit citing sentences for citation-based summarization", In ACL '10: Proc. 48th Annual Meeting of the Association for Computational Linguistics. July 2010., pp. 555-564.
Abstract: Identifying background (context) information in scientific articles can help scholars understand major contributions in their research area more easily. In this paper, we propose a general framework based on probabilistic inference to extract such context information from scientific papers. We model the sentences in an article and their lexical similarities as a Markov Random Field tuned to detect the patterns that context data create, and employ a Belief Propagation mechanism to detect likely context sentences. We also address the problem of generating surveys of scientific papers. Our experiments show greater pyramid scores for surveys generated using such context information rather than citation sentences alone.
BibTeX:
@inproceedings{Qazvinian10Identifyingnon-explicitciting,
  author = {Vahed Qazvinian and Dragomir~R. Radev},
  title = {Identifying non-explicit citing sentences for citation-based summarization},
  booktitle = {ACL '10: Proc. 48th Annual Meeting of the Association for Computational Linguistics},
  year = {2010},
  pages = {555--564},
  url = {http://www.aclweb.org/anthology-new/P/P10/P10-1057.pdf}
}
Mohammad, S., Dunne, C. & Dorr, B. (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09: Proc. 2009 conference on Empirical Methods in Natural Language Processing. Morristown, NJ, USA. August 2009., pp. 599-608. Association for Computational Linguistics.
Abstract: Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, and usually rely on significant manual annotation and large corpora. Most of these methods use WordNet. In contrast, we propose a simple approach to generate a high-coverage semantic orientation lexicon, which includes both individual words and multi-word expressions, using only a Roget-like thesaurus and a handful of affixes. Further, the lexicon has properties that support the Pollyanna Hypothesis. Using the General Inquirer as gold standard, we show that our lexicon has 14 percentage points more correct entries than the leading WordNet-based high-coverage lexicon (SentiWordNet). In an extrinsic evaluation, we obtain significantly higher performance in determining phrase polarity using our thesaurus-based lexicon than with any other. Additionally, we explore the use of visualization techniques to gain insight into our algorithm beyond the evaluations mentioned above.
BibTeX:
@inproceedings{Mohammad09Generatinghigh-coveragesemantic,
  author = {Saif Mohammad and Cody Dunne and Bonnie Dorr},
  title = {Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus},
  booktitle = {EMNLP '09: Proc. 2009 conference on Empirical Methods in Natural Language Processing},
  publisher = {Association for Computational Linguistics},
  year = {2009},
  pages = {599--608},
  url = {http://portal.acm.org/citation.cfm?id=1699571.1699591},
  doi = {http://dx.doi.org/10.1145/1699571.1699591}
}
Mohammad, S., Dorr, B., Egan, M., Hassan, A., Muthukrishnan, P., Qazvinian, V., Radev, D. & Zajic, D. (2009), "Using citations to generate surveys of scientific paradigms", In HLT/NAACL '09: Proc. Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Stroudsburg, PA, USA., pp. 584-592. Association for Computational Linguistics.
Abstract: The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role.
BibTeX:
@inproceedings{Mohammad09Usingcitationsto,
  author = {Saif Mohammad and Bonnie Dorr and Melissa Egan and Ahmed Hassan and Pradeep Muthukrishnan and Vahed Qazvinian and Dragomir Radev and David Zajic},
  title = {Using citations to generate surveys of scientific paradigms},
  booktitle = {HLT/NAACL '09: Proc. Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  publisher = {Association for Computational Linguistics},
  year = {2009},
  pages = {584--592},
  url = {http://portal.acm.org/citation.cfm?id=1620754.1620839},
  doi = {http://dx.doi.org/10.3115/1620754.1620839}
}
Qazvinian, V. & Radev, D.R. (2009), "The evolution of scientific title networks", In ICWSM '09: Proc. 2009 International AAAI Conference on Weblogs and Social Media poster session.
Abstract: In spite of enormous previous efforts to model the growth of various networks, there have only been a few works that successfully describe the evolution of latent networks. In a latent network edges do not represent interactions between nodes, but show some proximity values. In this paper we analyze the structure and evolution of a specific type of latent networks over time by looking at a wide range of document similarity networks, in which scientific titles are nodes and their similarities are weighted edges. We use scientific papers as the corpora in order to determine the behavior of authors in choosing words for article titles. The aim of our work is to see whether term selection for titles depends on earlier published titles.
BibTeX:
@inproceedings{Qazvinian09evolutionofscientific,
  author = {Vahed Qazvinian and Dragomir~R. Radev},
  title = {The evolution of scientific title networks},
  booktitle = {ICWSM '09: Proc. 2009 International AAAI Conference on Weblogs and Social Media poster session},
  year = {2009},
  url = {http://tangra.si.umich.edu/clair/iopener/pdf/icwsm.pdf}
}
Radev, D.R., Muthukrishnan, P. & Qazvinian, V. (2009), "The ACL Anthology Network corpus", In NLPIR4DL '09: Proc. ACL-IJCNLP 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries. Stroudsburg, PA, USA., pp. 54-61. Association for Computational Linguistics.
Abstract: We introduce the ACL Anthology Network (AAN), a manually curated networked database of citations, collaborations, and summaries in the field of Computational Linguistics. We also present a number of statistics about the network including the most cited authors, the most central collaborators, as well as network statistics about the paper citation, author citation, and author collaboration networks.
BibTeX:
@inproceedings{Radev09ACLAnthologyNetwork,
  author = {Dragomir~R. Radev and Pradeep Muthukrishnan and Vahed Qazvinian},
  title = {The ACL Anthology Network corpus},
  booktitle = {NLPIR4DL '09: Proc. ACL-IJCNLP 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries},
  publisher = {Association for Computational Linguistics},
  year = {2009},
  pages = {54--61},
  url = {http://portal.acm.org/citation.cfm?id=1699750.1699759},
  doi = {http://dx.doi.org/10.3115/1699750.1699759}
}
Aris, A. (2008), "Visualizing and exploring networks using Semantic Substrates". School: University of Maryland, Department of Computer Science.
Abstract: Visualizing and exploring network data has been a challenging problem for HCI (Human-Computer Interaction) Information Visualization researchers due to the complexity of representing networks (graphs). Research in this area has concentrated on improving the visual organization of nodes and links according to graph drawing aesthetics criteria, such as minimizing link crossings and the longest link length. Semantic substrates offer a different approach by which node locations represent node attributes. Users define semantic substrates for a given dataset according to the dataset characteristics and the questions, needs, and tasks of users. The substrates are typically 2-5 non-overlapping rectangular regions that meaningfully lay out the nodes of the network, based on the node attributes. Link visibility filters are provided to enable users to limit link visibility to those within or across regions. The reduced clutter and visibility of only selected links are designed to help users find meaningful relationships. This dissertation presents 5 detailed case studies (3 long-term and 2 short-term) that report on sessions with professional users working on their own datasets using successive versions of the NVSS (Network Visualization by Semantic Substrates, http://www.cs.umd.edu/hcil/nvss) software tool. Applications include legal precedent (with court cases citing one another), food-web (predator-prey relationships) data, scholarly paper citations, and U. S. Senate voting patterns. These case studies, which had networks of up to 4,296 nodes and 16,385 links, helped refine NVSS and the semantic substrate approach, as well as understand its limitations. The case study approach enabled users to gain insights and form hypotheses about their data, while providing guidance for NVSS revisions. The proposed guidelines for semantic substrate definitions are potentially applicable to other datasets such as social networks, business networks, and email communication. 
NVSS appears to be an effective tool because it offers a user-controlled and understandable method of exploring networks. The main contributions of this dissertation include the extensive exploration of semantic substrates, implementation of software to define substrates, guidelines to design good substrates, and case studies to illustrate the applicability of the approach to various domains and its benefits.
BibTeX:
@phdthesis{Aris08Visualizingandexploring,
  author = {Aleks Aris},
  title = {Visualizing and exploring networks using Semantic Substrates},
  school = {University of Maryland, Department of Computer Science},
  year = {2008},
  url = {http://hdl.handle.net/1903/8619}
}
Bird, S., Dale, R., Dorr, B.J., Gibson, B., Joseph, M., Kan, M.-Y., Lee, D., Powley, B., Radev, D. & Tan, Y.F. (2008), "The ACL Anthology Reference Corpus: A reference dataset for bibliographic research in computational linguistics", In LREC '08: Proc. Sixth International Language Resources and Evaluation. May 2008., pp. 1755-1759. European Language Resources Association (ELRA).
BibTeX:
@inproceedings{Bird08ACLAnthologyReference,
  author = {Steven Bird and Robert Dale and Bonnie J. Dorr and Bryan Gibson and Mark Joseph and Min-Yen Kan and Dongwon Lee and Brett Powley and Dragomir Radev and Yee Fan Tan},
  title = {The ACL Anthology Reference Corpus: A reference dataset for bibliographic research in computational linguistics},
  booktitle = {LREC '08: Proc. Sixth International Language Resources and Evaluation},
  publisher = {European Language Resources Association (ELRA)},
  year = {2008},
  pages = {1755--1759},
  url = {http://www.lrec-conf.org/proceedings/lrec2008/pdf/445_paper.pdf}
}
Dorr, B., Mohammad, S. & Onyshkevych, B. (2008), "From linguistic annotations to knowledge objects", In SKDOU '08: Proc. Symposium on Semantic Knowledge Discovery, Organization and Use. November 2008.
BibTeX:
@inproceedings{Dorr08Fromlinguisticannotations,
  author = {Bonnie Dorr and Saif Mohammad and Boyan Onyshkevych},
  title = {From linguistic annotations to knowledge objects},
  booktitle = {SKDOU '08: Proc. Symposium on Semantic Knowledge Discovery, Organization and Use},
  year = {2008},
  url = {http://nlp.cs.nyu.edu/sk-symposium/program.html}
}
Klavans, J., Shneiderman, B. & others (2008), "Motivating interactive summarizations: User guided exploration of new domains". Draft manuscript.
BibTeX:
@unpublished{Klavans08Motivatinginteractivesummarizations_,
  author = {Judith Klavans and Ben Shneiderman and others},
  title = {Motivating interactive summarizations: User guided exploration of new domains},
  year = {2008},
  note = {Draft manuscript}
}
Mohammad, S., Dorr, B.J. & Hirst, G. (2008), "Computing word-pair antonymy", In EMNLP '08: Proc. 2008 conference on Empirical Methods in Natural Language Processing. October 2008., pp. 982-991. Association for Computational Linguistics.
Abstract: Knowing the degree of antonymy between words has widespread applications in natural language processing. Manually-created lexicons have limited coverage and do not include most semantically contrasting word pairs. We present a new automatic and empirical measure of antonymy that combines corpus statistics with the structure of a published thesaurus. The approach is evaluated on a set of closest-opposite questions, obtaining a precision of over 80%. Along the way, we discuss what humans consider antonymous and how antonymy manifests itself in utterances.
BibTeX:
@inproceedings{Mohammad08Computingword-pairantonymy,
  author = {Saif Mohammad and Bonnie~J. Dorr and Graeme Hirst},
  title = {Computing word-pair antonymy},
  booktitle = {EMNLP '08: Proc. 2008 conference on Empirical Methods in Natural Language Processing},
  publisher = {Association for Computational Linguistics},
  year = {2008},
  pages = {982--991},
  url = {http://www.aclweb.org/anthology-new/D/D08/D08-1103.pdf}
}
Mohammad, S., Dorr, B., Egan, M., Madnani, N., Zajic, D. & Lin, J. (2008), "Multiple alternative sentence compressions and word-pair antonymy for automatic text summarization and recognizing textual entailment", In TAC '08: Text Analysis Conference. November 2008.
Abstract: The University of Maryland participated in three tasks organized by the Text Analysis Conference 2008 (TAC 2008): (1) the update task of text summarization; (2) the opinion task of text summarization; and (3) recognizing textual entailment (RTE). At the heart of our summarization system is Trimmer, which generates multiple alternative compressed versions of the source sentences that act as candidate sentences for inclusion in the summary. For the first time, we investigated the use of automatically generated antonym pairs for both text summarization and recognizing textual entailment. The UMD summaries for the opinion task were especially effective in providing non-redundant information (rank 3 out of a total 19 submissions). More coherent summaries resulted when using the antonymy feature as compared to when not using it. On the RTE task, even when using only automatically generated antonyms the system performed as well as when using a manually compiled list of antonyms.
BibTeX:
@inproceedings{Mohammad08Multiplealternativesentence,
  author = {Saif Mohammad and Bonnie Dorr and Melissa Egan and Nitin Madnani and David Zajic and Jimmy Lin},
  title = {Multiple alternative sentence compressions and word-pair antonymy for automatic text summarization and recognizing textual entailment},
  booktitle = {TAC '08: Text Analysis Conference},
  year = {2008},
  url = {http://www.nist.gov/tac/publications/2008/participant.papers/UMD.proceedings.pdf}
}
Mohammad, S., Dorr, B. & Hirst, G. (2008), "Towards antonymy-aware natural language applications", In SKDOU '08: Proc. Symposium on Semantic Knowledge Discovery, Organization and Use. November 2008.
BibTeX:
@inproceedings{Mohammad08Towardsantonymy-awarenatural,
  author = {Saif Mohammad and Bonnie Dorr and Graeme Hirst},
  title = {Towards antonymy-aware natural language applications},
  booktitle = {SKDOU '08: Proc. Symposium on Semantic Knowledge Discovery, Organization and Use},
  year = {2008},
  url = {http://nlp.cs.nyu.edu/sk-symposium/program.html}
}
Muthukrishnan, P., Gerrish, J. & Radev, D.R. (2008), "Detecting multiple facets of an event using graph-based unsupervised methods", In COLING '08: Proc. 22nd International Conference on Computational Linguistics. August 2008., pp. 609-616.
Abstract: We propose a new unsupervised method for topic detection that automatically identifies the different facets of an event. We use pointwise Kullback-Leibler divergence along with the Jaccard coefficient to build a topic graph which represents the community structure of the different facets. The problem is formulated as a weighted set cover problem with dynamically varying weights. The algorithm is domain-independent and generates a representative set of informative and discriminative phrases that cover the entire event. We evaluate this algorithm on a large collection of blog postings about different news events and report promising results.
BibTeX:
@inproceedings{Muthukrishnan08Detectingmultiplefacets,
  author = {Pradeep Muthukrishnan and Joshua Gerrish and Dragomir~R. Radev},
  title = {Detecting multiple facets of an event using graph-based unsupervised methods},
  booktitle = {COLING '08: Proc. 22nd International Conference on Computational Linguistics},
  year = {2008},
  pages = {609--616},
  url = {http://www.aclweb.org/anthology-new/C/C08/C08-1077.pdf}
}
Qazvinian, V. & Radev, D.R. (2008), "Scientific paper summarization using citation summary networks", In COLING '08: Proc. 22nd International Conference on Computational Linguistics. Stroudsburg, PA, USA., pp. 689-696. Association for Computational Linguistics.
Abstract: Quickly moving to a new area of research is painful for researchers due to the vast amount of scientific literature in each field of study. One possible way to overcome this problem is to summarize a scientific topic. In this paper, we propose a model of summarizing a single article, which can be further used to summarize an entire topic. Our model is based on analyzing others' viewpoint of the target article's contributions and the study of its citation summary network using a clustering approach.
BibTeX:
@inproceedings{Qazvinian08Scientificpapersummarization,
  author = {Vahed Qazvinian and Dragomir~R. Radev},
  title = {Scientific paper summarization using citation summary networks},
  booktitle = {COLING '08: Proc. 22nd International Conference on Computational Linguistics},
  publisher = {Association for Computational Linguistics},
  year = {2008},
  pages = {689--696},
  url = {http://portal.acm.org/citation.cfm?id=1599081.1599168},
  doi = {http://dx.doi.org/10.3115/1599081.1599168}
}
Shneiderman, B. (2008), "Research agenda: Visual overviews for exploratory search", In National Science Foundation Workshop on Information Seeking Support Systems. June 2008.
Abstract: Exploratory search is necessary when users' knowledge of the domain is incomplete or when initial user goals do not match available data or metadata that is the basis for search indexing attributes. Such mismatches mean that users need to learn more in order to develop a better understanding of the domain or to revise their search goals. Exploratory search processes may take weeks or months, so interfaces that support prolonged exploration are necessary. The attraction of exploratory search is that users can take on more ambitious goals that require substantial learning and creative leaps to bridge the gaps between what they know and what they seek.
BibTeX:
@inproceedings{Shneiderman08Researchagenda_Visual,
  author = {Ben Shneiderman},
  title = {Research agenda: Visual overviews for exploratory search},
  booktitle = {National Science Foundation Workshop on Information Seeking Support Systems},
  year = {2008},
  note = {Position paper, Published Bibliography},
  url = {http://ils.unc.edu/ISSS/papers/papers/shneiderman.pdf}
}
Shneiderman, B., Aris, A. & others (2008), "Visual summarization of topic evaluation and topic dependencies". Draft manuscript.
BibTeX:
@unpublished{Shneiderman08Visualsummarizationof,
  author = {Ben Shneiderman and Aleks Aris and others},
  title = {Visual summarization of topic evaluation and topic dependencies},
  year = {2008},
  note = {Draft manuscript}
}
Joseph, M.T. & Radev, D.R. (2007), "Citation analysis, centrality, and the ACL Anthology". University of Michigan. Department of Electrical Engineering and Computer Science, Technical Report CSE-TR-535-07, 2007.
Abstract: We analyze the ACL Anthology citation network in an attempt to identify the most "central" papers and authors using graph-based methods. Citation data was obtained using text extraction from the library of PDF files with some post-processing performed to clean up the results. Manual annotation of the references was then performed to complete the citation network. The analysis compares metrics across publication years and venues, such as citations in and out. The most cited paper, central papers, and papers with the highest impact factor are also established.
BibTeX:
@techreport{Joseph07Citationanalysiscentrality,
  author = {Mark~T. Joseph and Dragomir~R. Radev},
  title = {Citation analysis, centrality, and the ACL Anthology},
  year = {2007},
  number = {CSE-TR-535-07},
  url = {http://www.eecs.umich.edu/techreports/cse/2007/CSE-TR-535-07.pdf}
}
Radev, D.R., Hodges, M., Fader, A., Joseph, M., Gerrish, J., Schaller, M., dePeri, J. & Gibson, B. (2007), "CLAIRLIB documentation v1.03". University of Michigan. Department of Electrical Engineering and Computer Science, Technical Report CSE-TR-536-07, 2007.
BibTeX:
@techreport{Radev07CLAIRLIBdocumentationv1.03,
  author = {Dragomir~R. Radev and Mark Hodges and Anthony Fader and Mark Joseph and Joshua Gerrish and Mark Schaller and Jonathan dePeri and Bryan Gibson},
  title = {CLAIRLIB documentation v1.03},
  year = {2007},
  number = {CSE-TR-536-07},
  url = {http://www.eecs.umich.edu/techreports/cse/2007/CSE-TR-536-07.pdf}
}