You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format. However, this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the Department of Computer Science of the University of Maryland at College Park under terms that include this permission. All other rights are reserved by the author(s).
Translating English and Mandarin Verbs with Argument Structure. Mari Broman Olsen. October 1998.
This paper applies and evaluates a semi-automatically acquired Mandarin Chinese lexicon (Olsen, Dorr, and Thomas 1998) with respect to translation of English and Chinese verbs in a UNESCO text (Otero 1997). I demonstrate how Lexical Conceptual Structure templates allow the same semantic structure to apply both to verbs with thematic roles incorporated in the verb itself, and those requiring external thematic complements. Using as examples the English verb _provide_, the Chinese counterpart ti2 gong2 (STC 2251 0180) and its English counterparts in the text, I show how potential translations are included or eliminated automatically based on their thematic role structure. The example illustrates (i) how an interlingual thematic representation based in large part on English argument structure may be adapted felicitously to a historically unrelated language, and (ii) how an interlingual (IL) resource developed for analysis may also be used in generation. (Also cross-referenced as UMIACS-TR-98-51) (Also cross-referenced as LAMP-TR-023) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Enhancing Automatic Acquisition of Thematic Structure in a Large-Scale. Mari Broman Olsen. Bonnie Dorr. Scott Thomas. June 1998.
This paper describes a refinement to our procedure for porting lexical conceptual structure into new languages. Specifically, we describe a two-step process for creating candidate thematic grids for Mandarin Chinese verbs: using the English verb heading the VP in the subdefinitions to separate senses, and roughly parsing the verb complement structure to match it to our thematic structure templates. The procedure is part of a larger process of creating a usable lexicon for interlingual machine translation from a large on-line resource that contains both more and less information than our system requires. (Also cross-referenced as UMIACS-TR-98-35) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Toward Compact Monotonically Compositional Interlingua Using Lexical Aspect. Bonnie J. Dorr. Mari Broman Olsen. Scott C. Thomas. December 1997.
We describe a theoretical investigation into the semantic space described by our interlingua (IL), which currently has 191 main verb classes divided into 434 subclasses, represented by 237 distinct Lexical Conceptual Structures (LCSs). Using the monotonic aspectual composition model of aspect in Olsen (1994b, 1997a), we have identified 71 aspectually basic subclasses that are associated with one or more of 68 aspectually non-basic classes via some lexical ("type-shifting") rule (Bresnan 1982, Pinker 1984, Levin and Rappaport Hovav 1995). This allows us to refine the IL and address certain computational and theoretical issues at the same time. (1) From a linguistic viewpoint, the expected benefits include a refinement of the aspectual model in Olsen (1994b, 1997a) (which provides necessary but not sufficient conditions for aspectual composition), and a refinement of the verb classifications in Levin (1993); we also expect our approach to eventually produce a systematic definition (in terms of LCSs and compositional operations) of the precise meaning components responsible for Levin's classification. (2) Computationally, the lexicon is made more compact. (Also cross-referenced as UMIACS-TR-97-86) (Also cross-referenced as LAMP-TR-012) University of Maryland Institute for Advanced Computer Studies, University of Maryland Laboratory for Language and Media Processing, Department of Computer Science, University of Maryland,
Using WordNet to Posit Hierarchical Structure in Levin's Verb Classes. Mari Broman Olsen. Bonnie J. Dorr. David J. Clark. December 1997.
In this paper we report on experiments using WordNet synset tags to evaluate the semantic properties of the verb classes cataloged by Levin (1993). This paper represents ongoing research begun at the University of Pennsylvania (Rosenzweig et al. 1997, Palmer et al. 1997) and the University of Maryland (Dorr and Jones 1996b, 1996d, 1996e). Using WordNet sense tags to constrain the intersection of Levin classes, we avoid spurious class intersections introduced by homonymy and polysemy (_run a bath, run a mile_). By adding class intersections based on a single shared sense-tagged word, we minimize the impact of the non-exhaustiveness of Levin's database (Dorr and Olsen 1996, Dorr to appear). By examining the syntactic properties of the intersective classes, we provide a clearer picture of the relationship between WordNet/EuroWordNet and the LCS interlingua for machine translation and other NLP applications. (Also cross-referenced as UMIACS-TR-97-85) (Also cross-referenced as LAMP-TR-011) University of Maryland Institute for Advanced Computer Studies, University of Maryland Laboratory for Language and Media Processing, Department of Computer Science, University of Maryland,
Aspectual Modifications to a LCS Database for NLP Applications. Bonnie J. Dorr. Mari Broman Olsen. May 1997.
Verbal and compositional lexical aspect provide the underlying temporal structure of events. Knowledge of lexical aspect, e.g., (a)telicity, is therefore required for interpreting event sequences in discourse (Dowty, 1986; Moens and Steedman, 1988; Passonneau, 1988), interfacing to temporal databases (Androutsopoulos, 1996), processing temporal modifiers (Antonisse, 1994), describing allowable alternations and their semantic effects (Resnik, 1996; Tenny, 1994), and selecting tense and lexical items for natural language generation (Dorr and Olsen, 1996; Klavans and Chodorow, 1992; cf. Slobin and Bocaz, 1988). We show that it is possible to represent lexical aspect, both verbal and compositional, on a large scale, using Lexical Conceptual Structure (LCS) representations of verbs in the classes cataloged by Levin (1993). We show how proper consideration of these universal pieces of verb meaning may be used to refine lexical representations and derive a range of meanings from combinations of LCS representations. A single algorithm may therefore be used to determine lexical aspect classes and features at both verbal and sentence levels. Finally, we illustrate how knowledge of lexical aspect facilitates the interpretation of events in NLP applications. (Also cross-referenced as UMIACS-TR-97-21) (Also cross-referenced as LAMP-TR-007) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Telicity and English Verb Classes and Alternations: An Overview. Mari Broman Olsen. February 1996.
This document reports on research conducted for the University of Maryland Machine Translation (MT) project. The primary focus of this investigation is the lexical aspect feature [+telic] (i.e., having an inherent end, as in the verb _win_, vs. the verb _run_) and its relation to the alternations outlined in _English Verb Classes and Alternations_ (Levin, 1993). This work is based on the assumption that lexical aspect features need not be primitive but may be derived from the same semantic components that potentiate the alternations. Levin's 86 alternations and constructions are divided into five classes with respect to telicity: (i) alternations that indicate telicity (all participating verbs are [+telic] in their basic sense), (ii) alternations and constructions that add telicity (all participating verbs are [+telic] in the relevant construction), (iii) alternations that indicate atelicity (all participating verbs are [-telic] in their basic sense), (iv) alternations and constructions that are irrelevant with respect to (a)telicity (some participating verbs are [+telic] and others [-telic], and their categorization is not systematically affected by the relevant construction), and, for completeness, (v) a small number of alternations that cannot be classified. For alternations indicating telicity (category (i)), I examine the semantic components said to potentiate the alternations, and for alternations and constructions adding telicity (category (ii)), the semantic components added along with telicity. The results suggest a composite semantic basis for telicity, related to the notion of change of state (broadly defined), but not perfectly correlated with it. Other notions are also relevant, such as contextually typical degree, reciprocal action, and dynamicity, another lexical aspect feature. In addition, the study of categories (ii)-(iv) reveals that certain frames may be used for diagnosing atelicity, despite its generally variable behavior.
This study also explores the relationship between transitivity and telicity, following suggestions in the work of Hopper and Thompson (1980), Tenny (1987; 1989; 1994), and van Hout (to appear), among others. (Also cross-referenced as UMIACS-TR-96-15) The research reported herein was supported, in part, by Army Research Office contract DAAL03-91-C-0034 through Battelle Corporation, NSF NYI IRI-9357731, Alfred P. Sloan Research Fellow Award BR3336, and a General Research Board Semester Award. University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Last Generated Fri Aug 11 04:01:01 EDT 2000