To see a listing with abstracts
To see a listing without abstracts
You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format. However, this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the Department of Computer Science of the University of Maryland at College Park under terms that include this permission. All other rights are reserved by the author(s).
On the communication-storage minimization for a class of secure. Radha Poovendran. July 2000.
Developing cryptographic key management protocols that have scalability in terms of the key storage as well as key update communication is an important problem in many secure multicast applications~\cite{rb1,dea,wgl}. Wong {\em et al.}~\cite{wgl} and Wallner {\em et al.}~\cite{dea} independently presented the first set of key distribution models where the key update communication grows as ${\cal O}(\log N)$ for group of size $N$. However, the storage requirement of these models were ${\cal O(N)$. Recently~\cite{cmn}, a new model based on clustering of the group members was proposed in order to lower the key storage while maintaining the update communication growth as ${\cal O}(\log N)$. For the new model, by considering the product of the storage and the communication as the cost function, the optimal cluster size $M$ was conjectured to be $M= {\cal O}(\log N)$. In this paper, we show that the optimal value of the cluster can be computed without the product function due the monotonicity of the storage with respect to the cluster size. We show that the optimal cluster size selection of the model in~\cite{cmn} can be formulated as a constraint optimization problem, and then transform it to a fixed point equation of the form $M - \lambda \log_e M = (\beta_2 - \lambda)\log_e N$, where $\beta_2, \lambda$ are model parameters. We first show that the largest root of this equation is the optimal solution, and then compute it by two different techniques. We then show that the first order approximation of the solution is of the form $M \approx (\beta_2 -\lambda)\log_e N + \lambda \log_e \log_e N$, leading to $M \approx (\beta_2 - \lambda) \log_e N$ for large values of $N$. We make a case for use of the estimate $M = (\beta_2 -\lambda) \log_e N + \lambda \log_e \log_e N$ instead of $M = \log_e N$ by showing that even for group size up to $2^{32}$, the value $M = \log_e N + \lambda \log_e \log_e N$ provides significantly lower value of key storage compared to the value $M = \log_e N$. We also show that the best estimate of $M$ using the product function in~\cite{cmn} does not exceed $M = \nu \log_e N$ for a constant $\nu$. (Also cross-referenced as UMIACS-TR-2000-58) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Performance and Analysis of Saddle Point Preconditioners for the. Howard C. Elman. David J. Silvester. Andrew J. Wathen. July 2000.
We examine the convergence characteristics of iterative methods based on a new preconditioning operator for solving the linear systems arising from discretization and linearization of the steady-state Navier-Stokes equations. With a combination of analytic and empirical results, we study the effects of fundamental parameters on convergence. We demonstrate that the preconditioned problem has an eigenvalue distribution consisting of a tightly clustered set together with a small number of outliers. The structure of these distributions is independent of the discretization mesh size, but the cardinality of the set of outliers increases slowly as the viscosity becomes smaller. These characteristics are directly correlated with the convergence properties of iterative solvers. (Also cross-refernced as UMIACS-TR-2000-54) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Rate Windows for Efficient Network and I/O Throttling. Kyung D. Ryu. Jeffrey K. Hollingsworth. Peter J. Keleher. July 2000.
This paper proposes and evaluates a new mechanism for I/O and network rate policing. The goal of the proposed system is to provide an simple, yet effective way to enforce resource limits on target classes of jobs in a system. The basic approach is useful for several types of systems including running background jobs on idle workstations, and providing resource limits on network intensive applications such as virtual web server hosting. Our approach is quite simple, we use a sliding window average of recent events to compute the average rate for a target resource. The assigned limit is enforced by forcing application processes to sleep when they issue requests that would bring their resource utilization out of the allowable profile. Our experimental results that show that we are able to provide the target resource limitations within a few percent, and do so with no measurable slowdown of the overall system. (Also cross-referenced as UMIACS-TR-2000-53) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Image Restoration through Subimages and Confidence Images. James G. Nagy. Dianne P. O'Leary. July 2000.
Some very effective but expensive image reconstruction algorithms cannot be applied to large images because of their cost. In this work, we first show how to apply such algorithms to subimages, giving improved reconstruction of regions of interest. Our second contribution is to construct confidence intervals for pixel values, by generalizing a theorem of O'Leary and Rust to allow both upper and lower bounds on variables. All current algorithms for image deblurring or deconvolution output an image. This provides an estimated value for each pixel in the image. What is lacking is an estimate of the statistical confidence that we can have in those pixel values or in the features they form in the image. There are two obstacles in determining confidence intervals for pixel values: first, the process is computationally quite intensive, and second, there has been no proposal for providing the results in a visually useful way. In this work we overcome the first of those limitations and use a recently developed algorithm called {\sf Twinkle} to overcome the second. We demonstrate the usefulness of these techniques on astronomical and motion-blurred images. (Also cross-referenced as UMIACS-TR-2000-52) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Displaying Confidence Images. James G. Nagy. Dianne P. O'Leary. July 2000.
Algorithms for computing images result in an estimate of an image. The image may result from deblurring a measured image, from deconvolving a set of measurements, or from computing an image by modeling physical processes such as the weather. These computations provide an estimated value for each pixel in the image. What is lacking, however, is an estimate of the statistical confidence that we can have in those pixel values or in the features they form. In this work we discuss novel ways to display confidence information, using an algorithm called {\sf Twinkle}, in order to give the viewer valuable visual insight into uncertainties. The technique is useful whether the confidence information is in the form of a confidence interval or a distribution of possible values. We demonstrate how to display confidence information in a variety of applications: weather forecasts, intensity of a star, and rating a potential tumor in a diagnostic image. (Also cross-referenced as UMIACS-TR-2000-51) Universty of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
AN ANALYSIS OF SMOOTHING EFFECTS OF UPWINDING STRATEGIES FOR THE. HOWARD C. ELMAN. ALISON RAMAGE. June 2000.
Using a technique for constructing analytic expressions for discrete solutions to the convection-diffusion equation, we examine and characterise the effects of upwinding strategies on solution quality. In particular, for grid-aligned flow and discretisation based on bilinear finite elements with streamline upwinding, we show precisely how the amount of upwinding included in the discrete operator affects solution oscillations and accuracy when boundary layers are present. In addition, we show that the same analytic techniques provide insight into other discretisations, such as a finite difference method that incorporates streamline diffusion, and the isotropic artificial diffusion method. (Also cross-referenced as UMIACS-TR-2000-50) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
SHOP and M-SHOP: Planning with Ordered Task Decomposition. Dana Nau. Yue Cao. Amnon Lotem. Hector Munoz-Avila. June 2000.
SHOP (Simple Hierarchical Ordered Planner) and M-SHOP (Multi-task-list SHOP) are planning algorithms with the following characteristics. * SHOP and M-SHOP plan for tasks in the same order that they will later be executed. This avoids some task-interaction issues that arise in other HTN planners, making the planning algorithms relatively simple. This also makes it easy to prove soundness and completeness results. * Since SHOP and M-SHOP know the complete world-state at each step of the planning process, they can use highly expressive domain representations. For example, they can do planning problems that require Horn-clause inferencing, complex numeric computations, and calls to external programs. * In our tests, SHOP and M-SHOP were several orders of magnitude faster than Blackbox, IPP, and UMCP, and were several times as fast as TLplan. * The approach is powerful enough to be used in complex real-world planning problems. For example, we are using a Java implementation of SHOP as part of the HICAP plan-authoring system for Noncombatant Evacuation Operations (NEOs). In this paper, we describe SHOP and M-SHOP, present soundness and completeness results for them, and compare them experimentally to Blackbox, IPP, TLplan, and UMCP. The results suggest that planners that generate totally ordered plans starting from the initial state can "scale up" to complex planning problems better than planners that use partially ordered plans. Department of Computer Science, University of Maryland,
NTCIR CLIR Experiments at the University of Maryland. Douglas W. Oard. Jianqiang Wang. June 2000.
This paper presents results for the Japanese/English cross-language informaiton retrieval task on teh NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description only queries and that use of the first translation in the edict dictionary is comparable with the use of every translation. Japanese term segmentation posed no unusual problems, which contrasts sharply with results previously obtained for corss-language retrieval between Chinese and English. (Also cross-referenced as UMIACS-TR-2000-47, LAMP-TR-054) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
TREC-8 Experiments at Maryland: CLIR, QA and Routing. Douglas W. Oard. Jianqiang Wang. Dekang Lin. Ian Soboroff. June 2000.
The University of Maryland team participated in four aspects of TREC-8: the ad hoc retrieval task, the main task in the cross-language retrieval (CLIR) track, the question answering track, and the routing task in the filtering track. The CLIR method was based on Pirkola's method for Dictionary-based Query Translation, using freely available dictionaries. Broad-coverage parsing and rule-based matching was used for question answering. Routing was performed using Latent Semantic Indexing in profile space. (Also cross-referenced as UMIACS-TR-2000-46, LAMP-TR-053) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Structured Translation for Cross-Language Information Retrieval. Ruth Sperer. Douglas W. Oard. June 2000.
The paper introduces a query translation model that reflects the structure of the cross-language information retrieval task. The model is based on a structured bilingual dictionary in which the translations of each term are clustered into groups with distinct meanings. Query translation is modeled as a two-stage process, with the system first determining the intended meaning of a query term and then selecting translations appropriate to that meaning that might appear in the document collection. An implementation of structured translation based on automatic dictionary clustering is described and evaluated by using Chinese queries to retrieve English documents. Structured translation achieved an average precision that was statistically indistinguishable from Pirkola's technique for very short queries, but Pirkola's technique outperformed structured translation on long queries. The paper concludes with some observations on future work to improve retrieval effectiveness and on other potential uses of structured translation in interactive cross-language retrieval applications. (Also cross-referenced as UMIACS-TR-2000-45, LAMP-TR-052) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Mining the Web for Bilingual Text. P. Resnik. June 2000.
STRAND (Resnik, 1998) is a language-independent system for automatic discovery of text in parallel translation on the World Wide Web. This paper extends the preliminary STRAND results by adding automatic language identification, scaling up by orders of magnitude, and formally evaluating performance. The most recent end-product is an automatically acquired parallel corpus comprising 2491 English-French document pairs, approximately 1.5 million words per language. (Also cross-referenced as UMIACS-TR-2000-44) (Also cross-referenced as LAMP-TR-051) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Evaluating Lexicon Coverage for Cross-Language Information Retrieval. G. Levow. D.W. Oard. June 2000.
No abstract available (Also cross-referenced as UMIACS-TR-2000-43) (Also cross-referenced as LAMP-TR-050) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Signal Boosting for Translingual Topic Tracking: Document Expansion and. G. Levow and D.W. Oard. June 2000.
No abstract available (Also cross-referenced as UMIACS-TR-2000-42) (Also cross-referenced as LAMP-TR-049) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A Statistical Word-Level Translation Model for Comparable Corpora. Mona Diab. Steve Finch. June 2000.
In this paper, we present a model of statistical word-level mapping for comparable corpora. The approach is based on the assumption that if two terms have close distributional profiles, their corresponding translations' distributional profiles should be close in a comparable corpus. The proposed model is described. A preliminary investigation on intralanguage comparable corpora is laid out. The preliminary results are >92% accurate, suggesting the feasibility of the model. The model needs to undergo some improvements and should be tested cross linguistically before assessing its significance. (Also cross-referenced as UMIACS-TR-2000-41, LAMP-TR-048) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Philip Resnik. Mona Diab. June 2000.
The way we model semantic similarity is closely tied to our understanding of linguistic representations. We present several models of semantic similarity, based on differing representational assumptions, and investigate their properties via comparison with human ratings of verb similarity. The results offer insight into the bases for human similarity judgments and provide a testbed for further investigation of the interactions among syn tactic properties, semantic structure, and semantic con tent. (Also cross-referenced as UMIACS-TR-2000-40, LAMP-TR-047) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
A Preliminary Statistical Investigation into the impact of an N-Gram Analysis Approach based on Word Syntactic Categories toward Text Author Classification. Mona Diab. John Schuster. Peter Bock. June 2000.
Quantitative analysis of literary style has heretofore utilized semantic elements-word counts. This research attempts to identify quantifiable syntactic elements of style that can be used for author identification. The measurement of syntactic elements utilizes a dictionary with one part of speech per word and looks at phrases delimited by punctuation marks. Different size permutations of words - referred to as grams - are counted within each text. Correlations are measured amongst the gram frequencies of eight texts pertaining to four authors, both contemporary and non-contemporary. The correlations are performed across different gram sizes of words. The same treatment is applied to a target text, the Funeral Elegy text. The approach holds for classifying texts temporally consistently across the various gram sizes. Yet a finer grained investigation is required to certify the authorship of the Funeral Elegy text. (Also cross-referenced as UMIACS-TR-2000-39, LAMP-TR-046) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Quantifying and Interpreting the Effect of Intelligent Information. Terry P. Riopka. Mona Diab. Peter Bock. June 2000.
A genetic algorithm is simulated using human beings as "chromosomes" in a preliminary study intended to quantify and interpret the effect of intelligent information exchange on genetic algorithm performance. Two factors are varied: the amount of information supplied to the cohort and the type of data manipulation allowed during the exchange. A human simulated genetic algorithm is run for each combination of factors as well as a machine simulation for comparison. Qualitative analysis of recorded conversations indicate extensive use of memory and development of block biases during genetic algorithm evolution. Informal analysis shows that genetic algorithm simulations using complex data manipulations combined with exact knowledge of string fitnesses seem to out-perform a standard machine implementation for the given optimization fitness function. Interestingly, polar combinations: simple data manipulation/minimum information and complex data manipulation/maximum information simulations seem to out-perform other combinations. (Also cross-referenced as UMIACS-TR-2000-38, LAMP-TR-045) University of Maryland Institute for Advanced Computer Studies, Dept. of Computer Science, Univ. of Maryland,
Chinese-English Semantic Resource Construction. Bonnie J. Dorr. Gina-Anne Levow. Dekang Lin. Scott Thomas. June 2000.
We describe an approach to large-scale construction of a semantic lexicon for Chinese verbs. We leverage off of three existing resources--a classification of English verbs called EVCA (English Verbs Classes and Alterations) [Levin, 1993], a Chinese conceptual database called HowNet [Zhendong, 1988c, Zhendong, 1988b] (http://www.how-net.com), and a large machine-readable dictionary called Optilex. The resulting lexicon is used for determining appropriate word senses in applications such as machine translation and cross-language information retrieval. (Also cross-referenced as UMIACS-TR-2000-27) (Also cross-referenced as LAMP-TR-044) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Construction of Chinese-English Semantic Hierarchy for Information. Gina-Anne Levow. Bonnie Dorr. Dekang Lin. June 2000.
This paper describes an approach to large-scale construction of a semantic hierarchy for Chinese verbs. Leveraging off of an existing Chinese conceptual database called HowNet and a Levin-based English verb classification, we use thematic-role information to create links between Chinese concepts and English classes. The resulting hierarchy is used for multilingual lexicons in an English-Chinese cross-language information retrieval application. We demonstrate a structured syntax interface that exploits this large-scale hierarchy and its linkages to WordNet for English-Chinese cross-language information retrieval. (Also cross-referenced asUMIACS-TR-2000-36) (Also cross-referenced as LAMP-TR-043) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
oxyGen: A Language Independent Linearization Engine. Nizar Habash. May 2000.
This paper describes a language independent linearization engine, oxyGen. This system compiles target language grammars into programs that take feature graphs as inputs and generate word lattices that can be passed along to the statistical extraction module of the generation system Nitrogen. The grammars are written using a flexible and powerful language, oxyL, that has the power of a programming language but focuses on natural language realization. This engine have been used successfully in creating an English linearization program that is currently used as part of a Chinese-English machine translation system. (Also cross-referenced as UMIACS-TR-2000-35) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Hashing Moving Objects. Zhexuan Song. Nick Roussopoulos. June 2000.
In real-life applications, the objects are both spatial and temporal referenced. The objects which continuously change their location are called moving objects. With the development of wireless communication and positioning technology, it becomes necessary to store and index those objects in database. Due to the complexity of the problem, many pure spatial index structures are unable to index large volume of moving objects in database. In this paper, we propose a whole new idea based on hashing technique. Since it is impossible to re-index all the objects after each time period, we store the objects in buckets. When an object moves within a bucket, the database does not make any change. By using this technique, the number of database update is greatly reduced which makes the index procedure feasible. Then, we extend the previous system structure by introducing a filter layer between the position information collectors and the database. Also four different methods based on the new system structure are presented. Performance experiments were performed to evaluate different aspects of our indexing techniques, and the conclusions are included in the paper. Department of Computer Science, University of Maryland
Broadening Access to Large Online Databases by Generalizing Query. E. Tanin. C. Plaisant. B. Shneiderman. May 2000.
Companies, government agencies, and other types of organizations are making their large databases available to the world over the Internet. Current database front-ends do not give users information about the distribution of data. This leads many users to waste time and network resources posing queries that have either zero-hit or mega-hit result sets. Query previews form a novel visual approach for browsing large databases. Query previews supply data distribution information about the database that is being searched and give continuous feedback about the size of the result set for the query as it is being formed. On the other hand, query previews use only a few pre-selected attributes of the database. The distribution information is displayed only on these attributes. Unfortunately, many databases are formed of numerous relations and attributes. This paper introduces a generalization of query previews. We allow users to browse all of the relations and attributes of a database using a hierarchical browser. Any of the attributes can be used to display the distribution information, making query previews applicable to many public online databases. (Also cross-referenced as UMIACS-TR-2000-32) (Also cross-referenced as HCIL-TR-2000-14) University of Maryland Institute for Advamced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
Fisheye Menus. B. B. Bederson. May 2000.
We introduce "fisheye menus" which apply traditional fisheye graphical visualization techniques to linear menus. This provides for an efficient mechanism to select items from long menus, which are becoming more common as menus are used to select data items in, for example, e-commerce applications. Fisheye menus dynamically change the size of menu items to provide a focus area around the mouse pointer. This makes it possible to present the entire menu on a single screen without requiring buttons, scrollbars, or hierarchies. A pilot study with 10 users compared user preference of fisheye menus with traditional pull-down menus that use scrolling arrows, scrollbars, and hierarchies. Users preferred the fisheye menus for browsing tasks, and hierarchical menus for goal-directed tasks. (Also cross-referenced as UMIACS-TR-2000-31) (Also cross-referenced as HCIL-TR-2000-12) University of Maryland Institute for Advamced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
Jazz: An Extensible Zoomable User Interface Graphics ToolKit in Java. B. B. Bederson. J. Meyer. L. Good. May 2000.
In this paper we investigate the use of scene graphs as a general approach for implementing two-dimensional (2D) graphical applications, and in particular Zoomable User Interfaces (ZUIs). Scene graphs are typically found in three-dimensional (3D) graphics packages such as Sun's Java3D and SGI's OpenInventor. They have not been widely adopted by 2D graphical user interface toolkits. To explore the effectiveness of scene graph techniques, we have developed Jazz, a general-purpose 2D scene graph toolkit. Jazz is implemented in Java using Java2D, and runs on all platforms that support Java 2. This paper describes Jazz and the lessons we learned using Jazz for ZUIs. It also discusses how 2D scene graphs can be applied to other application areas. (also cross-referenced as UMIACS-TR-2000-30) (Also cross-referenced as HCIL-TR-2000-13) University of Maryland Institute for Advamced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
User Modeling for Information Access Based on Implicit Feedback. J. Kim. D. W. Oard. K. Romanik. May 2000.
User modeling can be used in information filtering and retrieval systems to improve the representation of a users information needs. User models can be constructed by hand, or learned automatically based on feedback provided by the user about the relevance of documents that they have examined. By observing user behavior, it is possible to infer implicit feedback without requiring explicit relevance judgments. Previous studies based on Internet discussion groups (USENET news) have shown reading time to be a useful source of implicit feedback for predicting a users preferences. The study reported in this paper extends that work by providing framework for considering alternative sources of implicit feedback, examining whether reading time is useful for predicting a users preferences for academic and professional journal articles, and exploring whether retention behavior can usefully augment the information that reading time provides. Two user studies were conducted in which undergraduate students examined articles and abstracts related to the telecommunications and pharmaceutical industries. The results showed that reading time could be used to predict the users assessment of relevance, although reading time for journal articles and technical abstracts are longer than has been reported for USENET news documents. Observation of printing events, a type of retention behavior, was found to provide additional useful evidence about relevance beyond that which could be inferred from reading time. The paper concludes with a brief discussion of the implications of the reported results. (Also cross-referenced as UMIACS-TR-2000-29) (Also cross-referenced as HCIL-TR-2000-11) University of Maryland Institute for Advamced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
Navigational Issues in the Design of On-Line Self-Administered. K. L. Norman. Z. Friedman. K. Norman. R. Stevenson. May 2000.
Answering questions on surveys involves the access of internal cognitive knowledge structures, the retrieval of records from external data-bases, and the navigation of items on the computer interface. In this study a number of alternative designs for on-line questionnaire presentation were investigated. A long heterogeneous survey was partitioned in four ways: whole/form-based, semantic/section-based, screen/page-based, and single item-based. Questionnaires were presented with or without an index which resulted in eight versions. Times for initial completion of the questionnaire were recorded as well as subjective assessments. Neither initial completion times nor subjective assessments differed among the eight versions due to the highly linear navigation of the survey structures. Respondents were also asked to revisit 16 questions based on only the topic of the question or on the topic and the question number and to change their answers. Revision times reflected ease of finding items in the structure of the survey and the use of an index to the sections of the questionnaire. University of Maryland Institute for Advamced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
DataCutter and A Client Interface for the Storage Resource Broker with. Tahsin Kurc. Michael Beynon. Alan Sussman. Joel Saltz. May 2000.
The continuing increase in the capabilities of high performance computers and continued decreases in the cost of secondary and tertiary storage systems is making it increasingly feasible to generate and archive very large (e.g. petabyte and larger) datasets. Applications are also increasingly likely to make use of archived data obtained by different types of sensors. Such sensors include imaging devices deployed on satellites and aircraft, microscopy related imagery and radiology related imagery. Simulation or sensor datasets generated or acquired by one group may need to be accessed over a wide-area network by other groups. Datasets frequently describe data associated with collections of very large structured or unstructured grids where each grid point is associated with several variables. Applications frequently need only to obtain portions of a dataset. Required data may correspond to a particular region in a multidimensional space. The application may need to access all data associated in a multidimensional region or it may need only certain variable values at a subsampled set of spatial locations. In addition, in some cases, applications may require data products obtained by aggregating data in one way or another. For instance, a user might require time or space averaged data. This document describes the design of a middleware infrastructure, called DataCutter, that enables subsetting and user-defined filtering of multi-dimensional datasets stored in archival storage systems across a wide-area network. We also describe a client API for Storage Resource Broker (SRB) clients, which allows SRB clients to carry out subsetting and filtering of datasets stored through the SRB. This API uses a prototype implementation of the DataCutter indexing and filtering services. (Also cross-referenced as UMIACS-TR-2000-26) University of Maryland Institute for Advamced Computer Studies, Department of Computer Science, University of Maryland,
The periodic polytope and its applications to a scheduling problem - A. K. Subramani. A. Agrawala. May 2000.
Parameter variability and the existence of complex constraints between tasks are assured features of real-time scheduling. {\em Periodicity} of task sets is an additional feature that needs to be accomodated. Traditional scheduling models ignore the complexities involved in real-time scheduling by making simplistic assumptions about task interactions. In this paper, we present a model that captures the issues that we deem central to real-time scheduling in periodic task sets and demonstrate the existence of efficient and easily implementable algorithms for addressing schedulability queries in this model. Our model is very general and applicable to diverse areas ranging from real-time process scheduling in operating systems and avionics to manufacturing and traffic control. (Also cross-referenced as UMIACS-TR-2000-25) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland\,
Extending User Understanding of Federal Statistics in Tables. Gary Marchionini. Carol Hert. Liz Liddy. Ben Shneiderman. May 2000.
This paper describes progress toward improving user interfaces for US Federal government statistics that are presented in tables. Based on studies of user behaviors and needs related to statistical tables, we describe interfaces to assist diverse users with a range of statistical literacy to explore, find, understand, and use US Federal government statistics. (HCIL-TR-2000-08) (Also cross-referenced UMIACS-TR-2000-24) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
Direct Annotation: A Drag-and-Drop Strategy for Labeling Photos. B. Shneiderman. H. Kang. April 2000.
Annotating photos is such a time-consuming, tedious and error-prone data entry task that it discourages most owners of personal photo libraries. By allowing users to drag labels such as personal names from a scrolling list and drop them on a photo, we believe we can make the task faster, easier and more appealing. Since the names are entered in a database, searching for all photos of a friend or family member is dramatically simplified. We describe the user interface design and the database schema to support direct annotation, as implemented in our PhotoFinder prototype. (HCIL-2000-06) (Also cross-referenced as UMIACS-TR-2000-23) University of Maryland Institute for Advamced Computer Stdies, Human-Computer Interaction Laboratory, University of Maryland, Department of Computer Science, University of Maryland,
Snap-Together Visualization: A User Interface for Coordinating. C. North. B. Shneiderman. April 2000.
Multiple coordinated visualizations enable users to rapidly explore complex information. However, users often need unforeseen combinations of coordinated visualizations that are appropriate for their data. Snap-Together Visualization enables data users to rapidly and dynamically mix and match visualizations and coordinations to construct custom exploration interfaces without programming. Snap's conceptual model is based on the relational database model. Users load relations into visualizations then coordinate them based on the relational joins between them. Users can create different types of coordinations such as: brushing, drill down, overview and detail view, and synchronized scrolling. Visualization developers can make their independent visualizations snap-able with a simple API. Evaluation of Snap revealed benefits, cognitive issues, and usability concerns. Data savvy users were very capable and thrilled to rapidly construct powerful coordinated visualizations. A snapped overview and detail-view coordination improved user performance by 30-80%, depending on task. (Also cross-referenced as UMIACS-TR-2000-22) University of Maryland Institute for Advanced Computer Studies, Human-Computer Interaction Laboratory, University of Maryland, Department of Computer Science, University of Maryland,
An Arnoldi--Schur Algorithm for Large Eigenproblems. G. W. Stewart. April 2000.
Sorensen's iteratively restarted Arnoldi algorithm is one of the most successful and flexible methods for finding a few eigenpairs of a large matrix. However, the need to preserve structure of the Arnoldi decomposition, on which the algorithm is based, restricts the range of transformations that can be performed on it. In consequence, it is difficult to deflate converged Ritz vectors from the decomposition. Moreover, the potential forward instability of the implicit QR algorithm can cause unwanted Ritz vectors to persist in the computation. In this paper we introduce a generalized Arnoldi decomposition that solves both problems in a natural and efficient manner. (Also cross-referenced as UMIACS-TR-2000-21) University of Maryland Institute for Advanced Computer Studies), Department of Computer Science, University of Maryland,
Buffer Merging --- A Powerful Technique for Reducing Memory. P. K. Murthy. S. S. Bhattacharyya. April 2000.
In this paper, we develop a new technique called buffer merging for reducing memory requirements of synchronous dataflow (SDF) specifications. SDF has proven to be an attractive model for specifying DSP systems, and is used in many commercial tools like DSPCanvas, SPW, and COSSAP. Good synthesis from an SDF specification depends crucially on scheduling, and memory is an important metric for generating efficient schedules. Previous techniques on memory minimization have either not considered buffer sharing at all, or have done so at a fairly coarse level (the meaning of this will be made more precise in the paper). In this paper, we develop a buffer overlaying strategy that works at the level of an input/output edge pair of an actor. It works by algebraically encapsulating the lifetimes of the tokens on the input/output edge pair, and determines the maximum amount of the input buffer space that can be reused by the output. We develop the mathematical basis for performing merging operations, and develop several algorithms and heuristics for using the merging technique for generating efficient implementations. We show improvements of up to 54% over previous techniques. (Also cross-referenced as UMIACS-TR-2000-20) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Support for Speculative Update Propagation and Mobility in Deno. Ugur Cetintemel. Peter J. Keleher. Michael Franklin. November 1999.
This paper presents the transactional framework of Deno, an object replication system specifically designed for use in mobile and weakly-connected environments. Deno uses weighted voting for availability and pair-wise, epidemic information flow for flexibility. This combination allows the protocols to operate with less than full connectivity, to easily adapt to changes in group member-ship, and to make few assumptions about the underlying network topology. These features are all crucial to providing effective support for mobile and weakly-connected platforms. Deno has been implemented and runs on top of Linux and Windows NT/CE platforms. We use the Deno prototype to characterize the performance of two versions of Deno's protocol. The first ver-sion enables globally serializable execution of update transactions. The second supports a weaker consistency level that still guarantees transactionally consistent access to replicated data. The re-sults show that our protocols either outperform or perform comparably to existing approaches, while achieving higher availability. Further, we show that the incremental cost of providing global serializability in this environment is low. Finally, we show that commit delays can be sig-nificantly decreased by allowing votes to be cast, and votes and updates to be disseminated, speculatively. (Also cross-referenced as UMIACS-TR-99-70) UNiversity of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Large-Scale Construction of a Chinese-English Semantic Hierarchy. Bonnie J. Dorr. Gina-Anne Levow. Dekang Lin. June 2000.
This paper addresses the problem of building conceptual resources for multilingual applications. We describe new techniques for large-scale construction of a semantic hierarchy for Chinese verbs, using thematic-role information to create links between Chinese concepts and English classes. We then present an approach to compensating for gaps in the existing resources. The resulting hierarchy is used for a multilingual lexicon for Chinese-English machine translation and cross-language information retrieval applications. (Also cross-referenced as UMIACS-TR-2000-17) (Also cross-referemced as LAMP-TR-040) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
The Parametric Polytope and its applications to a Scheduling Problem. K. Subrmani. A. Agrawala. March 2000.
An important feature in Real-time systems is {\em parameter impreciseness} i.e. the inability to accurately determine certain parameter values. The most common such parameter is {\em task execution time}. A second feature is the presence of complex relationships between tasks that constrain their execution. Traditional models do not accomodate either feature completely: (a) Variable execution times are modeled through a fixed value ( {\em worst-case} ), and (b) Relationships are limited to those that can be represented by precedence graphs. We present a task model that effectively captures {\em variable task execution time}, while simultaneously permitting arbitrary linear relationships between tasks. Our model finds applications in diverse areas such as real-time task scheduling, compiler scheduling, real-time database scheduling and machine control. This paper focuses primarily on the computational complexity of answering queries posed in our model; in particular we demonstrate the existence of constraint classes that make the scheduling problem {\em hard.} (Also cross-referenced as UMIACS-TR-2000-16) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A Characterisation of Oscillations in the Discrete Two-Dimensional. Howard C. Elman. Alison Ramage. March 2000.
It is well known that discrete solutions to the convection-diffusion equation contain nonphysical oscillations when boundary layers are present but not resolved by the discretisation. However, except for one-dimensional problems, there is little analysis of this phenomenon. In this paper, we present an analysis of the two-dimensional problem with constant flow aligned with the grid, based on a Fourier decomposition of the discrete solution. For Galerkin bilinear finite element discretisations, we derive closed form expressions for the Fourier coefficients, showing them to be weighted sums of certain functions which are oscillatory when the mesh P\'{e}clet number is large. The oscillatory functions are determined as solutions to a set of three-term recurrences, are then used to characterise the oscillations of the discrete solution in terms of the mesh P\'{e}clet number and boundary conditions of the problem. (Also cross-referenced UMIACS-TR-2000-15) University of Maryland Institute for Advanced Computer Studies, Department of Computer Svience, University of Maryland,
The Static Polytope and its applications to a scheduling problem. K. Subramani. A. Agrawala. March 2000.
In the design of real-time systems, it is often the case that certain process parameters ( such as {\em execution time} ) are not known precisely. The challenge in real-time system design is to develop techniques that efficiently meet the requirements of impreciseness. Traditional models tend to simplify the issue of impreciseness by assuming {\em worst-case} times. This assumption is unrealistic and at the same time, may cause certain constraints to be violated at run-time. In this paper, we shall study the problem of scheduling a set of ordered, non-preemptive processes under non-constant execution times. Typical applications for variable execution time scheduling include process scheduling in Real-time Operating Systems such as Maruti, compiler scheduling, database transaction scheduling and automated machine control. An important feature of application areas such as robotics is the interaction between execution times of various processes. We explicitly model this interaction through the representation of execution time vectors as points in convex sets. We present both sequential and parallel algorithms for determining the existence of a static schedule. (Also cross-referenced as UMIACS-TR-2000-14) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A Dual intepretation of Standard Constraints in Parametric Scheduling. K. Subramani. A. Agrawala. March 2000.
The problem of parametric scheduling in hard real-time systems, ( in the presence of linear relative constraints between the start and execution times of tasks ) was posed in the litreature. In an earlier paper, a polynomial time algorithm is presented for the case when the constraints are restricted to be standard ( defined in paper ) and the execution time vectors belong to an axis-parallel hyper-rectangle. In this paper, we extend their results in two directions. We first present a polynomial time algorithm for the case when the execution time vectors belong to arbitrary convex domains. We then show that the set of standard constraints can be extended to include arbitrary network constraints. Our insights into the problem occur primarily as a result of studying the dual polytope of the constraint system. (Also cross-refernced as UMIACS-TR-2000-11) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Bolshoi - A Modeling Spreadsheet (Improving Usability of Complex. William C. Cheng. Leana Golubchik. March 2000.
Spreadsheet programs are very popular financial modeling tools because they allow users to juggle numbers and formulas with a powerful yet intuitive and easy to understand user interface; also, they often are equipped with sophisticated numerical analysis packages for data analysis and powerful presentation utilities for visualizing results. Computer systems performance and reliability modeling tools of today, on the other hand, have un-intuitive user interfaces and are difficult to learn and use. In this work, we propose to design, build, and evaluate Bolshoi, a modeling spreadsheet, with the goal of putting modeling tools comfortably in the hands of non-expert users. In this proposal, we address management of complexity that exists in performance and reliability analysis of real computer and communication systems. Specifically, we propose to do so through the design and development of an advanced modeling tool. Our tool will provide two important functions: (1) a proper interface for building models that will allow system designers not just to define their models, but visualize them in various ways and (2) easy plug-in of existing and future advanced solution techniques. We call this tool Bolshoi, a Modeling Spreadsheet, because it has a spreadsheet-type interface as detailed below. Performance evaluation of real systems is complex, suffers from scalability problems (or the so-called ``state explosion'' problem) and in many cases requires advanced computational techniques. Often, advanced computational techniques are based on exploitation of ``special structure'' in the models (the primary way to deal with state explosion besides getting a bigger machine). With large and complex models, these special structures are very expensive to expose automatically as it involves searching through a combinatorial number of permutations. Proper visualization of models can greatly assist in the discovery of these special structures so that state space reduction techniques can be applied. Discovery of special structure regularly contributes to many orders of magnitude in computational efficiency. Furthermore, models are often defined over infinite state spaces. We believe that a spreadsheet paradigm is ideal for visualizing such models. Without proper modeling tools, much effort and money is wasted by the computer industry, and moreover, the probability of a successful outcome is low. Thus, a good tool is crucial to advances in the state of the art in performance modeling as well as to successful design of systems in the industry. Every system designer should be able to integrate the use of a performance modeling tool into his/her design process. He/she should be able to easily ask ``what-if'' type questions, explore possible design choices, and make decisions based on quantitative results rather than ``gut feeling''. We believe that a modeling spreadsheet is the right abstraction for such tasks, and furthermore, to the best of our knowledge this abstraction has not been exploited for performance evaluation tool purposes. We believe that the approach proposed here will have a significant impact on future performance tool designs as well as make significant strides in wide-spread use of performance evaluation techniques among computer and communication system designers. Furthermore, a modeling tool that does not require expert-level methodology knowledge is also an excellent undergraduate-level and graduate-level educational tool. Opportunities for hands-on experience with modeling and performance evaluation as well as the ability to add new techniques to the tool greatly improve the educational experience of students and their future ability to apply what they have learned in class to design of real computer and communication systems. (Also cross-referenced as UMIACS-TR-2000-10) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Contention-conscious transaction ordering in embedded multiprocessors. Mukul Khandelia. Shuvra S. Bhattacharyya. March 2000.
This paper explores the problem of efficiently ordering interprocessor communication operations in statically-scheduled multiprocessors for iterative dataflow graphs. In most digital signal processing applications, the throughput of the system is significantly affected by communication costs. By explicitly modeling these costs within an effective graph-theoretic analysis framework, we show that ordered transaction schedules can significantly outperform self-timed schedules even when synchronization costs are low. However, we also show that when communication latencies are non-negligible, finding an optimal transaction order given a static schedule is an NP-complete problem, and that this intractability holds both under iterative and non-iterative execution. We develop new heuristics for finding efficient transaction orders, and perform an experimental comparison to gauge the performance of these heuristics. (Also cross-referenced as UMIACS-TR-2000-09) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Information Dynamics: An Information-Centric Approach to System Design". Ashok K. Agrawala. Ronald L. Larsen. Douglas Szajda. January 2000.
Acquisition, distribution, management, and analysis of information are the fundamental purposes behind most complex constructed systems and infrastructures, and yet a process centric approach is fundamental to the design and implementation of such systems. Since information is the essential commodity in these endeavors, we believe that an effective design should take into account the fundamental properties of information: it's characteristics, its fusion, its distillation, etc. Information Dynamics is an attempt to bring a degree of rigor to the understanding of the nature of information itself and how it is used in pursuit of system objectives. (Also cross-referenced as UMIACS-2000-08) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Designing StoryRooms: Interactive Storytelling Spaces for Children. Houman Alborzi. Allison Druin. Jaime Montemayor. Lisa Sherman. Gustav Taxn. Jack Best. Joe Hammer. Alex Kruskal. Abby Lal. Thomas Plaisant Schwenn. Lauren Sumida. Rebecca Wagner. Jim Hendler. February 2000.
Limited access to space, costly props, and complicated authoring technologies are among the many reasons why children can rarely enjoy the experience of authoring room-sized interactive stories. Typically in these kinds of environments, children are restricted to being story participants, rather than story authors. Therefore, we have begun the development of "StoryRooms," room-sized immersive storytelling experiences for children. With the use of low-tech and high-tech storytelling elements, children can author physical storytelling experiences to share with other children. In the paper that follows, we will describe our design philosophy, design process with children, the current technology implementation and example StoryRooms. (Also cross-referenced as UMIACS-TR-2000-06) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory,
MOCHA: A Self-Extensible Database Middleware System for Distributed. Manuel Rodriguez-Martinez. Nick Roussopoulos. January 2000.
This paper describes MOCHA, a new self-extensible database middleware system designed to interconnect data sources distributed over a computer network. MOCHA is designed to scale to large environments and is based on the idea that some of the user-defined functionality in the system should be deployed by the middleware itself. This is realized by shipping Java code implementing either advanced data types or tailored query operators to remote data sources and have it executed remotely. Optimized query plans push the evaluation of powerful data-reducing operators to the data source sites while executing data-inflating operators near the client's site. The Volume Reduction Factor is a new and more explicit metric introduced in this paper to select the best site to execute query operators and is shown to be more accurate than the standard selectivity factor alone. MOCHA has been implemented in Java and runs on top of Informix and Oracle. We present the architecture of MOCHA, the ideas behind it, and a performance study using data and queries from the Sequoia 2000 Benchmark. The results of this study demonstrate that MOCHA not only provides a flexible and scalable framework for distributed query processing but also substantially improves query performance in contrast to existing middleware solutions. (Also cross-referenced as UMIACS-TR-2000-05) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Design of a Framework for Data-Intensive Wide-Area Applications. Michael D. Beynon. Tahsin Kurc. Alan Sussman. Joel Saltz. February 2000.
Applications that use collections of very large, distributed datasets have become an increasingly important part of science and engineering. With high performance wide-area networks becoming more pervasive, there is interest in making collective use of distributed computational and data resources. Recent work has converged to the notion of the Grid, which attempts to uniformly present a heterogeneous collection of distributed resources. Current Grid research covers many areas from low level infrastructure issues to high level application concerns. However, providing support for efficient exploration and processing of very large scientific datasets stored in distributed archival storage systems remains a challenging research issue. We have initiated an effort that focuses on developing efficient data-intensive applications in a Grid environment. In this paper, we present a framework, called filter-stream programming, that represents the processing units of a data-intensive application as a set of filters, which are designed to be efficient in their use of memory and scratch space. We describe a prototype infrastructure that supports execution of applications using the proposed framework. We present the implementation of two applications using the filter-stream programming framework, and discuss experimental results demonstrating the effects of heterogeneous resources on application performance. (Also cross-referenced as UMIACS-TR-2000-04) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A Clustering Scheme for Hierarchical Routing in Wireless Networks. Suman Banerjee. Samir Khuller. February 2000.
In this paper we present a clustering scheme to create hierarchies for wireless networks. A cluster is defined as a subset of vertices, whose induced graph is connected. In addition, a cluster is required to obey certain constraints that are useful for hierarchical routing. While all these constraints cannot be met simultaneously for general graphs, we show how for wireless network topologies, such a clustering can be obtained. We also present simulation results from a distributed implementation of this scheme to demonstrate its convergence and stability properties. Department of Computer Science, University of Maryland,
Optimizing Retrieval and Processing of Multi-dimensional Scientific. Chialin Chang. Tahsin Kurc. Alan Sussman. Joel Saltz. February 2000.
Exploring and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. We have been developing the Active Data Repository (ADR), an infrastructure that integrates storage, retrieval, and processing of large multi-dimensional scientific datasets on distributed memory parallel machines with multiple disks attached to each node. In earlier work, we proposed three strategies for processing range queries within the ADR framework. Our experimental results show that the relative performance of the strategies changes under varying application characteristics and machine configurations. In this work we investigate approaches to guide and automate the selection of the best strategy for a given application and machine configuration. We describe analytical models to predict the relative performance of the strategies when input data elements are uniformly distributed in the attribute space of the output dataset, restricting the output dataset to be a regular $d$-dimensional array. We present an experimental evaluation of these models for various synthetic datasets and for several driving applications on a 128-node IBM SP. (Also cross-referenced as UMIACS-TR-2000-03) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
IMPACTing SHOP: Foundations for integrating HTN Planning and. Hector Munoz-Avila. Juergen Dix. Dana S. Nau. Yue Cao. February 2000.
In this paper we describe a formalism for integrating the SHOP HTN planning system with the IMPACT multi-agent environment. Our formalism provides an agentized adaptation of the SHOP planning algorithm that takes advantage of IMPACT's capabilities for interacting with external agents, performing mixed symbolic/numeric computations, and making queries to distributed, heterogeneous information sources (such as arbitrary legacy and/or specialized data structures or external databases). We show that this agentized version of SHOP will preserve soundness and completeness if certain conditions are met. (This technical report is the updated version of CS-TR-4085) (Also cross-referenced as UMIACS-TR-2000-02) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
On the Eigensystems of Graded Matrices. G. W. Stewart. January 2000.
Informally a graded matrix is one whose elements show a systematic decrease or increase as one passes across the matrix. It is well known that graded matrices often have small eigenvalues that are determined to high relative accuracy. Similarly, the eigenvectors can have small components that are nonetheless well determined. In this paper, we give approximations to the eigenvalues and eigenvectors of a graded matrix in terms of a base matrix that show how these phenomena come about. This approach provides condition numbers for eigenvalues and individual components of the eigenvectors. The results are applied to derive related results for the singular value decomposition. (Also cross-referenced as UMAICS-TR-2000-01) University of Maryland Institute for Advanced Computer Studies, Department of Computer Sciece, University of Maryland,
A Generalization of Saad's Theorem on Rayleigh-Ritz. G. W. Stewart. December 1999.
Let $(\lambda,x)$ be an eigenpair of the Hermitian matrix $A$ of order $n$ and let $(\mu,u)$ be a Ritz pair from a subspace $\clk$ of $\comp^{2}$. Saad has given a simple inequality bounding $\sin\angle(x,u)$ in terms of $\sin\angle(x,\clk)$. In this note we show that this inequality can be extended to an equally simple inequality for eigenspaces of non-Hermitian matrices. (Also cross-referenced as UMIACS-TR-99-78) University of Maryland, Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Probabilistic Object Bases. Thomas Eiter. James Lu. Thomas Lukasiewicz. V.S. Subrahmanian. November 1999.
There are many applications where an object oriented data model is a good way of representing and querying data. However, current object database systems are unable to handle the case of objects whose attributes are uncertain. In this paper, extending previous pioneering work by Kornatzky and Shimony, we develop an extension of the relational algebra to the case of object bases with uncertainty. We propose concepts of consistency for such object bases, together with an NP-completeness result, and classes of probabilistic object bases for which consistency is polynomially checkable. In addition, as certain operations involve conjunctions and disjunctions of events, and as the probability of conjunctive and disjunctive events depends both on the probabilities of the primitive events involved as well as on what is known (if anything) about the relationship between the events, we show how all our algebraic operations may be performed under arbitrary probabilistic conjunction and disjunction strategies. We also develop a host of equivalence results in our algebra, which may be used as rewrite rules for query optimization. Last but not least, we have developed a prototype probabilistic object base server using the VisiBroker ORB on top of ObjectStore. We describe experiments to assess the efficiency of different possible rewrite rules. (Also cross-referenced as UMIACS-TR-99-77) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Designing Storytelling Technologies to Encourage Collabortion Between. Steve Benford. Benjamin B. Bederson. Karl-Petter Åkesson. Victor Bayon. Allison Druin. Pär Hansson. Juan Pablo Hourcade. Rob Ingram. Helen Neale. Claire O’Malle. Kristian T. Simsarian. Danaë Stanton. Yngve Sundblad. Gustav Taxén. November 1999.
We describe the iterative design of two collaborative storytelling technologies for young children, KidPad and the Klump. We focus on the idea of designing interfaces to subtly encourage collaboration so that children are invited to discover the added benefits of working together. This idea has been motivated by our experiences of using early versions of our technologies in schools in Sweden and the UK. We compare the approach of encouraging collaboration with other approaches to synchronizing shared interfaces. We describe how we have revised the technologies to encourage collaboration and to reflect design suggestions made by the children themselves. (Also cross-referenced as UMIACS-TR-99-76) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Single Display Groupware. Benjamin B. Bederson. Jason Stewart. Allison Druin. November 1999.
We discuss a model for supporting collaborative work between people that are physically close to each other. We call this model Single Display Groupware (SDG). In this paper, we describe the model, comparing it to more traditional remote collaboration. We describe the requirements that SDG places on computer technology, and our understanding of the benefits and costs of SDG systems. Finally, we describe a prototype SDG system that we built and the results of a usability test we ran with 60 elementary school children. Through participant observation, video analysis, program instrumentation, and an informal survey, we discovered that the SDG approach to collaboration has strong potential. Children overwhelmingly prefer two mice to one mouse when collaborating with other children. We identified several collaborative styles including a dominant partner, independent simultaneous use, a mentor/mentee relationship, and active collaboration. (Also cross-referenced as UMIACS-TR-99-75) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
IMPACTing SHOP: Foundations for integrating HTN Planning and. Hector Munoz-Avila. Juergen Dix. Dana S. Nau. Yue Cao. November 1999.
AI planning systems typically require that the state of the world be locally accessible. We call this the centralized state requirement. Furthermore, the state is described in a special representation language, mostly related to first-order logic. We refer to this as the uniform representation requirement. Relevant data from other sources must therefore be translated by hand into this language, stored in main memory and cannot be accessed automatically or as needed. These requirements, however, do not hold in many real-world domains. Information about the state may be distributed in several locations, each of which may have its own representation language. We address this problem by using a recently developed architecture for a Multi-Agent System, IMPACT, and its code-call mechanism. Within IMPACT queries and requests to arbitrary legacy and/or specialized data structures or external databases may be executed. We show in this paper how to combine the basic algorithm of a very efficient HTN planner, SHOP, with the code-call mechanism of IMPACT. This opens the way for SHOP to access real-world data and to base the planning process on external databases. We show that SHOP is sound and complete w.r.t. this extended data access. This technical report has been updated and revised and is available full-text/online as CS-TR-4100. (Also cross-referenced as UMIACS-TR-99-74) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Parameterized Modeling and Scheduling of Dataflow Graphs. Bishnupriya Bhattacharya. Shuvra S. Bhattacharyya. November 1999.
Dataflow has proven to be an attractive computational model for programming DSP applications. A restricted version of dataflow, called Synchronous Dataflow (SDF) is particularly well-suited for modeling a large class of signal processing applications, as it offers strong formal properties and compile-time predictability. However, the SDF model does not allow data-dependent flow of control or dynamically varying communication patterns between functional modules. This results in limited expressive power. Consequently, a variety of extensions to SDF have been developed, where the objective is to provide increased expressive power, while maintaining a significant part of the compile-time predictability of SDF. In this report, we propose a parameterized dataflow framework that can be applied as a meta-modeling technique to an arbitrary dataflow model that satisfies certain requirements, to further increase its expressive power. For clarity, we focus on synchronous dataflow, and develop the precise semantics of parameterized synchronous dataflow (PSDF). We propose a formal framework for the PSDF model, and introduce the concept of local synchrony, which is a condition that must be satisfied for consistent execution of PSDF specifications. From our experience, it appears that the PSDF model significantly increases the expressive power of pure SDF, while maintaining many of the desirable properties of SDF, like low-overhead scheduling (geared towards software synthesis in embedded systems). We develop techniques for implementing the operational semantics of PSDF that allows efficient quasi-static scheduling of a class of PSDF specifications. University of Maryland Institute for Advanced Computer Studies, Department of Electrical Engineering, University of Maryland, Department of Coomputer Science, University of Maryland,
A Convex Optimization Approach for Addressing Storage-Communication. Radha Poovendran. November 1999.
In Eurocrypt'99, Canetti, Malkin, and Nissim [1], presented a new tree based key distribution algorithm that required sublinear storage of keys while preserving logarithmic update communication as functions of the group size. The results in are known to be the first results presenting the sub-linear storage among the family of tree based key distribution schemes. The question of whether this storage was the possible optimal value while keeping the communication as logarithmic was posed as a problem. We show that the storage-communication tradeoff can be formulated as a convex optimization problem in terms of the size of the minimal storage parameter defined in. In particular, we show that the optimal solution is parameterizable by the ratio of the communication and storage costs, the degree of the tree, and the group size. Using this design triplet, we show that not only the results in [1] but also the results of the basic scheme of Wallner, Harder, and Agee [2] can be derived as specific Pareto optimal points for specific choice of the triplet. We also present an exact design procedure for feasibility testing and constructing optimal key distribution tree of the type in. We also show that if the communication and the storage are equally weighted, then the optimal value for storage and communication grows as square root of group size , a value noted in [1]. Department of Computer Science, University of Maryland,
Scheduling Jobs Before Shut Down. Vincenzo Liberatore. December 1999.
Distributed systems execute background or alternative jobs while waiting for data or requests to arrive from another processor. In those cases, the following shut-down scheduling problem arises: given a set of jobs of known processing time, schedule them on m machines so as to maximize the total weight of jobs completed before an initially unknown deadline. We will present optimally competitive deterministic and randomized algorithms for shut-down scheduling. Our deterministic algorithm is parameterized by the number of machines m. Its competitive ratio increases as the number of machines decreases, but it is optimal for any given choice of m. Such family of deterministic algorithm can be translated into a family of randomized algorithms that use progressively less randomization and that are optimal for the given amount of randomization. Hence, we establish a precise trade-off between amount of randomization and competitive ratios. Distributed systems execute background or alternative jobs while waiting for data or requests to arrive from another processor. In those cases, the following shut-down scheduling problem arises: given a set of jobs of known processing time, schedule them on m machines so as to maximize the total weight of jobs completed before an initially unknown deadline. We will present optimally competitive deterministic and randomized algorithms for shut-down scheduling. Our deterministic algorithm is parameterized by the number of machines m. Its competitive ratio increases as the number of machines decreases, but it is optimal for any given choice of m. Such family of deterministic algorithm can be translated into a family of randomized algorithms that use progressively less randomization and that are optimal for the given amount of randomization. Hence, we establish a precise trade-off between amount of randomization and competitive ratios. (Also cross-referenced as UMIACS-TR-99-72) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
SHOE: A Knowledge Representation Language for Internet Applications. Jeff Heflin. James Hendler. Sean Luke. October 1999.
It is our contention that the World Wide Web poses challenges to knowledge representation systems that fundamentally change the way we should design KR languages. In this paper, we describe the Simple HTML Ontology Extensions (SHOE), a KR language which allows web pages to be annotated with semantics. We present a formalism for the language and discuss the features which make it well suited for the Web. We describe the syntax and semantics of this language, and discuss the differences from traditional KR systems that make it more suited to modern web applications. We also describe some generic tools for using the language and demonstrate its capabilities by describing two prototype systems that use it. We also discuss some future tools currently being developed for the language. The language, tools, and details of the applications are all available on the World Wide Web at http://www.cs.umd.edu/projects/plus/SHOE. (Also cross-referenced as UMIACS-TR-99-71) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A Security Infrastructure for Mobile Transactional Systems. Peter J. Keleher. Bobby Bhattacharjee. Kuo-Tung Kuo. Ugur Cetintemel. April 2000.
In this paper, we present an infrastructure for providing secure transactional support for mobile databases. Our infrastructure protects against external threats - malicious actions by nodes not authorized to access the data. The major contribution of this paper, however, is to classify and present algorithms to protect against internal security threats. Internal threats are malicious ac-tions by authenticated nodes that misrepresent protocol specific information. We quantify the cost of our security mechanisms in context of Deno: a system that supports object replication in a transactional framework for mobile and weakly-connected environments. Our results show that protecting against internal threats comes at a cost, but the marginal cost for protecting against larger cliques of malicious insiders is low. However, even with all the security mechanisms in place, our system commits updates over 50% faster than systems that depend on the Read-once Write-all commit protocol. Lastly, we present results from a probabilistic version of our algorithm that has several orders of magnitude lower computation cost than the traditional public-key based schemes. (Also cross-referenced as UMIACS-TR-2000-19) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
ViPEr-HiSS: A Case for Storage Design Tools. Leana Golubchik. Joseph Dunnick. Jeffrey K. Hollingsworth. October 1999.
The viability of large-scale multimedia applications, depends on the performance of storage systems. Providing cost-effective access to vast amounts of video, image, audio, and text data, requires (a) proper configuration of storage hierarchies as well as (b) efficient resource management techniques at all levels of the storage hierarchy. The resulting complexities of the hardware/software co-design in turn contribute to difficulties in making accurate predictions about performance, scalability, and cost-effectiveness of a storage system. Moreover, poor decisions at design time can be costly and problematic to correct in later stages of development. Hence, measurement of systems after they have been developed is not a desirable approach to predicting their performance. What is needed is the ability to evaluate the system's design while there are still opportunities to make corrections to fundamental design flaws. In this paper we describe the framework of ViPEr-HiSS, a tool which facilitates design, development, and subsequent performance evaluation of designs of multimedia storage hierarchies by providing mechanisms for relatively easy experimentation with (a) system configurations as well as (b) application- and media-aware resource management techniques. (Also cross-referenced as UMIACS-TR-99-69) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science , University of Maryland,
PETS: A Personal Teller of Stories. Jaime Montemayor. Allison Druin. Jim Hendler. November 1999.
Let us start by reading a story written by a seven year old child, entitled Michelle. "There once was a robot named Michelle. She was new in the neighborhood. She was HAPPY when she first came, thinking she would make friends. But it was the opposite. Other robots threw rocks and sticks. She was SAD. Now no one liked her. One day she was walking down a street, a huge busy one, when another robot named Rob came up and ask [sic] if she wanted to have a friend. She was SCARED at first but then realized that she was HAPPY. The other robots were ANGRY but knew that they had learned their lesson. Michelle and Rob lived HAPPILY ever after. No one noticed the dents from rocks that stayed on Michelle." (Druin, Research notes, August 1998) This is just one of many stories that children have written with the help of PETS (Druin et al. 1999a). The author of Michelle did not just write this moving story; she is also an integral member of the team that built our robots. As you read on, PETS will be further described. Our motivations behind building such an interactive robotic pet will also be discussed. In addition, the process of how we made this robotic technology with our team of adults and six children will be introduced. And with this, we will present cooperative inquiry (Druin 1999a), the methodology that we embrace as we discover insights about technology, education, science, engineering, and art. Finally, this chapter will close with reflections on what was learned from on-going research effort. (Also cross-referenced as UMIACS-TR-99-67) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Efficient Preconditioning of the Linearized Navier-Stokes Equations}. David Silvester. Howard Elman. David Kay. Andrew Wathen. October 1999.
We outline a new class of robust and efficient methods for solving subproblems that arise in the linearization and operator splitting of Navier-Stokes equations. We describe a very general strategy for preconditioning that has two basic building blocks; a multigrid V-cycle for the scalar convection-diffusion operator, and a multigrid V-cycle for a pressure Poisson operator. We present numerical experiments illustrating that a simple implementation of our approach leads to an effective and robust solver strategy in that the convergence rate is independent of the grid and the time-step, and only deteriorates very slowly as the Reynolds number is increased. (Also cross-referenced as UMIACS-TR-99-66) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Active Logics: A Unified Formal Approach to Episodic Reasoning. Jennifer Elgot-Drapkin. Sarit Kraus. Michael Miller. Madhura Nirkhe. Donald Perlis. October 1999.
Artificial intelligence research falls roughly into two categories: formal and implementational. This division is not completely firm: there are implementational studies based on (formal or informal) theories (e.g., CYC, SOAR, OSCAR), and there are theories framed with an eye toward implementability (e.g., predicate circumscription). Nevertheless, formal/theoretical work tends to focus on very narrow problems (and even on very special cases of very narrow problems) while trying to get them ``right'' in a very strict sense, while implementational work tends to aim at fairly broad ranges of behavior but often at the expense of any kind of overall conceptually unifying framework that informs understanding. It is sometimes urged that this gap is intrinsic to the topic: intelligence is not a unitary thing for which there will be a unifying theory, but rather a ``society'' of subintelligences whose overall behavior cannot be reduced to useful characterizing and predictive principles. Here we describe a formal architecture that is more closely tied to implementational constraints than is usual for formalisms, and which has been used to solve a number of commonsense problems in a unified manner. In particular, we address the issue of formal, integrated, and longitudinal reasoning: inferentially-modeled behavior that incorporates a fairly wide variety of types of commonsense reasoning within the context of a single extended episode of activity requiring keeping track of ongoing progress, and altering plans and beliefs accordingly. Instead of aiming at optimal solutions to isolated, well-specified and temporally narrow problems, we focus on satisficing solutions to under-specified and temporally-extended problems, much closer to real-world needs. We believe that such a focus is required for AI to arrive at truly intelligent mechanisms with the ability to behave effectively over considerably longer time periods and range of circumstances than is common in AI today. While this will surely lead to less elegant formalisms, it also surely is requisite if AI is to get fully out of the blocks-world and into the real world. (Also cross-referenced as UMIACS-TR-99-65) University of Maryland Institute for Advaced Computer Studies, Department of Computer Science, University of Maryland,
On Orthogonalization in the Inverse Power Method. G. W. Stewart. September 1999.
When the inverse power method is used to compute eigenvectors of a symmetric matrix corresponding to close eigenvalues, the computed eigenvectors may not be orthogonal. The cure for the problem is to orthogonalize the vectors using the Gram--Schmidt algorithm. In this note it is shown that the orthogonalization process does not cause the quality of the eigenvectors to deteriorate. Also cross-referenced as UMIACS-TR-99-64 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Evolving a Set of Techniques for OO Inspections. Forrest Shull. Guilherme H. Travassos. Jeffrey Carver. Victor R. Basili. October 1999.
Inspecting OO designs is an important way of ensuring the quality of software under development. When high-level design activities are finished, the design documents can be inspected to verify whether they are consistent among themselves and whether the software requirements were correctly and completely captured. This paper discusses some issues regarding the definition and application of reading techniques (i.e. procedural guidelines that can be given to inspectors) to inspect high-level OO design documents. An initial set of OO Reading Techniques and their experimental evaluation is described. A method for evaluating the reading techniques in more detail, i.e. Observational Techniques, is then presented, and experiences with its use are discussed. Through these discussions, we show how the reading techniques have evolved in response to empirical evidence (both qualitative and quantitative) regarding their use in practice. The complete and current set of techniques can be found in the appendices. (Also cross-referenced as UMIACS-TR-99-63) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Generating Efficient Stack Code for Java. Tatiana Shpeisman. Mustafa Tikir. October 1999.
Optimizing Java byte code is complicated by the fact that it uses a stack-based execution model. Changing the intermediate representation from the stack-based to the register-based one brings the problem of Java byte code optimizations into well-studied domain of compiler optimizations for register-based codes. In this paper we describe the technique to convert a register-based code into the Java byte code. The code generation techniques developed for the stack-based computers are not directly applicable to this problem as the comparative cost of the local memory and stack manipulation instructions in JVM is quite different from that in the stack-based computers. Naive verbose translation of the register-based code into the Java byte code produces the code with many redundant store and load instructions. The tool that we have developed allows to remove 90-100 \% of the stores to the local (i.e., non-global) variables. It produces the Java byte code that is slightly faster and shorter than the original byte code even when no optimizations except for register allocation are performed on the register-based code. Department of Computer Science, University of Maryland,
Secure Agents. Piero Bonatti. Sarit Kraus. V.S.Subrahmanian. October 1999.
With the rapid proliferation of software agents, there comes an increased need for agents to ensure that they do not provide data and/or services to unauthorized users. We first develop an abstract definition of what it means for an agent to preserve data/action security. Most often, this requires an agent to have knowledge that is impossible to acquire --- hence, we then develop approximate security checks that take into account, the fact that an agent usually has incomplete/approximate beliefs about other agents. We develop two types of security checks --- static ones that can be checked prior to deploying the agent, and dynamic ones that are executed at run time. We prove that a number of these problems are undecidable, but under certain conditions, they are decidable and (our definition of) security can be guaranteed. Finally, we propose a language within which the developer of an agent can specify her security needs, and present provably correct algorithms for static/dynamic security verification. (Also cross-refernced as UMIACS-TR-99-62) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Automatic Deployment of Application-Specific Metadata and Code in MOCHA. Manuel Rodriguez. Nick Roussopoulos. December 1999.
Database middleware systems require the deployment of application-specific data types and query operators to the servers and clients in the system. Existing middleware solutions rely on developers and system administrators to port and manually install all this application-specific functionality to all sites in the system. This approach cannot scale to an environment in which there are hundreds of data sources, such as those accessed by the Web and even more custom-tailored applications, since the complexity and the cost involved in maintaining a code base system-wide are enormous. This paper describes a novel metadata-driven framework designed to automate the deployment of all application-specific functionality used by a middleware system. We used Java and XML to implement this framework in MOCHA, a middleware system developed at the University of Maryland. We first present the kind of services, metadata elements and software tools used in MOCHA to automate code deployment. Then, we describe how the features of MOCHA simplify the administration and reduce the management cost of a middleware system in a large scale environment. (Also cross-refernced as UMIACS-TR-99-61) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Performance Benefits of Simultaneous over Sequential Menus as Task. H. Hochheiser. N. Kositsyna. G. Ville. B. Shneiderman. September 1999.
To date, experimental comparisons of menu layouts have concentrated on variants of hierarchical structures of sequentially presented menus. Simultaneous menus - layouts which present multiple active menus on a screen at the same time - are an alternative arrangement that may be useful in many web design situations. This paper describes an experiment involving a between-subject comparison of simultaneous menu and their traditional sequential counterparts. Twenty experienced web users used either simultaneous or sequential menus in a standard web browser to answer questions based on US Census data. For novice users performing simple tasks the simplicity of sequential menus appears to be helpful, but for most tasks and most users there is good evidence to believe that simultaneous menus speed performance and improve satisfaction. Design improvements can amplify the benefits of simultaneous menu layouts. (Also cross-referenced asUMIACS-TR-99-60) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
Negative Cycle Detection in Dynamic Graphs. Nitin Chandrachoodan. Shuvra S.Bhattacharyya. K.J.Ray Liu. September 1999.
We examine the problem of detecting negative cycles in a dynamic graph, which is a fundamental problem that arises in electronic design automation and systems theory. Previous approaches used for this have tried to modify Dijkstra's algorithm since it is the fastest known Single-Source Shortest Path algorithm. We introduce the concept of {\em batch mode} negative cycle detection, in which a graph changes over time, and negative cycle detection needs to be done periodically. Such scenarios arise, for example, during iterative design space exploration for hardware and software synthesis. We present an algorithm for this problem, based on the Bellman-Ford algorithm, which outperforms previous approaches. We also show that this technique leads to very fast algorithms for the computation of the maximum-cycle mean (MCM) of a graph, especially for a certain form of {\em sparse graph}. Such sparseness often occurs in practice, as demonstrated for example by the ISCAS 89/93 benchmarks. We present experimental results that demonstrate the advantages of our batch-processing techniques, and illustrate their application to design-space exploration by developing an automated local-search technique for multiple-voltage scheduling of iterative data-flow graphs. (Also cross-referenced as UMIACS-TR-99-59) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Software synthesis and code generation for signal processing systems. S. S. Bhattacharyya. R. Leupers. P. Marwedel. September 1999.
No abstract submitted (Also cross-referenced as UMIACS-TR-99-57 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
The CBP parameter --- a useful annotation to aid SDF compilers. S. S. Bhattacharyya. P. K. Murthy. September 1999.
The role of software is becoming increasingly important in the implementation of DSP applications. As this trend intensifies, and the complexity of applications escalates, we are seeing an increased need for automated tools to aid in the development of DSP software. This paper reviews the state of the art in programming language and compiler technology for DSP software implementation. In particular, we review techniques for high level, block-diagram-based modeling of DSP applications; the translation of block diagram specifications into efficient C programs using global, target-independent optimization techniques; and the compilation of C programs into streamlined machine code for programmable DSP processors, using architecture-specific and retargetable back-end optimizations. In our review, we also point out some important directions for further investigation. (also cross-referenced as UMIACS-TR-99-56) University of Maryland Institute for Advanced Computer Syudies, Department of Computer Science, University of Maryland,
XMT-M: A Scalable Decentralized Processor. Efraim Berkovich. Joseph Nuzman. Manoj Franklin. Bruce Jacob. Uzi Vishkin. September 1999.
A defining challenge for research in computer science and engineering has been the ongoing quest for reducing the completion time of a single computation task. Even outside the parallel processing communities, there is little doubt that the key to further progress in this quest is to do parallel processing of some kind. A recently proposed parallel processing framework that spans the entire spectrum from (parallel) algorithms to architecture to implementation is the explicit multi-threading (XMT) framework. This framework provides: (i) simple and natural parallel algorithms for essentially every general-purpose application, including notoriously difficult irregular integer applications, and (ii) a multi-threaded programming model for these algorithms which allows an ``independence-of-order'' semantics: every thread can proceed at its own speed, independent of other concurrent threads. To the extent possible, the XMT framework uses established ideas in parallel processing. This paper presents XMT-M, a microarchitecture implementation of the XMT model that is possible with current technology. XMT-M offers an engineering design point that addresses four concerns: buildability, programmability, performance, and scalability. The XMT-M hardware is geared to execute multiple threads in parallel on a single chip: relying on very few new gadgets, it can execute parallel threads without busy-waits! Existing code can be run on XMT-M as a single thread without any modifications, thereby providing backward compatibility for commercial acceptance. Simulation-based studies of XMT-M demonstrate considerable improvements in performance relative to the best serial processor even for small, and therefore practical, input sizes. (Also cross-referenced as UMIACS-TR-99-55) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Cost Models for Query Processing Strategies in the Active Data. Chialin Chang. September 1999.
Exploring and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. We have been developing the Active Data Repository (ADR), an infrastructure that integrates storage, retrieval, and processing of large multi-dimensional scientific datasets on distributed memory parallel machines with multiple disks attached to each node. In earlier work, we proposed three strategies for processing range queries within the ADR framework. Our experimental results show that the relative performance of the strategies changes under varying application characteristics and machine configurations. In this work we describe analytical models to predict the average computation, I/O and communication operation counts of the strategies when input data elements are uniformly distributed in the attribute space of the output dataset, restricting the output dataset to be a regular d-dimensional array. We validate these models for various synthetic datasets and for several driving applications. Also cross-referenced as UMIACS-TR-99-54 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Hashing Technique: A New Index Method for High Dimensional Data. Zhexuan Song. Nick Roussopoulos. September 1999.
When dimension goes high, sequential scan processing becomes more efficient than most index-based query. In this paper, we propose a new index method for high-dimensional data spaces. This method is based on hashing technique. The basic idea is: First find a hashing function which puts the given d-dimensional space data into a d'-dimensional buckets where d' << d. Then, we use existing index techniques to manage those buckets. We later define some properties of a good hashing function and give four hashing functions. To demonstrate the efficiency of our idea, we experimentally compared our algorithms with sequential scan and Pyramid-Techniques. The results demonstrate that this method outperforms others for skewed data set. It always beats the sequential scan by using only half of elapsed time for range query. However if the data has uniform distribution, Pyramid-Technique is still the best method. Department of Computer Science, University of Maryland,
The Role of Children in the Design Technology. Allison Druin. September 1999.
Children play games, chat with friends, tell stories, study history or math, and today this can all be done supported by new technologies. From the Internet to multimedia authoring tools, technology is changing the way children live and learn. As these new technologies become ever more critical to our children's lives, we need to be sure these technologies support children in ways that make sense for them as young learners, explorers, and avid technology users. This may seem of obvious importance, because for almost 20 years the HCI community has pursued new ways to understand users of technology. However, with children as users, it has been difficult to bring them into the design process. Children go to school for most of their days; there are existing power structures, biases, and assumptions between adults and children to get beyond; and children, especially young ones have difficulty in verbalizing their thoughts. For all of these reasons, a child's role in the design of new technology has historically been minimized. Based upon a survey of the literature and my own research experiences with children, this paper defines a framework for understanding the various roles children can have in the design process, and how these roles can impact technologies that are created. (Also cross-referenced as UMIACS-TR-99-53) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Temporal Agent Programs. J. Dix. S. Kraus. V.S. Subrahmanian. September 1999.
The ``agent program'' framework introduced by Eiter, Subrahmanian and Pick (\textbf{Artificial Intelligence, 108(1-2), 1999}), supports developing agents on top of arbitrary legacy code. Such agents are continuously engaged in an \emph{``event occurs, think, act, event occurs''} cycle. However, this framework has two major limitations: (1) all actions are assumed to have no duration, and (2) all actions are taken now, but cannot be \emph{scheduled for the future}. In this paper, we present the concept of a ``temporal agent program'' (\tap for short) and show that using {\tap}s, it is possible to build agents on top of legacy code that can reason about the past and about the future, and that can make temporal commitments for the future now. We develop a formal semantics for such agents, extending the concept of a status set proposed by Eiter et al., and develop algorithms to compute the status sets associated with temporal agent programs. Last, but not least, we show how {\tap}s support classical negotiation methods (as well as some new ones) and classical auction methods (as well as some new ones). (Also cross-referenced as UMIACS-TR-99-51) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Probabilistic Agent Programs. Juergen Dix. Mirco Nanni. VS Subrahmanian. September 1999.
Agents are small programs that autonomously take actions based on changes in their environment or ``state.'' Over the last few years, there have been an increasing number of efforts to build agents that can interact and/or collaborate with other agents. In one of these efforts, Eiter, Subrahmanian amd Pick (AIJ, 108(1-2), pages 179-255) have shown how agents may be built on top of legacy code. However, their framework assumes that agent states are completely determined, and there is no uncertainty in an agent's state. Thus, their framework allows an agent developer to specify how his agents will react when the agent is 100\% sure about what is true/false in the world state. In this paper, we propose the concept of a \emph{probabilistic agent program} and show how, given an arbitrary program written in any imperative language, we may build a declarative ``probabilistic'' agent program on top of it which supports decision making in the presence of uncertainty. We provide two alternative semantics for probabilistic agent programs. We show that the second semantics, though more epistemically appealing, is more complex to compute. We provide sound and complete algorithms to compute the semantics of \emph{positive} agent programs. (Also cross-referenced as UMIACS-TR-99-50) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Meta Agent Programs. Juergen Dix. V.S. Subrahmanian. George Pick. September 1999.
There are numerous applications where an agent \aga needs to reason about the beliefs of another agent, as well as about the actions that other agents may take. In Eiter/Subrahmanian/Pick the concept of an agent program is introduced, and a language within which the operating principles of an agent can be declaratively encoded on top of imperative data structures is defined. In this paper we first introduce certain belief data structures that an agent needs to maintain. Then we introduce the concept of a \emph{Meta Agent Program} (\map), that extends the framework of Eiter/Subrahmanian/Pick, so as to allow agents to perform metareasoning. We build a formal semantics for \map{s}, and show how this semantics supports not just beliefs agent a may have about agent b's state, but also beliefs about agents b's beliefs about agent c's actions, beliefs about b's beliefs about agent c's state, and so on. Finally, we provide a translation that takes any \map as input and converts it into an agent program such that there is a one-one correspondence between the semantics of the \map and the semantics of the resulting agent program. This correspondence allows an implementation of \map{s} to be built on top of an implementation of agent programs. Also cross-referenced as UMIACS-TR-99-49 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Nonmonotonic Reasoning: Towards efficient calculi and implementations. Juergen Dix. Ulrich Furbach. Ilkka Niemelae. September 1999.
In this paper we do not want to give a detailed overview of the various formalizations of nonmonotonic reasoning that have evolved (those can be found in various textbooks), but we want to give an overview of the main computational techniques and methods leading to implementions of nonmonotonic reasoning. We first introduce the main nonmonotonic logics: \emph{Default Logic}, \emph{Circumscription} and \emph{Autoepistemic Logic}. We also consider the abstract approach of Kraus, Lehmann and Magidor to associate with any reasoning system an \emph{abstract consequence relation}. Then we investigate universal methods for computing in general nonmonotonic logics. We do this with a special eye on the underlying complexity and show how this lead to automated theorem proving in such logics. Finding efficient computation mechanisms for the logics introduced in the former section is the aim of the next Section. There we consider techniques that originated from automated reasoning in first-order predicate calculus. We depict how these techniques can be applied for disjunctive logic programming with programs with variables but only limited use of negation. In particular, we handle \ie{GCWA} as a basis for nonmonotonic negation therein. We then give a declarative overview on nonmonotonicity in logic programming. We introduce (nonmonotonic) semantics of logic programs with negation and disjunction, notably the well-founded and the stable semantics and their extensions to programs containing disjunction--- they constitute the most important semantics and are in close relation to the logics introduced in the next Section. While in we considered in a former section techniques that can be successfully applied for programs with variables and only limited use of negation, we also treat propositional programs with full negation and disjunction. In particular, we provide implementations of \mbox{D-WFS}\Index{D-WFS} and \ie{D-ST ABLE} in polynomial space. We end with a section where we consider the problem of finding good benchmarks to test and compare nonmonotonic systems against. Also cross-referenced as UMIACS-TR-99-48 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Explaining Updates by minimal sums. Juergen Dix. Karl Schlechta. September 1999.
Human reasoning about developments of the world involves always an assumption of \emph{inertia}. We discuss two approaches for formalizing such an assumption, based on the concept of an \emph{explanation}: \emph{(1)} there is a general preference relation given on the set of all explanations, \emph{(2)} there is a notion of a \emph{distance} between models and explanations are \emph{preferred} if their sum of distances is minimal. We show exactly under which conditions the converse is true as well and therefore both approaches are equivalent modulo these conditions. Our main result is a general representation theorem in the spirit of Kraus, Lehmann and Magidor. Also cross-referenced as UMIACS-TR-99-47 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A General Theory of Confluent rewriting Systems for Logic Programming. Juergen Dix. Mauricio Osorio. September 1999.
Recently, Brass and Dix showed (\emph{Journal of Automated Reasoning} \textbf{20(1)}, 1998) that the wellfounded semantics WFS can be defined as a confluent calculus of transformation rules. This lead not only to a simple extension to disjunctive programs (\emph{Journal of Logic Programming} \textbf{38(3)}, 1999), but also to a new computation of the wellfounded semantics which is \emph{linear} for a broad class of programs. We take this approach as a starting point and generalize it considerably by developing a general theory of \emph{Confluent LP-Systems} $\cfs$. Such a system $\cfs$ is a rewriting system on the set of all logic programs over a fixed signature $\Lang$ and it induces in a natural way a canonical semantics. Moreover, we show four important applications of this theory: \emph{(1) most of the well-known semantics are induced by confluent LP-systems}, \emph{(2) there are many more transformation rules that lead to confluent LP-systems}, \emph{(3) semantics induced by such systems can be used to model aggregation}, \emph{(4) the new systems can be used to construct interesting counterexamples to some conjectures about the space of well-behaved semantics}. Also cross-referenced as UMIACS-TR-99-46 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Striping Doesn't Scale: How to Achieve Scalability. ChengFu Chou. Leana Golubchik. John C.S. Lui. September 1999.
Multimedia applications place high demands for QoS, performance, and reliability on storage servers and communication networks. These, often stringent, requirements make design of cost-effective and scalable continuous media (CM) servers difficult. In particular, the choice of data placement techniques can have a significant effect on the scalability of the CM server and its ability to utilize resources efficiently. In the recent past, a great deal of work has focused on ``wide'' data striping as a technique which ``implicitly'' solves load balancing problems; although, it does suffer from multiple shortcomings. Another approach to dealing with load imbalance problems is replication. The main focus of this paper is a study of scalability characteristics of CM servers as a function of tradeoffs between striping and replication. More specifically, striping is a good approach to load balancing while replication is a good approach to ``isolating'' nodes from being dependent on other system resources. The appropriate compromise between the degree of striping and the degree of replication is key to the design of a scalable CM server. This is the topic of our work. Also cross-referenced as UMIACS-TR-99-45 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
On Fault Location in Networks by Passive Testing. Raymond E. Miller. Khaled A. Arisha. August 1999.
In this paper, we employ a variant of the communicating finite state machine (CFSM) model for networks to investigate fault detection and location using passive testing. First, we introduce the concept of passive testing, then we introduce the model with necessary assumptions and justification. Then, the model for the observer process is described and a 3-node case is studied to show how fault location information can be deduced. Extending this result, we propose a multiple node-cut approach for a general network, applying our technique for fault detection and location. An abstraction of a node-cut shows how the 3-node case can be used in the general case. We then illustrate our technique through a simulation of a practical X.25 example. Finally future extensions and potential trends are discussed. Department of Computer Science, University of Maryland,
Universal Usability: Pushing Human-Computer Interaction Research to. Ben Shneiderman. July 1999.
"I feel... an ardent desire to see knowledge so disseminated through the mass of mankind that it may...reach even the extremes of society: beggars and kings." -- Thomas Jefferson, Reply to American Philosophical Society, 1808 In a fair society, all individuals would have equal opportunity to participate in, or benefit from, the use of computer resources regardless of race, sex, religion, age, disability, national origin or other such similar factors. -- ACM Code of Ethics Position Paper for National Science Foundation & European Commission meeting on human-computer interaction research agenda, June 1-4, 1999, Toulouse, France. To be published in book form. Also cross-referenced as UMIACS-TR-99-17 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland, Human-Computer Interaction Laboratory, University of Maryland,
Supporting Creativity with Advanced Information-Abundant User. Ben Shneiderman. June 1999.
A challenge for human-computer interaction researchers and user interface designers is to construct information technologies that support creativity. This ambitious goal can be attained if designers build on an adequate understanding of creative processes. This paper describes a model of creativity, the four-phase genex framework for generating excellence: - Collect: learn from previous works stored in digital libraries, the web, etc. - Relate: consult with peers and mentors at early, middle and late stages - Create: explore, compose, discover, and evaluate possible solutions - Donate: disseminate the results and contribute to the digital libraries, the web, etc. Within this integrated framework, there are eight activities that require human-computer interaction research and advanced user interface design. This paper concentrates on techniques of information visualization that support creative work by enabling users to find relevant information resources, identify desired items in a set, or discover patterns in a collection. It describes information visualization methods and proposes five questions for the future: generality, integration, perceptual foundations, cognitive principles, and collaboration. Also cross-referenced as UMIACS-TR-9942 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Improving Locality For Adaptive Irregular Scientific Codes. Hwansoo Han. Chau-Wen Tseng. September 1999.
An important class of scientific codes access memory in an irregular manner. Because irregular access patterns reduce temporal and spatial locality, they tend to underutilize caches, resulting in poor performance. Researchers have shown that consecutively packing data relative to traversal order can significantly reduce cache miss rates by increasing spatial locality. In this paper, we investigate techniques for using partitioning algorithms to improve locality in adaptive irregular codes. We develop parameters to guide both geometric (RCB) and graph partitioning (METIS) algorithms, and develop a new graph partitioning algorithm based on hierarchical clustering (GPART) which achieves good locality with low overhead. We also examine the effectiveness of locality optimizations for adaptive codes, where connection patterns dynamically change at intervals during program execution. We use a simple cost model to guide locality optimizations when access patterns change. Experiments on irregular scientific codes for a variety of meshes show our partitioning algorithms are effective for static and adaptive codes on both sequential and parallel machines. Improved locality also enhances the effectiveness of LocalWrite, a parallelization technique for irregular reductions based on the owner computes rule. Also cross-referenced as UMIACS-TR-99-41 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Empirical Studies in Parallel Sorting. Evan Golub. May 1998.
I examine different parallel algorithms for sorting in rounds. Most of these algorithms use a graph to indicate the comparisons to be made. The primary difference between the algorithms is how these graphs are chosen. One uses graphs that are shown to exist using non-constructive techniques, several yield constructions of the required graphs, and one uses a randomized algorithm. The constructive algorithms would traditionally be preferred even though the processor requirements are higher. It is shown that the non- constructive algorithms can actually be used by generating the needed graphs using random number generators skewed appropriately. Department of Computer Science, University of Maryland,
Mathematical Modeling of Lateralization and Asymmetries in Cortical Maps. Svetlana Levitan. July 1999.
Recent experimental work in neurobiology has defined asymmetries and lateralization in the topographic maps found in mirror-image regions of the sensorimotor cerebral cortex. However, the mechanisms underlying these asymmetries are currently not established, and in some cases are quite controversial. In order to explore some possible causes of map asymmetry and lateralization, several neural network models of cortical map lateralization and asymmetries based on self-organizing maps are created and studied both computationally and theoretically. Activation levels of the elements in the models are governed by large systems of highly nonlinear ordinary differential equations (ODEs), where coefficients change with time and their changes depend on the activation levels. Special metrics for objective evaluation of simulation results (represented as paired receptive field maps) are introduced and analysed. The behavior of the models is studied when their parameters are varied systematically and also when simulated lesions are introduced into one of the hemispheric regions. Some very sharp transitions and other interesting phenomena have been found computationally. Many of these computationally observed phenomena are explained by theoretical analysis of total hemispheric activation in a simplified model. The connection between a bifurcation point of the system of ODEs and the sharp transition in the model's computational behavior is established. More general understanding of topographic map formation and changes under various conditions is achieved by analysis of activation patterns (i.e., $\omega$-limit sets of the above system of ODEs). This is the first mathematical model to demonstrate spontaneous map lateralization and asymmetries, and it suggests that such models may be generally useful in better understanding the mechanisms of cerebral lateralization. The mathematical analysis of the models leads to a better understanding of the mechanisms of self-organization in the topographic maps based on competitive distribution of activation and competitive learning. Also cross-referenced as UMIACS-TR-99-40 University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
A Multigrid Method Enhanced by Krylov Subspace Iteration for Discrete. Howard C. Elman. Oliver G. Ernst. Dianne P. O'Leary. June 1999.
Standard multigrid algorithms have proven ineffective