University of Maryland

Capital Area Theory Seminar: Fall 2004

The Capital Area Theory Symposia is an UMIACS sponsored series of symposia in theoretical computer science bringing computer scientists from around the world to the Capital area. The Symposia are given at the University of Maryland in cooperation with the Computer Science Department.

Schedule

CATS will usually meet on Friday 2:00pm-3:00pm AVW 3258 but some talks may be at a different place and time.

Date Time Speaker Title
Dec 10 2:00pm Raja Jothi
National Center for Biotechnology Information (NCBI/NLM/NIH)
Effects of Evolutionary Tree Topology on Predicting Protein Interaction Specificity
Nov 19 1:00pm Srinivasan Parthasarathy
University of Maryland
Multicommodity Max-Flow Min-Cut Theorems and Their Use in Designing Approximation Algorithms
Nov 12 2:00pm Konstantin Andreev
Carnegie Mellon University
Balanced Graph Partitioning
Nov 5 2:00pm Iraj Saniee
Bell Laboratories
Design and Performance of Randomized Network Schedules for Time-domain Wavelength Interleaved Networks
Oct 29 2:00pm Christian Scheideler
Johns Hopkins University
Actors: A new/old framework for distributed computing
Oct 22 2:00pm Nan Wang
University of Maryland
Analyzing & Modeling Complex Networks
Oct 15 2:00pm Srinivas Kashyap
University of Maryland
Learning Markov Networks: Maximum Bounded Tree-Width Graphs
Oct 08 2:00pm Uzi Vishkin
University of Maryland
Can Parallel Computing Finally Impact Main Stream Computing?
Oct 01 1:00pm Azarakhsh Malekian
University of Maryland
Algorithmic Aspects of Treewidth
Sep 24 2:00pm Elena Zotenko
University of Maryland
Approximate Protein Structure Alignment in Polynomial Time
Sep 17 2:00pm Guilherme Fonseca
University of Maryland
Kinetic Priority Queues
Sep 10 2:00pm Julián Mestre
University of Maryland
Linear time approximation algorithms for the maximum weight matching problem
Sep 03 2:00pm   Planning meeting

For additional information send email to Samir Khuller.

If you want to receive announcements of upcoming talks join the theory-local mailing list.


Talks from previous semesters


Other cats

A cat sleeping

Abstracts

Linear time approximation algorithms for the maximum weight matching problem

Julián Mestre

ABSTRACT

The problem of finding a maximum weight matching in a graph can be solved exactly in polynomial time, the fastest known algorithms for solving the problem are involved and run in super-linear time.

I will present two linear time algorithms for appoximating the maximum weight matching problem. A simple 1/2-factor that runs in O(E) and a more involved algorithm with an approximation guarantee of 2/3 - c and running time O(E log (1/c)).


Kinetic Priority Queues

Guilherme Fonseca

ABSTRACT

A kinetic priority queue is a data structure which computes the maximum of a collection of continuously changing numbers subject to insertions and deletions. We will present two kinetic priority queues, the kinetic heap and the kinetic hanger. The kinetic heap is a natural kinetization of the binary heap, but its time complexity remains mostly unknown. The kinetic hanger is similar to the kinetic heap, but it uses randomization to balance the tree. We will show that the probabilistic nature of the hanger allows us to determine tight bounds to its time complexity.


Approximate Protein Structure Alignment in Polynomial Time

Elena Zotenko

ABSTRACT

Alignment of protein structures is a fundamental task in computational molecular biology. Good structural alignments can help detect distant evolutionary relationships that are hard or impossible to discern from protein sequences alone.

Here, we study the structural alignment problem as a family of optimization problems and develop an approximate polynomial-time algorithm to solve them. For a commonly used scoring function, the algorithm runs in O(n^{10}/\epsilon^6 time, for globular protein of length n, and it detects alignments that score within an additive error of from all optima. Thus, we prove that this task is computationally feasible, although the method that we introduce is too slow to be a useful everyday tool. We show that an alternative approach, which relies on internal distance matrices, must incorporate sophisticated geometric ingredients if it is to guarantee optimality and run in polynomial time.

Our investigations yield insights on the computational complexity of protein alignment under various scoring functions. These insights can be used in the design of scoring functions for which the optimum can be approximated efficiently and perhaps in the development of efficient algorithms for the multiple structural alignment problem.

Reference: Rachel Kolodny and Nathan Linial "Approximate Protein Structure Alignment in Polynomial Time"


Algorithmic Aspects of Treewidth

Azarakhsh Malekian

ABSTRACT

Tree decomposition problem is the problem of splitting a graph into subgraphs using cutsets corresponding to the nodes of a given tree. If one can bound the size of these cutsets one will be able to solve some optimization problems on graphs efficiently by using the divide-and-conquer method.

I will present the formal definition of treewidth and tree decomposition, and discuss an example that shows how to solve a graph problem by using the bounded width rooted tree decomposition of that graph.


Can Parallel Computing Finally Impact Main Stream Computing?

Uzi Vishkin

ABSTRACT

Come learn about the PRAM-On-Chip research project. The project offers significant opportunities for original work (good for finishing your PhD...) and impact (helps finding a job in academia and industry after you graduate), as well as being part of an energetic team (fun while you are doing it).

The idea of upgrading performance and utility of computer systems by incorporating parallel processing has been around since at least the 1940s. A significant investment in parallel processing in the 1980s and 1990s has led to an abundance of parallel architectures that due to technical constraints at the time had to rely on multi-chip multi-processing. Unfortunately, their impact on main stream computing was quite limited. "Tribal lore" suggests the following reason: while programming for parallelism tends to be easy, turning this parallelism into the "coarse-grained" type, needed to optimize performance for multi-chip multi-processing (with their high coordination overhead), has been quite hard.

The main stream computing paradigm has always relied on serial code. In spite of a huge multi-decade effort in the relevant computer science academia and industry to extract instruction level parallelism from serial code, commercially successful processors are entering a second decade of near stagnation in the maximum number of instructions they can issue towards a single computational task in one clock.

The PRAM-On-Chip project at UMD is led by a fact, an old insight and a new one.

(i) Billion transistor chips are here, up from less than 30,000 circa 1980.

(ii) Using a very simple parallel computation model, the parallel random access model (PRAM), the parallel algorithms research community has succeede in developing a theory of parallel algorithms and that theory appears to be second in magnitude only to serial algorithmics. (Unfortunately, this elegant algorithmic theory remained in the ivory towers of theorists, since it did not offer an effective abstraction for multi-chip multi-processors.)

(iii) The Billion transistor chip era allows for the first time low-overhead on-chip multi-processors so that the PRAM abstraction becomes effective.

The PRAM-On-Chip team includes now a post-doc, 4 graduate, and 3 undergraduate students. Current and future efforts are focused on several tracks.

1. Hardware and architecture research and prototyping.

2. Software systems research and prototyping.

3. Algorithms and performance programming research.

4. Application programming research.

A separate DARPA-funded programmer's productivity (development time) study is conducted by Dr. Basili's group.

Perhaps surprisingly, some of the most remarkable research opportunities are for students whose focus is applications: will it be possible to use/tune/develop an API (application programming interface) for an important application with improved performance and productivity? In particular, the following vision for an API is quite appealing: (i) its programming is so much easier than C programming that it could be done effectively by a non-computer-scientist application expert, and (ii) its performance on the PRAM-On-Chip platform is better than the best serial implementation.


Learning Markov Networks: Maximum Bounded Tree-Width Graphs

Srinivas Kashyap

ABSTRACT

Markov networks are a common class of graphical models used in machine learning. Such models use an undirected graph to capture dependency information among random variables in a joint probability distribution. Once one has chosen to use a Markov network model, one aims to choose the model that "best explains" the data that has been observed--this model can then be used to make predictions about future data.

We show that the problem of learning a maximum likelihood Markov network given certain observed data can be reduced to the problem of identifying a maximum weight low-treewidth graph under a given input weight function. We give the first constant factor approximation algorithm for this problem. More precisely, for any fixed treewidth objective k, we find a treewidth-k graph with an f(k) fraction of the maximum possible weight of any treewidth-k graph.

Reference: "Learning Markov Networks: Maximum Bounded Tree-Width Graphs" by David Karger and Nathan Srebro (SODA 01)


Analyzing & Modeling Complex Networks

Nan Wang

ABSTRACT

Systems taking the form of networks (also called "graphs") abound in the world. Examples include the Internet, the WWW, social networks, neural networks, networks of business relations between companies, etc. This talk is a "biased" survey talk which covers only part of the extensive works done on "complex networks" by researchers from various disciplines. The selection and presentation of topics in this talk are from a computer scientist's point of view, and are outlined as follows:

1. Structures of networks in the real world, e.g., social networks, information networks, biological networks, etc.

2. Dynamic processes taking place on networks, e.g., search on networks, disease spreading within communities, propagation of random or selected failures in networks, etc.

3. Two aspects of random models for complex networks, i.e., matching pre-assigned configurations (e.g., various random graph models with prescribed parameters), versus local rules yielding global structures (e.g., preferential attachment model, HOT, etc.).

4. The role of simulation in complex networks.

5. Algorithmic aspect of complex networks.

There are far too many references to list here. The organization of this talk does not follow any single existing material. Part of the relevant topics can be found in the survey paper:

"The structure and function of complex networks" SIAM Review 45, 167-256 (2003). Mark E. J. Newman.


Actors: A new/old framework for distributed computing

Christian Scheideler

ABSTRACT

There are two basic distributed computing approaches: viewing the network as a computer, or viewing the computer(s) as a network. Classical distributed computing views the network as a computer, that is, the basic von Neumann paradigm of separating code from data is also applied to distributed computing. Keeping code and data as separate entities has turned out to create many hard research problems in this field. In fact, basic issues such as correctness and efficiency are so difficult to handle that only a small community of experts really understands how to write correct and/or efficient distributed programs in this model. Does it have to be that way? In order to address this question, we will look at the alternative approach: viewing the computer(s) as a network. Here, the basic idea is not to separate between code and data any more. Instead, the code and data is organized into a collection of atomic, autonomous threads with their own, private memory that are interconnected with each other.

Among the first that proposed this approach was Carl Hewitt. He developed a model in the 1970s in which these threads are called actors. It was originally designed for the area of artificial intelligence, because distributed computing was not yet around. Unfortunately, hardware and software limitations at that time prevented this model from becoming wide-spread, but with the recent, tremendous growth of interest in peer-to-peer systems and applications, his model appears to experience a renaissance. I will propose an adaptation of Hewitt's model to the world of distributed computing and argue that this model can serve as a formal basis for the design and analysis of distributed applications that is much more useful in practice than previous models in distributed computing.


Design and Performance of Randomized Network Schedules for Time-domain Wavelength Interleaved Networks

Iraj Saniee

ABSTRACT

We'll present a network architecture that provides transport services for synchronous (fixed-rate) and asynchronous (bursty) traffic. Briefly, each node has a tunable laser and a burst-mode receiver to communicate with other nodes. End nodes are identified by their declared wavelengths. The interior nodes route burst passively based on the wavelength.

Though this architecture eliminates important networking functions (eg routing and dimensioning), it introduces a new challenge: network scheduling. Network scheduling will be needed to arbitrate burst transmissions from sources to destinations to avoid or minimize conflicts and maximize efficiency. Because of significant propagation delays inherent in networks, network scheduling turns out to be distinct from bipartite matching, max-weight matching or other node/switch scheduling schemes.

In this talk, we'll discuss distributed protocols that perform scheduling of bursts in such a network. The proposed protocols use randomization for reassignments of time slots.

We'll develop an analytical formulation of these schemes based on a new occupancy model and investigate their performance under two broad types of traffic: fixed load traffic and bursty traffic.

The analytical results show worst case efficiency of 33% and 67% for main flavors of these protocols for bursty and unpredictable traffic (thus outperforming slotted Aloha for the single destination). When network load is nearly static, simulation is used to show that with learning, these distributed protocols converge to centralized schedules with nearly optimal efficiency.

Joint work with John Morrison and Indra Widjaja.


Balanced Graph Partitioning

Konstantin Andreev

ABSTRACT

In this talk we consider the problem of $(k,\nu)$-balanced graph partitioning - dividing the vertices of a graph into $k$ almost equal size components (each of size less than $\nu \cdot \frac{n}{k}$) so that the capacity of edges between different components is minimized.

This problem is a natural generalization of several other problems such as minimum bisection, which is the $(2,1)$-balanced partitioning problem. Other previous work has considered the $(k,\nu)$-balanced partitioning problem for $\nu \geq 2$.

We present a bicriteria polynomial time approximation algorithm with an $O(\log^{3/2}{n})$-approximation for any constant $\nu>1$. For $\nu =1$ we show that no polytime approximation algorithm can guarantee a finite approximation ratio unless $P=NP$. This is joint work with Harald Raecke.


Multicommodity Max-Flow Min-Cut Theorems and Their Use in Designing Approximation Algorithms

Srinivasan Parthasarathy

ABSTRACT

In this paper, we establish max-flow min-cut theorems for several important classes of multicommodity flow problems. In particular, we show that for any n-node multicommodity flow problem with uniform demands, the max-flow for the problem is within an O(log n) factor of the upper bound implied by the min-cut. The result (which is existentially optimal) establishes an important analogue of the famous 1-commodity max-flow min-cut theorem for problems with multiple commodities. The result also has substantial applications to the field of approximation algorithms. For example, we use the flow result to design the first polynomial-time (polylog n-times-optimal) approximation algorithms for well-known NP-hard optimization problems such as graph partitioning, min-cut linear arrangement, crossing number, VLSI layout, and minimum feedback arc set. Applications of the flow results to path routing problems, network reconfiguration, communication in distributed networks, scientific computing and rapidly mixing Markov chains are also described in the paper.

Reference: Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Tom Leighton and Satish Rao. Journal of the ACM, Vol 46, Issue 6, pages: 787 - 832, Nov 1999.


Effects of Evolutionary Tree Topology on Predicting Protein Interaction Specificity

Raja Jothi

ABSTRACT

To perform their function in the cell, proteins need to interact with each other. Predicting protein-protein interactions plays an important role in understanding protein functionalities within the cell. Proteins, in general, can be divided into groups of evolutionarily related proteins called families. It has been observed, that if members of one protein family interact with members of another protein family then the evolution of the two families is correlated. It is also believed that if two families of proteins co-evolved, it is highly likely that those two families interact. Several computational approaches that compare the evolutionary trees of proteins in order to predict potential interactions have been proposed in recent years. A much harder problem is to predict interaction specificity (matching members of one family to specific members of the other family), a largely unsolved problem. In this talk, I will explain our study on the effectiveness of evolutionary tree topology on predicting protein interaction specificities.

Joint work with Maricel Kann and Teresa Przytycka


Algorithms and Theory Group @ University of Maryland / Webmaster: Julian Mestre