# Capital Area Theory Seminar: Spring 2008

 The Capital Area Theory Symposia is an UMIACS sponsored series of symposia in theoretical computer science bringing computer scientists from around the world to the Capital area. The Symposia are given at the University of Maryland in cooperation with the Computer Science Department.

### Schedule

CATS usually meets on Fridays 1:00pm- 2:00 pm  AVW 4185  but some talks may be at a different place and time.

Date Time Location Speaker Title
02/01/08 1:00 pm AVW 4185 The Erdos Distance Problem
02/08/08 1:00 pm AVW 4185 The Erdos Distance Problem: the \Omega(n^{4/5}) result.
02/15/08 1:00 pm AVW 4185 Finding a maximum  density subgraph
02/22/08 1:00 pm AVW 4185 Sum of Subtree Clustering
02/29/08 1:00 pm AVW 4185 Set Cover and Maximum Coverage with Group Budgets
03/07/08 12:00 pm AVW 3258 Minkyoung Cho Embedding to Tree Metrics
03/14/08 2:00 pm AVW 3258 A proof of the Gilbert-Pollack's conjecture on the Steiner ratio
03/28/08 1:00 pm AVW 4185 Covering Minimum Spanning Trees of Random Subgraphs
04/04/08 1:00 pm AVW 3258 A Learning Theory Approach to Non-Interactive Database Privacy
04/18/08 1:00 pm AVW 4185 Tree Decomposable Models for Efficient Bioinformatics Algorithms
04/25/08 1:00 pm AVW 4185 Bradford Greening Proofs that Really Count: The Art of Combinatorial Proof
05/09/08 1:00 pm AVW 4185 Complete Fairness in Secure Two-Party Computation

For additional information send email to Samir Khuller.

If you want to receive announcements of upcoming talks join the theory-local mailing list.

### Abstracts

#### The Erdos Distance Problem

William Gasarch

ABSTRACT

In 1946 Erdos began the study of the following problem: Given n points in the plane, how many different distances are you guaranteed?
We denote this by g(n). Clearly g(n) \le n-1 (n points in a line yield n-1 different distances.) Erdos showed g(n) = \Omega(\sqrt n) and g(n) = O(n/\sqrt(log n))
Over the years larger and larger lower bounds for g(n) have been proven (there is a website of all such papers: www.cs.umd.edu/~gasarch/erdos_dist/erdos_dist.html).
In this talk we show  g(n) = \Omega(n^{2/3}). (Result originally due to Chung (1984) , but we give a proof due to Szeleky (1993) .)
The tools used to prove this result have been further improved to obtain g(n) = \Omega(n^{4/5}) which we might do in a later talk.

The Erdos Distance Problem: the \Omega(n^{4/5}) result.

William Gasarch

ABSTRACT

Given n points in the plane, how many different distances are we GUARANTEED? If they are all on a line, equal distance apart, then we get n-1 = Theta(n) distances.
If they are all on a line, but the distances keep doubling, then we get (n choose 2) = Theta(n^2) distances.
If they are in a uniform sqrt(n) x sqrt(n) grid, then we get O(n/sqrt(\log n)) distances (this is not obvious).
We are concerned with what are we GUARANTEED?
Last week we prove a result (first proven by Chung, though our proof was by Szekely) that \Omega(n^{2/3}) different distances are guaranteed.
This week we use many of the same tools to show that \Omega(n^{4/5}) different distances are guaranteed.
For my writeup AND links to every paper in this field, see http://www.cs.umd.edu/~gasarch/erdos_dist/

Finding a maximum  density subgraph

Barna Saha

ABSTRACT

Given a graph G=(V,E), we want to find an induced subgraph G'=(V',E') of G, which has the maximum density E'/V'. We will show how this problem can be effectively solved in polynomial time, by reducing it to a series of max-flow computations. We will also discuss a linear time algorithm for the same problem, which obtains a 2 approximation.

References:
A. V. Goldberg, Finding a Maximum Density Subgraph'',
Technical Report: CSD-84-171, 1984.
Moses Charikar: Greedy approximation algorithms for finding dense
components in a graph. APPROX 2000

Sum of Subtree Clustering

ABSTRACT

A hierarchical clustering of a finite set of points is a family F of subsets of the points, such that any two subsets in the family are either disjoint or related by containment, and such that the family is maximal with respect to this property. We formulate the problem of hierarchical clustering minimizing the sum of spanning tree lengths  of the clusters, for general metric spaces, show that it is NP-complete, and provide a constant factor approximation algorithm for the problem.

Reference: D. Eppstein, "Squarepants in a tree: sum of subtree clustering and hyperbolic pants decomposition", SODA '07:

Set Cover and Maximum Coverage with Group Budgets

Peter Fontana

In this talk I will present the results of: C. Chekuri and A. Kumar.  "Maximum Coverage Problem with Group Budget Constraints and Applications." "Proceedings of APPROX", Springer LNCS, 72-83, 2004 Available online at http://www.cs.uiuc.edu/homes/chekuri/pub.html
I will describe approximation algorithms for generalizations of the Set Cover Problem and the Maximum Coverage Problems.  The primary generalization I will cover is when the sets are grouped into (not-necessarily disjoint) groups with the constraint that the algorithm can only pick so many sets k_ in each group G_ (this number can be different for each group).  This generalization is called "with Group Budgets."

Slides

Embedding to Tree Metrics

Minkyoung Cho

I will present the results of: J. Fakcharoenphol, S. Rao, and K. Talwar, "A tight bound on approximating arbitrary metrics by tree metrics", STOC '03.
(http://portal.acm.org/citation.cfm?id=780608)
In the paper, they show that any n point metric space can be embedded into a distribution over dominating tree metrics such that the expected stretch of
any edge is O(log n). (Note that the result is tight since there exist metric spaces where any tree embedding must have distortion Ω(log n)-distortion.)

A proof of the Gilbert-Pollack's conjecture on the Steiner ratio

I will present Du and Hwang's proof of the Gilbert-Pollack conjecture: Given a set of points P on the plane, let SMT(P) be the cost of the Minimum Steiner Tree for P and MST(P) be the cost of the Minimum Spanning Tree for P. Then inf_P SMT(P)/MST(P) = sqrt(3)/2. For reference I will use the survey "The state of art on Steiner ratio problems" by Du and Hwang in Computing in Euclidean Geometry.

William Gasarch

The following problem is thought to be hard: Given n, return the factors of n. We present an algorithm that is easy to implement, does not require too much knowledge of Number Theory to understand, and seems to do well in practice.  It seems hard to analyze rigorouly; indeed, the runtime is still unknown. (The algorithm in this form is due to Carl Pomerance; however, parts of it were already known.)
The method is called the Quadratic Sieve. It begins with the following observation: If you want to factor 8051 you note that  8051 = 8100-49 = 90^2 - 7^2 = (90+7)(90-7). SO, given n, you need to find two squares, a^2 and b^2 such that n= a^2-b^2 = (a-b)(a+b) This seems hard, though the following is just as good: Given n, find two squares a^2 and b^2 such that a^2-b^2 \equiv 0 mod n then we know there exists a k (do not know what it is) kn= a^2-b^2 = (a-b)(a+b) so GCD(a-b,n) or GCD(a+b,n) might be a nontrivial factor of n. How to find such an a,b? Come to the talk to find out!

Covering Minimum Spanning Trees of Random Subgraphs

Thomas DuBois

Given a graph G on which nodes can fail independently, and a probability p of each node surviving we show how to find a "small" edge set that will almost always contain (or cover) the minimum spanning tree (or forrest) of the remaining graph.  We show that a set of size no more than \$O((clog n)/(log 1/(1-p))) will contain the MST with probability at least 1-n^{-c}.  Furthermore we show how to find a set that is only slightly larger but gives the same covering guarantee in poly time.Original work by Michel X. Goemans and Jan Vondrak available from http://www-math.mit.edu/~goemans/mst_rsa_final.pdf

A Learning Theory Approach to Non-Interactive Database Privacy

Aaron Roth

Consider the problem of privacy-preserving data release: How can we  release useful information about a database while preserving the privacy of those whose information is contained within the database?  Previous work has shown that there are classes of queries that cannot  be answered usefully without allowing an adversary to reconstruct  almost the entire database, and so much work on privacy has focused on  interactive mechanisms which can only answer a limited number of  queries. Motivated by learning theory, we revisit non-interactive  database privacy, and circumvent existing lower bounds by considering  usefulness with respect to a concept class. We show that it is  possible to release a dataset that is useful with respect to any  concept class of polynomial VC-dimension, while simultaneously preserving privacy. Our proof, however, does not yield an efficient  algorithm. We give efficient algorithms that privately release  information useful for the class of axis-aligned rectangles of constant dimension, and for halfspaces (according to a relaxed  definition of usefulness). The general question of computational  efficiency is left open.
This is joint work with Avrim Blum and Katrina Ligett, and will appear
in STOC 2008.

Tree Decomposable Models for Efficient Bioinformatics Algorithms

Chunmei Liu

Many bioinformatics problems are computationally intensive yet often require accurate and fast solutions in applications. Such problems present a great  challenge in computation; In this presentation, a tree decomposable graph model is introduced which allows effective modeling of biological systems and efficient
algorithms for bioinformatics problems. In particular, many small parameters carried by these bioinformatics problems can be transformed into small tree width of the graph model. Upon a tree decomposition of the graph model, very fast (e.g., linear-time) dynamic programming algorithms can be achieved.  Throughout this presentation, the example of structure-sequence alignment will be used to illustrate the biopolymer structure modeling and the tree decomposition based algorithm development. Structure-sequence alignment is a critical task in solving a number of structural bioinformatics problems, including non-coding RNA gene finding and protein tertiary structure prediction.
-----------
Dr. Chunmei Liu is Assistant Professor in the Department of Systems and Computer Science at Howard University. She received her Ph.D. in 2006 from the University of Georgia. Her research interests include Algorithms, Graph Theory, and Computational Biology.

Proofs that Really Count: The Art of Combinatorial Proof

We will present elegant proofs of several identities on numbers that arise frequently in Mathematics such as Binomial Coefficients and Fibonacci Numbers. The proofs use elementary counting techniques as opposed to more traditional methods of proof.

Complete Fairness in Secure Two-Party Computation

Dov Gordon

In the setting of secure two-party computation, two mutually distrusting parties wish to compute some function of their inputs while preserving, to the extent possible, various security properties such as privacy, correctness, and more. One desirable property is \emph{fairness}, which guarantees that if either party receives its output, then the other party does too. Cleve (STOC~1986) showed that complete fairness cannot be achieved \emph{in general} in the two-party setting; specifically, he showed (essentially) that it is impossible to compute Boolean XOR with complete fairness. Since his work, the accepted folklore has been that \emph{nothing} non-trivial can be computed with complete fairness, and the question of complete fairness in secure two-party computation has been treated as closed since the late '80s. In this paper, we demonstrate that this widely held folklore belief is \emph{false} by showing completely-fair secure protocols for various non-trivial two-party functions including Boolean AND/OR as well as Yao's millionaires' problem''. Surprisingly, we show that it is even possible to construct completely-fair protocols for certain functions containing an embedded XOR'', although in this case we also prove a lower bound showing that a super-logarithmic number of rounds are necessary. Our results demonstrate that the question of completely-fair secure computation without an honest majority is far from closed.

http://www.cs.umd.edu/areas/Theory/ / Webmaster: Azarakhsh Malekian