The IBM Research|NYU|Columbia Theory Day
                         Friday, May 7, 2010


The Theory Day will be held at the Davis auditorium,
412 Schapiro (CEPSR) building, Columbia University, New York.

                                  Program

9:30   - 10:00    Coffee and bagels

10:00  - 10:55    Dr. Shai Halevi (IBM Research)
                  On Homomorphic Encryption and Secure Computation

10:55  - 11:05    Short break

11:05  - 12:00    Prof. Claire Mathieu (Brown University)
                  Recognizing Well-Parenthesized Expressions in the
Streaming Model

12:00  -  2:00    Lunch break

 2:00  -  2:55    Dr. Alex Andoni (Princeton)
                  Polylogarithmic Approximation to Edit Distance
                  (or, the Asymmetric Query Complexity)

 2:55  -  3:15    Coffee break

 3:15  -  4:10    Dr. John Langford (Yahoo Research)
                  The Foundations of Learning from Exploration Data

For directions, please see
http://www.cs.columbia.edu/theory/directions.html

To subscribe to our mailing list, follow instructions at
http://www.cs.nyu.edu/mailman/listinfo/theory-ny

Organizers:
Yevgeniy Dodis   dodis@cs.nyu.edu
Cliff Stein      stein.cliff@gmail.com
Tal Rabin        talr@us.ibm.com
Baruch Schieber  sbar@us.ibm.com

==================================================================

                          Abstracts

         On Homomorphic Encryption and Secure Computation

                      Dr.  Shai Halevi
                       (IBM Research)

A homomorphic encryption scheme is one that allows computing on
encrypted data, such that the result of the computation can still be
decrypted. I will talk about recent developments in constructing (fully)
homomorphic encryption schemes, and relations with protocols for secure
function evaluation. Specifically, I will:

* Describe the approach that underlies Gentry's recent homomorphic
encryption scheme and all its variants, and illustrate a few different
instantiations of this approach.

* Survey some easy (but often ignored) connections between homomorphic
encryption schemes and protocols for two-party secure function evaluation.

* Show how to extend homomorphic encryption schemes to a "multi hop"
setting: In this setting, several parties sequentially compute on the
same encrypted data, and we want to allow each party to use not only the
original encrypted data but also the results of prior computations.

==================================================================

  Recognizing Well-Parenthesized Expressions in the Streaming Model

                        Dr. Claire Mathieu
                          (Brown University)

Motivated by a concrete problem and with the goal of understanding the sense in
which the complexity of streaming algorithms is related to the complexity of
formal languages, we investigate the problem Dyck(s) of checking matching
parentheses, with s different types of parenthesis.  We present a one-pass
randomized streaming algorithm for Dyck(2) with space O( sqrt{n} log n),
time per letter polylog(n), and one-sided error. We prove that
this one-pass algorithm is optimal, up to a polylog(n) factor, even
when two-sided error is allowed. For the lower bound, we prove a direct sum
result on hard instances by following the "information cost" approach, but
with a few twists.  Indeed, we play a subtle game between public and private
coins. This mixture between public and private coins results from a balancing
act between the direct sum result and a combinatorial lower bound for the base
case.  Surprisingly, the space requirement shrinks drastically if we have
access to the input stream in reverse. We present a two-pass randomized
streaming algorithm for Dyck(2) with space O((log n)^2), time polylog(n)
and one-sided error, where the second pass is in the reverse
direction. Both algorithms can be extended to Dyck(s) since this problem is
reducible to Dyck(2) for a suitable notion of reduction in the streaming
model.

This is joint work with Frederic Magniez and Ashwin Nayak.

==================================================================

         Polylogarithmic Approximation to Edit Distance
              (or, the Asymmetric Query Complexity)

                       Dr. Alex Andoni
                         (Princeton)

We present a near-linear time algorithm that approximates the edit distance
between two strings within a polylogarithmic factor. This is an
exponential improvement over the previously known bounds.

Our new algorithm emerges from the investigation of the edit distance
within a new framework, namely a model of asymmetric queries. Within
this framework, we are able to maintain a parallel view of the
upper and lower bounds, leading to near-tight query complexity bounds.
Our investigation also yields the first
rigorous separation between the edit distance and the Ulam distance
(edit distance on permutations), thus uncovering further hardness
phenomena inherent to the edit distance,
beyond what the previous analyses have revealed.

I will talk about some arising open questions.

Joint work with Robert Krauthgamer and Krzysztof Onak.

==================================================================

             The Foundations of Learning from Exploration Data

                       Dr. John Langford
                         (Yahoo Research)

It is very natural to wish to apply machine learning to the
large amounts of data generated by user interactions, but the process
turns out to be delicate.  For example, if a news story snippet is shown
to a user, and the user clicks on (and reads) it, this event is
fundamentally _not_ equivalent to a multiclass label for several
reasons.  For example, the user might easily have been interested in
other (unseen) stories as well.  To deal with these issues, we propose
using the contextual bandit setting, where on each round information is
used to choose an action, and feedback about just this action is
observed.  This setting has the great virtue that it's tractable, with
algorithms enjoying regret and sample complexity guarantees entirely
comparable to what's possible in a more familiar supervised learning
setting.