Combinatorics and Algorithms for Real Problems


List of Projects

Quantum Computing: Practical Synthesis of Quantum Signal Processing Circuits

Andrew Childs

Prerequisite: Linear Algebra, Complex Analysis, Software Skills. No background in Quantum Mechanics is required.

The quantum signal processing framework of Low and Chuang leads to quantum algorithms for simulating quantum dynamics that are both theoretically optimal and fast in practice. However, specifying the parameters of these algorithms requires a classical computation that suffers from numerical instability, making it difficult to synthesize the required quantum circuits. A recent reformulation of Haah (see below for pointer to the paper) gives an alternative approach to this challenge that appears to be both simpler and more numerically tractable. The proposed project entails implementing Haah's algorithm in software, with the goal of synthesizing circuits for simulations that are large enough to be classically hard. If time permits, the project can also investigate potential applications of Haah's approach to other quantum algorithms.

Paper by Haah

Reconstructing Private Data via Side-Channel Attacks

Dana Dachman-Soled

Prerequisite: C programming, Gnu debugger (gdb), Cache architectures. Helpful but not required: experience analyzing signals.

When two processes run on a single CPU, they share hardware resources such as a cache. A spy process can track cache usage to reveal information about the victim process. This tracking is done via, e.g., a spy virtual machine (VM) sharing physical resources with a victim VM in the cloud. Prior work on exploiting cache side-channels was mainly limited to recovering secret keys of specific cryptographic algorithms. This project, in contrast, takes a global view and focuses on recovering general data during typical data processing, thus exposing vulnerabilities across a wider range of practical settings. We shall develop attacks that (1) monitor a shared cache to learn information about the access patterns of data-processing software; (2) use this information to reconstruct private data; and (3) develop countermeasures against such attacks. We will launch these attacks on open source data processing software.

Diversity in Matching Markets

John Dickerson

Prerequisites: Basic understanding of supervised machine learning, some experience with a programming language

How should a firm allocate its limited interviewing resources to select the optimal cohort of new employees from a large set of job applicants? How should that firm allocate cheap but noisy resume screenings and expensive but in-depth in-person interviews? We view this problem through the lens of combinatorial pure exploration (CPE) in the multi-armed bandit setting, where a central learning agent performs costly exploration of a set of arms before selecting a final subset with some combinatorial structure. In our case, that combinatorial structure will incorporate elements of diversity, where including multiple elements of the same type in the final set results in diminishing marginal gain. We will apply our algorithms to real data from university admissions and the hiring processes from major tech firms.

Using Generative Adversarial Networks to Generate Fake but Believable Images

Soheil Feizi

Prerequisite: Probability, Linear Algebra, Python programming preferred but not required.

Learning a probability model from data is a fundamental problem in machine learning and statistics. A classical approach to this problem is to fit (approximately) an explicit probability model to the training data via a maximum likelihood estimation. Building off the success of deep learning, however, there has been another approach to this problem using Generative Adversarial Networks (GANs). GANs view to this problem as a game between two sets of functions: a generator whose goal is to generate fake samples that are close to the real data training samples and a discriminator whose goal is to distinguish between the real and fake samples. In this project, we aim to train some of the state-of-the-art GAN architectures on image datasets and produce fake but believable images.

Unsupervised Machine Learning in High Dimensions

Furong Huang

Prerequisite: Linear Algebra, Probability. Exposure to ML preferred but not required.

Learning expressive and yet concise representation of complex high-dimensional data is the key for successful unsupervised learning, where no labeled information is provided. Although the data lies in a complex high-dimensional space, there is oftentimes an underlying low-dimensional structure that concisely describes the data generating process. Therefore the problem is to uncover the hidden structure.

We will use linear latent variable models and spectral methods such as matrix/tensor decomposition on this problem. We will learn these techniques as needed.

Muffin Mathematics: Nobody Wants a Small Piece

William Gasarch

Prerequisite: Software Skills and a sense of Fun!

You have 5 muffins to distribute to 3 students. Every students gets 53. You could divide each muffin (13,13,13) and then give every student 5 13-sized pieces. BAD! Students get pieces of size 13 which is small! And nobody wants a small piece! Can you come up with an allocation scheme where the smallest piece is BIGGER than 1/3. (Hint: you can! See the link below for the answer… sung)

General Problem: You have m muffins to distribute to s students. Every students gets 5/3. Find the way to do this that maximizes the min piece. Prove its the best you can do!

There are many techniques known for this problem, though very little has been proven. This project has several aspects:

1) write a package that uses known and new techniques that will, given m,s, output the best way to allocate them AND a proof that its the best.

2) PROVE theorems about our techniques.

3) Use ML techniques to generate more results.

WARNING: Bill Gasarch is REALLY INTO this and is the worlds leading authority on this problem. Here is the proof:

You Tube Video!

Cryptanalysis for Lattice-Based Cryptography

Jonathan Katz

Prerequisites: Mathematical maturity and programming ability in C/C++. Knowledge of cryptography is useful but not essential.

Next-generation cryptosystems may be based on newer assumptions related to the hardness of solving certain problems on lattices. This project will explore implementations of known algorithms from the literature for solving these problems in an effort to understand their concrete hardness.