Combinatorics and Algorithms for Real Problems

List of Projects

Quantum Computing

Andrew Childs

(NOTE- Previous exposure to quantum mechanics is not required, but students should have a strong background in linear algebra. Additional mathematical background for some of the projects is indicated in their descriptions.)

Quantum bits, or qubits, can store information in a linear combination of basis states corresponding to classical bit strings. By representing and manipulating information using quantum states, some information-processing tasks can be accomplished much more efficiently than with classical information. Several possible projects in this area are available with researchers in the Joint Center for Quantum Information and Computer Science (QuICS, These projects (linked to below) will explore questions related to extracting information from quantum states, the relationship between a multipartite quantum state and its subsystems, the number of qubits that must be exchanged to perform a distributed computation, and protocols for compressing quantum messages when the receiver has partial information about the message.

Quantum Projects. Click on it! You know you want to!

Cryptography: Security versus Efficiency

Dana Dachman-Soled Uzi Vishkin

Several developments in computer systems during the last several decades have caused performance programmers to increasingly work closely with system architecture features to improve efficiency in critical applications. But, is there an inherent tension between programming for performance and security? In this project, we first identify certain facets of this tension and then explore some of them. We will then consider feasibility of turning this tension into key-recovery attacks on commonly used cryptosystems such as AES and RSA.

The Mathematics Needed for Kidney Exchange Programs

John Dickerson

Consider the following scenario:

Alice needs a kidney and Bob is happy to give her his. But Bob's is incompatible! Boo!

Carol needs a kidney and Dan is happy to give her his. But Dan's is incompatible! Boo!

But if Bob's kidney is fine for Carol and Dan's kidney is fine for Alice. Then they could do a kidney exchange where Bob's kidney goes to Carol and Dan's kidney goes to Alice.

This is a simple scenario, perhaps too simple. But imagine if there is a database of people who need kidney's and people who are willing to be donors (there is!). Then one can imagine exchanges like the one above. They may even be far more complex (the longest ever involved a cycle of length 16).

Several mathematical issues arise from making this work. How to tell how good a kidney is? How to balance the many factors of compatability and the difficulty of transporting kidneys?

See this short document for more details.

Machines that Learn Languages from Interactions

Hal Daume III

Natural language processing systems currently learn mostly from labeled data for a specific task-domain pairs. While this approach is useful, it cannot scale very well. Systems need to be able to learn to improve their performance from naturalistic interaction with users in addition to labeled data. This offers systems the opportunity to test proposed generalizations and receive feedback on their performance. Ideally, this feedback should be acquired naturally, or at a very low cost, through interaction with users.

For instance, machine translation systems trained on large quantities of government documents do very well at translating more government documents, but quite poorly at translating anything else. However, a translation system that is deployed to facilitate conversations between users feedback. The system must balance testing potential generalizations (exploration) with satisfying immediate user needs (exploitation), and must be able to get the most out of user feedback.

This general idea can be explored in many natural language processing applications, eventually leading to fully interactive systems that learn continually throughout their lifetime.

Database Query Processing: Evaluating Probabilistic Formulas Quickly

William Gasarch

We are given a Boolean formula over n Boolean variables (e.g., (x_1 OR x_2) AND x_3). For each variable x_i, we are given a cost c_i to check whether the variable is true or false, and a probability p_i that it is true. We assume that p_i's are independent of each other. The goal is to find the optimal order in which to evaluate the variables to minimize the expected cost to decide whether the function is true or false. Simple cases of this problem are easily solvable, whereas the most general cases are NP-Hard. However, several natural variations of the problem have unknown complexity. The goal of this project would be to make progress on the open variations of this problem. The problem naturally appears in database query processing, sequential testing, and several other practical settings.

Some papers on this topic:


survey of exact algorithms.

Evaluation unweighted linear threshold functions.

Learning With Attribute Costs.

Eval Monoton DNF Formulas.

Approx Algorithms.

Using Game Theory on the Quiz show Jeapordy

William Gasarch

In the final round of the quiz show Jeapardy Alice has a dollars, Bob has b dollars, Carol has c dollars, and they each need to bet some amount on the last question, knowing just the category. What if all players know all players prob of getting the question right? How much should they bet? Variants can also be asked. This is just one of several projects involving probability that we can work on. And they are all fun!

Phase Retrieval in Image Processing

Thomas Goldstein

There are many applications where computers must create images from data. In some cases this amounts to solving simple linear systems (of the type you've seen in linear algebra). But in other cases this involves solving large non-linear systems of equations (which is a hard problem). This is the Phase Retrievel Problem.

Recently, several new approaches to phase retrieval have emerged, some of which have strong theoretical guarantees. However, despite the recent explosion of work in this area, surprisingly little has been done to build efficient, practical implementations of these new methods, and benchmark them against one another.

In this research project, students will learn about phase retrieval problems and the new tools that have emerged to solve them. Students will then implement several new methods and benchmark them against one another using a variety of real and synthetic datasets. If time permits, students will learn to use MPI for distributed computing, and this tool will be used to solve extremely large problem instances in the cloud. This problem has natural applications in data storage and electron microscopy.

Students need to know Linear Algebra. And have good programming skills.

Training Neural Nets to play The Can't Stop Game

Clyde Kruskal

The game Can't Stop depends on a combination of luck and skill. In this game a player rolls dice to get ahead, but can stop. If he doesn't stop in time he may lose all that he has gained that turn. For more on the game, including the rules, see here.

In this project we will train a neural net to play the game and play it well! Probably better than you!

Web Accessibility