# Combinatorics and Algorithms for Real Problems

## Algorithms and Crypto Projects

### Mentor: Laxman Dhulipala

Prerequisites Algorithms, experience with systems programming e.g., in C/C++, and interest in writing fast and highly scalable code on multicore CPUs (No need to have taken a parallel algorithms course)

Description

The Nearest Neighbor Problem is to design a data structure for a large set S of points in (say) n-dimensional space such that the following problem can be solved quickly: given a point p in n-dim space, find the point in S that is closest to p. The main concerns are that the data structure not take to much space, and that the nearest-neighbor query be fast.

We vary this problem in two ways: (1) rather than search for the nearest neighbor we will settle for an Approximate Nearest Neighbor ANN, (2) the algorithm that finds the ANN is a parallel algorithm.

One application of ANN is that n-dim datasets are becoming omnipresent due to the widespread adoption of deep learning methods which represent complex objects as high-dime vectors called vector embeddings, which can span thousands of dimensions.

The work will involve a mix of algorithm design, writing parallel code in C, and benchmarking implementations on large billion-point datasets.

### Mentors: David Mount, Auguste Gezalyan

Prerequisites Knowledge of discrete structures/introduction to proofs, algorithms, and programming skills.

Description

The field of computational geometry looks at problems in Eucildian Geometry computationally. For example, given n points in the plane, find which two are closest together. But there are other metrics that have been useful in fields such as Genetics, Probability, and Physics (see HERE). In this project we will be examining the Hilbert Metric (see HERE) from the perspective of computational geometry. We will explore questions involving this metric, such as how it behaves in the probability simplex. (see HERE). The project will include contributing to a software package to help us visualize these structures. In this project you will learn computational geometry and combine computational geometry with programming.

### Mentor: Rajeev Barua

Prerequisites Algorithms, knowledge of computer systems and systems programming in C/C, proficiency in network simulation tools and command lines, and a basic understanding of computer hardware and security. Experience with Python is a plus!

Description

Cybersecurity threats continue to evolve, outpacing the capabilities of traditional rule-based Intrusion Detection Systems (IDS). The project aims to investigate the effectiveness of IDS when subjected to advanced red team techniques, simulating stealthy cyberattacks on endpoints. Through this adversarial approach, we seek to identify and understand the shortcomings of current IDS in detecting sophisticated, low-footprint cyberattacks.

Concurrent research in this group is aiming to correlate seemingly benign events into causally related event chains. A rule-based IDS is being developed that then aims to categorize these event chains into malicious attack chains, or benign event chains.

In this project, the red team will aim to defeat the rule-based IDS, to discover its shortcomings. Once identified, we will aim to improve the rule formulations in the rule-based IDS, so that those attacks are thereafter detected, without special cases, using first principles. This process will be continuously done across a wide spectrum of possible attacks, improving the rule-based IDS more and more.

## Machine Learning and AI Projects

### Mentors: Jordan Boyd-Graber

Prerequisites Any one of the following:

1) Experience with calling Pytorch modules in Python (training is a plus), or

2) Building web apps connected to Python backend, or

3) Writing linguistically challenging / information rich examples for humans or computers

Description

New AI models are getting better at answering questions not just from text, but other modalities (image video sound) lag behind. However, these models still fall short. We are not sure where the difficulty currently lies. Here are possible answers in the form of questions:

(a) Can models not attend to the appropriate multimodal inputs?

(b) Is the training data insufficient?

(c) Is the alignment between modalities falling short?

In this project, we'll use existing algorithms on datasets and apply explanation methods to see where they fall short, construct / harvest examples that challenge the methods, and prepare a new challenge dataset for new methods to evaluate multimodal QA.

This will not just help us understand the limits of AI and help us improve systems understanding across modalities but also help people find information even when they cannot hear or see (either because of permanent or situational disability).

### Mentor: Aviva Prins

Prereqisites Algorithms, Machine Learning, Python. Helpful but not required: Game theory, Databases (SQL)

Description

In India, the majority of farmers are classified as small or marginal, and their livelihoods are vulnerable to climate risk. Giving farmers help in planning crops well (e.g., delaying planting by a couple of days) can make a huge difference to their expected income. The overall goal of this project is to build an optimization-based decision support tool that provides crop planning advice to farmers.

Farmer features are very similar to each other, so they will get similar recommendations. However, the wholesale price of produce in India is very sensitive to supply-demand imbalances. If everyone plants the same thing, no one will profit. This summer, we'll be focusing on decision support for multiple farmers.

This work is in conjunction with Kheyti a nonprofit in India. Check out the paper that came out of last year's REU here.

## Quantum Computing Projects

### Mentors: Victor Albert, Nat Tantivasadakarn

Prerequisites Linear Algebra, Quantum Mechanics, Quantum Computing, Quantum Error Correction (Two of the three Quantum Prerequisites may suffice)

Description

Error correction is what ensures that the audio in your phone calls remains sharp, your hard drives do not deteriorate too quickly, and signals can be reliably transmitted to remote satellites.

Over multiple decades, and with the explosion of the information age, an enormous variety of error-correction schemes were developed. Recently, a radically new type of error correction was introduced, one that can protect the quantum information that is stored in a quantum computer or that is communicated over a quantum network.

We created the Error Correction Zoo (see here) to categorize and to organize known classical and quantum error-correction schemes. It is inspired by other similar zoos, including the complexity zoo, the quantum algorithm zoo, the quantum protocol zoo, and the qubit zoo.

In this summer project, students will learn both classical and quantum error correction and contribute entries to the zoo's repository. The project can be scaled. On the easy end, there are tables and other readymade info to enter. On the hard end, the students can learn and write about advanced topics such as topological codes (see topological) and approximate quantum error correction (see approximate).

### TITLE: Quantum Graph Games

Prerequisites Linear Algebra, Graph Theory Basics, Quantum Information Theory Basics

Consider the following classical game with cooperating players Alice and Bob who do not communicate. There are two graphs G and H. A vertex sampler picks two random vertices of G and sends to Alice and to Bob. Alice and Bob respond with vertices and of H. If the edge status in G and in H are the same then Alice and Bob win, else they lose.

It is not hard to show that Alice and Bob can win iff there exists a homomorphism between G and H.

We consider a quantum version of this game. Which meens that Alice and Bob are allowed to share entangled quantum states and use them as part of their strategy to win the game. If the players can win these games we would call the two graphs quantum homomorphic.

Quantum homomorphisms are very well studied in the quantum graph theory literature and leads to quantum graph parameters such as quantum independence and quantum chromatic number (see CNMN+06 and PSS+16 below).

In this project we will look to study other graph properties related to or derived from graph homomorphisms in this quantum setting, such as induced subgraph isomorphism or weak homomorphisms, and try to relate them to other quantum and classic properties of graphs.

CMN+06 Peter J. Cameron, Ashley Montanaro, Michael W. Newman, Simone Severini, and Andreas Winter. On the quantum chromatic number of a graph, 2006. LINK

PSS+16 Vern I. Paulsen, Simone Severini, Daniel Stahlke, Ivan G. Todorov, and Andreas Winter. Estimating quantum chromatic numbers. Journal of Functional Analysis, 270(6):2188–2222, mar 2016. LINK

### Mentor: Murphy Yuezhen Niu

Prerequisies Undergraduate Level Quantum Computing or Quantum Mechanics

Description

Quantum computing has the potential to revolutionize various fields, including cryptography, optimization, and material science. However, the practical implementation of quantum algorithms faces significant challenges, with one of the major obstacles being the presence of time-dependent noise in quantum computers. Understanding and mitigating this noise is crucial for the development of robust and reliable quantum algorithms. This project proposal outlines a research initiative aimed at using language models to learn and analyze time-dependent noise in quantum computers, utilizing a real cloud-accessible quantum computer for experimental validation. The primary objectives of this project are as follows:

Develop a language model-based framework to analyze and learn time-dependent noise patterns in quantum computers. Gather and preprocess quantum computer data, including qubit state evolutions over time. Train the language model on this quantum data to recognize and predict time-dependent noise patterns. Validate the model's predictions through experimental data collected from a real, cloud-accessible quantum computer. Investigate potential applications for mitigating time-dependent noise in quantum algorithms based on the model's insights.