Sign-up: Please use
this sign-up sheet (tab: "sign-up for projects") to help me
keep track of the projects. Please update it as appropriate,
including your names/emails, brief project description (or the
current/tentative version thereof), and project status. Also,
use this sheet to to find partners and to be aware of each
other's topics and progress.
Deliverables & due dates
- Discussion with me regarding the possible project
topics, in person and/or via email.
See below for advice
on choosing a topic and
- Project proposal: a short document explaining the
topic and the proposed approach (due
Oct 31; but the sooner the
- Final report: a short (3-7 pp long) academic-style
paper describing the background and what you've done for the
project. (Soft deadline: Dec 5).
- Final presentation: A short (~15 mins) presentation
during the last classes (Dec 12).
You are encouraged to start the discussion early (and several
of you have talked with me already).
Choosing a topic. The project topic can be
anything directly related to bandits and/or experts. (If you are
unsure whether a given topic is sufficiently related, please
email/talk to me.) Generally, there are three types of projects:
- Reading (literature review): read up on a given
sub-topic. What is are the problem formulations and
how they are motivated? What are the main results and the
crucual techniques to obtain these results? What are the
open questions? What are the papers in this line of work,
and who did what?
- Experiments: pick a sub-topic and several
algorithms for this sub-topic, run experiments with these
algorithms to see which ones are better (for which
"regimes"). You can use simulated data and/or some of the
- Original research: pick an open problem and try
to make a progress on it. Could be theoretical or
experimental (or both), up to you. Such a project is likely
to involve some amount of reading, to figure out what is
known and what is likely to help. It is great if by the end
of the class you make a substantial progress towards a
future publication, but it is totally OK to only have
preliminary results, or no results at all. (After all, good
research is rarely done in a few weeks.)
These three types are not mutually exclusive: an original
research may involve much reading; and reading-
and experimets- type projects may be driven by a
specific research question, and/or may help towards research
projects in the future.
You are very encouraged to work in pairs: this is more
productive, more fun and also a useful collaborative experience
(and less projects for me). Triples are fine, too.
If you are working on a course-related research project with
some collaborators who are not attending this course, you are
encouraged to submit an account of this research project as the
"class project". But please ask me first.
For a reading project,
mention a few papers to start with. For simulations, mention a
few algorithms, a few types of problem instances you are
planning to consider (if doing simulations), and the data
sources you are planning to use (if working with real data). For
an open research project, describe the open problem and how you
plan/hope to go about solving it. While a project may evolve and morph compared to the
proposal, a good proposal usually helps to organize your
It is OK if an original research
project evolves into a reading and/or experiments project and
vice versa. However, please keep me posted. NB:
Sometimes a bigh chunk of the overall work for the project
happens before (and while)
you write the project
proposal. So, don't be alarmed if generating a project proposal
takes much work, this probably means that you'd have less work
to do later.
A page of text should usually suffice, possibly
less. Please use LaTeX, and insert bibliography as appropriate.
I will need a PDF, sent to email@example.com
, with a subject like "[CMSC 858G]
project proposal". Topic suggestions.
[I may modify or elaborate these suggestions over time... ]
- [Reading: bandits and X] A reading project is a
safer bet. One good "template" for a project topic is
"bandits and X", where X is some other topic that you like.
For example, X could be any one of the following: convex
optimization, submodular functions, metric spaces and
graphs, Gaussian processes, MDPs, supervised learning,
stochastic packing problems, crowdsourcing markets, economic
incentives, medical trials. Drop me a line if you are
interested in any of the above (or something else), and I'll
be happy to point you to some papers.
- [IID bandits with initial observations: experiments,
possibly also research]
In IID bandits, some
observations of the arms may be available before the
algorithm starts. What is the best way to incorporate this
"initial info"? For example, one can intepret this info as a
prior (and there are multiple ways to do this!), and run a
Bayesian bandit algorithm (Thompson Sampling or Gittins
Algorithm or smth else). Alternatively, one can run any
"non-Bayesian" algorithm such as UCB or Successive
Elimination and just add the initial samples to the history.
And there may be other ways to do it, too.
- ["Optimism under uncertainty": reading and/or
The technique in UCB1 has been used in
many bandit-like problems. Is it possible to formulate a
general setting/algorithm/proof which would cover many of
these applications? There are some partial results, but no
unified inderstanding, as far as I am aware.
- [A/B testing vs. Explore-First: reading]
testing is a standard technique for randomized
experiments. For IID rewards, it is (essentially) the same
as Explore-First bandit algorithm. So, in which ways does
A/B testing goes beyond Explore-First?
- [Thompson Sampling: reading, possibly also research]
What is known about [the theory of] Thompson Sampling
for bandit-like settings? Where does it fall short compared
to other approaches? You may also try to extend the
techniques from prior work to analyze Thompson Sampling for
some setting(s) not covered in the prior work. (I'll give
specific pointers if you are interested.)
- [Contextual bandits in practice: experimental]
Use state-of-art contextual bandit algorithms on real
data sets. In particular, learn to use contextual bandit
algorithms in Vowpal Wabbit.
- [Adversarial bandits: experimental, possibly reading
Say one would like to run
simulations with several algorithms for adversarial bandits,
to see which of the algorithms works better. Which "regimes"
(types of problem instances) should one consider? Ideally,
we'd have a well-established "library" of problem instances,
so that we could run a new algorithm against this library
and quickly see where it does well and where it does not.
- ["semi-adversarial bandits": reading, possibly also
experiments and/or research]
"super-optimistic" model of IID bandits and
"super-pessimistic" model of adversarial bandits, there is a
spectrum of models in which the rewards can change over
time, but in some limited way. One would like to design
algorithms that start out not knowing how "nice" the problem
instance is, perform well in the more pessimistic models,
and much better if the problem instance is actually "nice".
See my papers in COLT'12 and ICML'14 for an example.
- [Dynamic pricing with limited supply, etc.:
- [slowly changing environments] Dynamic
pricing with limited supply is (mainly) studied for IID
environments, e.g. see my paper in EC'12. Extend it to
slowly changing environment, e.g. as in my papers in
COLT'08 and COLT'11.
- [discretization error] The most general
results proceed by "discretizing" the price space and
then running a K-armed bandit algorithm for limited
supply. However, discretization error is well-understood
only when there is (at most) one product to sell and no
contexts. (See my papers in EC'12, FOCS'13 and COLT'14.)
How to go beyond that?
- A trivial tweak of the basic UCB1 algorithm happens
to "work" in a one specific setting with many globlal
Can same/similar tweak work in other settings? What are
the limitations of this simple approach?
- [Contextual bandits with partially observed
contexts: reading and/or research]
What if in each
round there is a "context", but it is only partially
observed by the algorithm? For example: what if the context
corresponds to a user profile, and the algorithm gradually
learns more about each user over time. Another example: what
if the algorithm receives "advice" from some "experts" that
have access to some of the features of the context that are
not observed by the algorithm.
- ["Free exploration" in a recommendation system:
research, possibly reading.]
Bandit algorithms can
help optimize recommendations in a recommendation system
(think: Netflix for movies, Yelp for restaurants, etc.)
However, some exploration happens "for free", simply because
different people try out different things. How to combine
this "free exploration" with exploration done by a bandit
Final report. The final report should
clearly and concisely describe the project: what was the chosen
topic, why it is interesting, what you've set out to accomplish,
what you've learned/understood/accomplished, and what it means
in a broader context.
The report should resemble a
research paper in terms of style and presentation. Try to make
it accessible to students in this course: explain things
clearly, and try not assume background beyond what was covered
- "Introduction" section should state what is the
topic/problem, why it is interesting, background and [some]
prior work with appropriate citations, what you've set out
to accomplish, and (on a high level) what you've
- "Preliminaries" section is a place to introduce
models and/or notation. Also, if you'll be using tools that
are somewhat "advanced" for this course (e.g., martingales,
linear duality, KL-divergence) mention them here, and
perhaps explain briefly what it is and how you'd be using
- "Body" of the paper presents all the details.
For new results: explain what the results mean, be
as explicit as possible about why they are interesting given the
prior work. Include enough detail: [for theory] precise
theorem statements, and full proofs, or at least proof sketches
that contain the main ideas, [for simulations/experiments]
detailed experimental setup, algorithms used, and what was
If you tried some approach and it didn't work out, write
about it, too! Try to explain why it didn't work and what are
the obstacles; present lessons / conclusions, should you have
For reading projects: what is the goal of your project?
Clearly state and cite the papers you've looked at. Try to state
the main results as precisely and explicitly as possible (but
also try to be succinct, and do use informal descriptions if
needed). Explain what these results mean. What are the main
techniques to achieve them. Ideally, also what are the cool open
problems in the area, and why prior work falls short in solving
Mapping out the relevant work is an imporant part of the
project. In particular, when citing a line of work, cite several
papers, not just one; ideally, the initial paper or two, and a
few of the most important follow-ups. Cite surveys whenever
applicable. DBLP is a good tool to track down correct references
(e.g., do not cite an arxiv version if there is a conference
publication), and also to get BibTeX entries.
The document should be 3-5 pp long, not
including fugures and references, in 11pt font with 1-inch
margins and single spacing. Include full names and emails for
all authors. LaTeX strongly preferred (but do talk to me if this
is a problem). Then the desired formatting can be achieved by
BibTeX is recommended for
Submit PDF file by
email to "firstname.lastname@example.org
with subj "final project report".
Each project will need to make a short
- Aim for 10-15 mins (but let me know if you'd like to
- Whiteboard or slides or none -- up to you.
- For projects with multiple people, it is up to you to
decide who presents; multiple presenters are OK, too.
Please prepare in advance -- this is your chance to tell the
class about the exciting stuff you've done and/or learned about!
Some generic advice:
- Aim to make your presentation as accessible as possible.
Explain what is the topic/problem and why it is exciting.
Transmitting the high-level bits and excitement is more
important in a short talk than going into any details.
- Try to distill crucial ideas and find simple high-level
explanations for them. Include hard details if you can state
them clearly and concisely, in a way accessible to the
class. But if you know that this or that technical statement
is not likely to be understood, then no need to say it.
- It is totally OK to “sweep things under the rug” if it
helps you transmit crucial points.
- It is totally OK to say “this is what we’ve been trying
to do, but we don’t know how to do it”.
Re slides, should you choose to make them:
- slides are useful, among other things, to remind you
what you are supposed to say!
strive to make the slides
clear and non-crowded. Write the main points, but no need to
write full sentences, or to put on the slide everything that
you intend to say.
- If you need to discuss a formula, it is often useful to
put it on a slide and point to it.
- Use large enough font size so that it is visible from
the back row.
- No need to make slides fancy. In particular, bulleted
list with plain text is totally fine!