About me

I am a research scientist at Google DeepMind working to solve artificial intelligence. My research focus is decision making under uncertainty (a.k.a. reinforcement learning). I want to design autonomous agents that teach themselves to do well in any task. If we can do this, then we will be well on our way to general AI.

I completed my Ph.D. at Stanford University, advised by Benjamin Van Roy. My thesis, Deep Exploration via Randomized Value Functions, won second place in the national Dantzig Dissertation Award. It takes some steps towards a practical RL algorithm that combines efficient generalization and exploration... and I'm still focused on making progress in this area!

Before coming to Stanford I studied maths at Oxford University and worked for J.P. Morgan as a credit derivatives strategist. I spent the summer of 2015 working for Google in Mountain View and, after a great internship in 2016, joined DeepMind full time in London. If you want to know more about what I'm thinking, check out my blog.

Research Highlights

Quick links and catchy taglines

Deep Exploration via Randomized Value Functions

JMLR 2018 (Accepted)

Journal paper that brings together the best research from my PhD, along with related work from our group... my favorite paper!

A Tutorial on Thompson Sampling

Foundations and Trends in Machine Learning 2018

A really nice tutorial on Thompson sampling: what it is, why it works, and when to use it. Includes lots of examples (+ code). The focus is on building intuition rather than getting bogged down in theorems.
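For a quick flavor of the idea, here is a minimal Thompson sampling loop for a Bernoulli bandit. This sketch is mine rather than code from the tutorial, and the arm probabilities and horizon are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
true_probs = np.array([0.3, 0.5, 0.7])   # made-up Bernoulli arms, unknown to the agent
successes = np.ones(3)                   # Beta(1, 1) prior on each arm
failures = np.ones(3)

for t in range(2000):
    # 1. Sample a plausible success rate for each arm from its posterior.
    theta = rng.beta(successes, failures)
    # 2. Act greedily with respect to that sample (this is the whole algorithm).
    arm = int(np.argmax(theta))
    # 3. Observe a reward and update the chosen arm's posterior counts.
    reward = float(rng.random() < true_probs[arm])
    successes[arm] += reward
    failures[arm] += 1.0 - reward
```

Because each arm is chosen exactly as often as the posterior says it might be best, exploration falls out of the sampling automatically: no epsilon schedules, no bonus terms.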

Randomized Prior Functions for Deep Reinforcement Learning

NeurIPS 2018 (Spotlight)

Add a "prior effect" to your bootstrap posterior with one simple trick: add a random function offset to each neural net in your ensemble!
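A minimal sketch of that trick for linear models with a bootstrapped ensemble (my own toy illustration with made-up data, not the paper's code): each member predicts trainable(x) + beta * prior(x), where the prior is sampled once and then frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up regression data standing in for value-function targets.
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

def make_member(input_dim, beta=3.0):
    """One ensemble member: trainable weights plus a fixed random prior."""
    w = np.zeros(input_dim)              # trainable part, starts at zero
    p = rng.normal(size=input_dim)       # prior function, sampled once and frozen

    def predict(x):
        # The member's output is trainable part + scaled fixed prior.
        return x @ w + beta * (x @ p)

    def train(X_fit, y_fit, lr=0.1, steps=500):
        nonlocal w
        for _ in range(steps):
            # Gradient step on squared error of the *combined* prediction,
            # so the trainable part learns to fit the residual left by the prior.
            residual = X_fit @ w + beta * (X_fit @ p) - y_fit
            w -= lr * X_fit.T @ residual / len(y_fit)

    return predict, train

# Train each member on its own bootstrap resample of the data.
ensemble = []
for _ in range(10):
    predict, train = make_member(3)
    idx = rng.integers(0, len(X), size=len(X))
    train(X[idx], y[idx])
    ensemble.append(predict)

# Spread of the members' predictions serves as an uncertainty estimate.
x_new = np.array([1.0, 1.0, 1.0])
samples = [predict(x_new) for predict in ensemble]
```

Disagreement across the members at a new input acts as the posterior spread, and the fixed priors keep that disagreement from collapsing in regions where there is no data.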

Scalable Coordinated Exploration in Concurrent Reinforcement Learning

NeurIPS 2018

If you have a team of agents, rather than just one, you need to coordinate them to explore efficiently. Randomized value functions work well here... if you do it right...

Why is Posterior Sampling Better than Optimism for Reinforcement Learning?

ICML 2017 (full oral), EWRL 2016

Computational results demonstrate that PSRL dramatically outperforms UCRL2. We provide insight into the extent of this performance boost and the phenomenon that drives it.

Deep Exploration via Bootstrapped Deep Q-Networks

NeurIPS 2016

Deep exploration meets deep reinforcement learning: takes the insight from efficient exploration via randomized value functions and attains state-of-the-art results on Atari. Includes some sweet vids.

Generalization and Exploration via Randomized Value Functions

ICML 2016

You can combine efficient exploration and generalization, all without a model-based planning step. Some cool empirical results and also some theory.

Model-based Reinforcement Learning and the Eluder Dimension

NeurIPS 2014

The first general analysis of model-based RL in terms of the dimensionality, rather than the cardinality, of the system. Several new state-of-the-art results, including for linear systems.

Near-optimal Reinforcement Learning in Factored MDPs

NeurIPS 2014 (Spotlight), INFORMS 2014

If the environment is a structured graph (a.k.a. factored MDP), then you can exploit that structure to learn quickly. You can adapt UCB-style approaches for this; posterior sampling gets it for free.

(More) Efficient Reinforcement Learning via Posterior Sampling

NeurIPS 2013, RLDM 2013

You don't need to use loose UCB-style algorithms to get regret bounds for reinforcement learning. Posterior sampling is more efficient in terms of computation and data, and shares similar guarantees.


Past courses

MS&E 145 - Introduction to Financial Analysis - lead instructor

Finance for engineers, aimed at over 75 undergraduate juniors and seniors. Everything from the time value of money to CAPM to elementary option pricing and portfolio optimization. Practical data analysis skills taught through spreadsheets.

MS&E 338 - Reinforcement Learning - assistant instructor

An advanced PhD level course aimed at graduate students looking to engage in research. I managed the class research projects and gave several lectures throughout the course.

Want more?

Hit me up at any of the links below.

Here is a copy of my CV.