I'm particularly interested in offline RL, where we have a dataset of previously collected experience and want to train a policy solely on that dataset, without further interaction with the environment. This is a fundamental building block of RL and has many real-world applications, so I think it is an important problem to understand well.
Previously, I was a research scientist on the Amazon Speech team in Boston, where I designed deep neural network acoustic models for small-footprint keyword spotting. Before joining Amazon, I was a visiting Postdoctoral Research Fellow in the Price lab at the Harvard School of Public Health, where I worked on methods for genetic risk prediction and association testing in genome-wide association studies (GWAS) with related individuals.