My research aims to build tools that help users make sense of large amounts of data. I work at the intersection of information retrieval, natural language processing, and databases, with a focus on large-scale distributed algorithms and infrastructure for data analytics.
    From 2010 to 2012, I spent an extended sabbatical at Twitter working on services designed to surface relevant content to users (search, recommendation, etc.) and on analytics infrastructure to support data science (Hadoop tools, machine learning libraries, etc.). Follow me on Twitter here! I've also worked for Cloudera: in 2009, I was responsible for helping them build out a training and certification program; in fact, I wrote their very first certification exam!