At DeepMind I work on robotic manipulation tasks from vision with a data-driven approach. We use human annotations to learn reward functions and we use off-policy reinforcement learning to train strategies from historical data. During my PhD I worked on active learning (AL) for different classification tasks. Given a pool of unlabelled data, the goal of AL is to select which data should be annotated in order to learn the model as quickly as possible. Many AL strategies could be proposed, without any of them clearly outperforming others. It led me to meta-learning approach to active learning, where a strategy is learnt from an ensemble of multiple previous AL problems.