My research centers on the interplay between theory and systems for data platforms. I pursue fast, cheap, scalable and practical solutions with theoretical guarantees. My recent work falls under the emerging theme of SysML and includes:

Fast and cheap AutoML for large-scale training data
Fast and accurate learned cardinality estimator
Fast, compact and updatable learned index
Fast and cheap network embedding for billion-edge graphs