Improving RNN Performance by Modelling Informative Missingness with Combined Indicators

Applied Sciences-Basel (2019)

Abstract
Daily questionnaires from mobile applications allow large amounts of data to be collected with relative ease. However, such data almost always contain missing values, whether because individual questions go unanswered or because users skip the survey entirely on some days. These missing values need to be addressed before the data can be used for inferential or predictive purposes. Several strategies for dealing with missing data are available, but most are prohibitively computationally intensive for larger models, such as a recurrent neural network (RNN). Perhaps even more importantly, few methods allow for data that are missing not at random (MNAR). Hence, we propose a simple strategy for dealing with missing data in longitudinal surveys from mobile applications, using a long short-term memory (LSTM) network whose input includes a count of the missing values in each survey entry and a lagged response variable. We then propose additional simplifications for padding the days on which a user has skipped the survey entirely. Finally, we compare our strategy with previously suggested methods on a large daily survey with data that are MNAR and conclude that our method worked best, in terms of both prediction accuracy and computational cost.
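The input construction described in the abstract (a per-entry count of missing values plus a lagged response) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `build_inputs`, the constant fill value of 0, the one-day lag, and the representation of a skipped day as an all-NaN row are all assumptions made here for the example.

```python
import numpy as np

def build_inputs(answers, response, fill_value=0.0):
    """Assemble per-day input features for a sequence model (hypothetical helper).

    answers:  (days, questions) array, NaN marks an unanswered question;
              a fully skipped day is assumed to be a row of all NaN.
    response: (days,) array of the target variable, NaN where missing.
    Returns a (days, questions + 2) array:
    filled answers, missing-value count, lagged response.
    """
    answers = np.asarray(answers, dtype=float)
    response = np.asarray(response, dtype=float)

    missing = np.isnan(answers)
    # Combined indicator: number of unanswered questions in each entry.
    n_missing = missing.sum(axis=1, keepdims=True).astype(float)
    # Assumption: missing answers are replaced by a constant.
    filled = np.where(missing, fill_value, answers)

    # Lagged response: the previous day's value; day 0 has no predecessor.
    lagged = np.full_like(response, np.nan)
    lagged[1:] = response[:-1]
    lagged = np.where(np.isnan(lagged), fill_value, lagged)

    return np.hstack([filled, n_missing, lagged[:, None]])

# Three days, three questions; day 2 was skipped entirely.
answers = [[1.0, np.nan, 3.0],
           [np.nan, np.nan, np.nan],
           [2.0, 2.0, np.nan]]
response = [4.0, np.nan, 5.0]
features = build_inputs(answers, response)
```

Each row of `features` could then serve as one time step of the LSTM input. A single count column keeps the input width nearly unchanged, whereas a separate missingness indicator per question would roughly double it.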
Keywords
missing data,recurrent neural networks,predictive models,survey data,mobile health