Modout: Learning to Fuse Face and Gesture Modalities with Stochastic Regularization

International Conference on Automatic Face and Gesture Recognition (2017)

Abstract
Model selection methods based on stochastic regularization, such as Dropout, have been widely used in learning due to their simplicity and effectiveness. The standard Dropout method treats all units, visible or hidden, in the same way, thus ignoring any a priori information related to grouping or structure. Such structure is present in multi-modal learning applications such as affect analysis and gesture recognition, where subsets of units may correspond to individual modalities. In this paper we describe Modout, a model selection method based on stochastic regularization that is particularly useful in the multi-modal setting. Different from previous methods, it is capable of learning whether or when to fuse two modalities in a layer, which is usually considered an architectural hyper-parameter by deep learning researchers and practitioners. Modout is evaluated on one synthetic and two real multi-modal datasets. The results indicate improved performance compared to other stochastic regularization methods. The result on the Montalbano dataset shows that learning a fusion structure with Modout is on par with a state-of-the-art, carefully designed architecture.
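The abstract does not specify the exact masking scheme, so the following is only a minimal sketch of modality-level stochastic masking in the spirit of Modout, written in NumPy with hypothetical names (modality_dropout, p_drop). Unlike the paper's method, which learns the fusion/drop probabilities during training, this sketch uses fixed probabilities purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def modality_dropout(face_feats, gesture_feats, p_drop=0.3, training=True):
    """Illustrative modality-level stochastic masking (Modout-style sketch).

    Instead of dropping individual units as in standard Dropout, an entire
    modality's feature block is zeroed out per example with probability
    p_drop, so downstream layers must cope with either modality alone or
    with both fused. NOTE: fixed probabilities are an assumption here; the
    actual Modout method learns them.
    """
    if not training:
        return np.concatenate([face_feats, gesture_feats], axis=1)

    n = face_feats.shape[0]
    keep_face = rng.random(n) > p_drop       # per-example Bernoulli masks
    keep_gesture = rng.random(n) > p_drop
    # Ensure at least one modality survives for every example.
    both_dropped = ~(keep_face | keep_gesture)
    keep_face = keep_face | both_dropped

    face_masked = face_feats * keep_face[:, None]
    gesture_masked = gesture_feats * keep_gesture[:, None]
    return np.concatenate([face_masked, gesture_masked], axis=1)

# Toy usage: 4 examples, 5-dim face features, 3-dim gesture features.
face = rng.standard_normal((4, 5))
gesture = rng.standard_normal((4, 3))
fused = modality_dropout(face, gesture, p_drop=0.5)
print(fused.shape)  # (4, 8)
```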