Text Matching With Monte Carlo Tree Search

INFORMATION RETRIEVAL, CCIR 2018 (2018)

Abstract
This paper proposes a novel reinforcement learning based model for text matching, referred to as MM-Match. Inspired by the success and methodology of AlphaGo Zero, MM-Match formalizes text matching as a Markov decision process (MDP) enhanced with Monte Carlo tree search (MCTS), where the time steps correspond to positions in a match matrix, traversed from the top-left corner to the bottom-right corner, and each action corresponds to a movement in one direction. Two long short-term memory (LSTM) networks summarize the words on the current path, and a third LSTM summarizes the future words. Based on the LSTM outputs, the model produces a policy, which guides the direction of movement, and a value, which predicts the correctness of the whole sentence. The policy and value are then strengthened with MCTS, which takes the raw policy and value as inputs, simulates and evaluates the possible direction assignments at the subsequent positions, and outputs a better search policy for assigning directions. A reinforcement learning algorithm is proposed to train the model parameters. This work is a novel application of an MDP model to the text matching task. MM-Match can accurately predict the directions thanks to the exploratory decision-making mechanism introduced by MCTS. Experimental results show that MM-Match performs comparably to classical text matching models, including MatchPyramid and MatchSRNN.
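The MDP described in the abstract can be illustrated with a minimal sketch. The abstract does not list the exact action set or the tree-search selection rule, so everything named here is an assumption made for illustration: the three moves (right, down, diagonal), the helper names, and the use of the PUCT selection rule known from AlphaGo Zero. It is not the authors' implementation.

```python
import math

# Hypothetical action set: the abstract only says "a movement in a direction".
ACTIONS = {"right": (0, 1), "down": (1, 0), "diagonal": (1, 1)}


def legal_actions(pos, shape):
    """Return the actions that keep the path inside the match matrix."""
    (i, j), (rows, cols) = pos, shape
    return [name for name, (di, dj) in ACTIONS.items()
            if i + di < rows and j + dj < cols]


def step(pos, action):
    """Apply one action and return the new (row, col) position."""
    di, dj = ACTIONS[action]
    return (pos[0] + di, pos[1] + dj)


def is_terminal(pos, shape):
    """An episode ends when the path reaches the bottom-right cell."""
    return pos == (shape[0] - 1, shape[1] - 1)


def puct_select(priors, visit_counts, q_values, c_puct=1.0):
    """Pick the action maximizing Q + U (PUCT rule, assumed here)."""
    total = sum(visit_counts.values())

    def score(a):
        # U favors actions with a high prior and few visits so far.
        u = c_puct * priors[a] * math.sqrt(total) / (1 + visit_counts[a])
        return q_values[a] + u

    return max(priors, key=score)
```

In this sketch, `priors` stands in for the raw policy produced from the LSTM outputs and `q_values` for the value estimates accumulated during simulation. A path through a 2 x 3 match matrix starts at `(0, 0)` and ends once `is_terminal` returns `True` at `(1, 2)`.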
Keywords
Monte Carlo tree search, Markov decision process, Text matching