
Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System.

SIGIR, pp.55-64, (2016)


Abstract

To establish an automatic conversation system between humans and computers is regarded as one of the most hardcore problems in computer science, which involves interdisciplinary techniques in information retrieval, natural language processing, artificial intelligence, etc. The challenges lie in how to respond so as to maintain a relevant ...

Introduction
  • A virtual assistant and/or chat companion system for open domains with adequate artificial intelligence has long seemed elusive and, for a long time, might have existed only in sci-fi movies.
  • The goal of creating an automatic human-computer conversation system, as the personal assistant or chat companion, is no longer an illusion far away.
  • The timing is likely right to build data-driven, open-domain conversation systems between humans and computers.
  • In domain-specific systems, the conversational inputs are restricted and predictable; compared with open-domain systems, it is easier to design the logic, create the rules, prepare the data and construct the candidate replies to handle the particular task [23].
  • The underlying design philosophy of such systems is nearly impossible to generalize to the open domain.
Highlights
  • Our system outperforms standard and state-of-the-art baselines on a variety of evaluation metrics: p@1, mean average precision (MAP), normalized discounted cumulative gain (nDCG) and Mean Reciprocal Rank (MRR)
  • We propose to establish an automatic conversation system between humans and computers
  • Given a human-issued message as the query, our proposed system will return the corresponding responses based on a deep learning-to-respond schema
  • There are 3 major contributions in this work: 1) we propose a contextual query reformulation framework with ranking fusions for the conversation task; 2) we integrate multiple dimensions of ranking evidence, i.e., queries, postings, replies and contexts; 3) we establish a deep neural network architecture built on the above strategies and components
  • Our method consistently and significantly outperforms the alternative baselines in terms of p@1, MAP, nDCG, and MRR
Methods
  • Experiments and evaluation: the authors evaluate the model on the conversation task against a series of baselines, using a huge conversation resource.
  • The authors constructed a dataset of 1,606,583 samples to train the deep neural networks, with a further 357,018 samples for validation and 11,097 for testing.
  • The dataset used for learning must not overlap with the database used for retrieval, so that the evaluation strictly complies with the machine learning regime.
  • For each training and validation sample, the authors randomly chose a reply as a negative sample.
  • The authors hired workers on a crowdsourcing platform to judge the appropriateness of 30 candidate replies retrieved for each query.
  • Each sample was judged by 7 annotators, and the final label was decided by majority voting on the appropriateness of the response given the query and contexts: “1” denotes an appropriate response and “0” an inappropriate one.
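
The sampling and labelling steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `add_negative_samples` and `majority_label`, the tuple layout, and the decision to redraw when the random reply happens to equal the positive one are all hypothetical choices made for the example.

```python
import random


def add_negative_samples(pairs, reply_pool, rng):
    """Pair every (query, positive_reply) with one randomly chosen negative reply.

    Assumption: reply_pool holds at least two distinct replies, so a reply
    different from the positive one can always be drawn.
    """
    samples = []
    for query, positive in pairs:
        negative = rng.choice(reply_pool)
        while negative == positive:  # assumed behaviour: redraw on collision
            negative = rng.choice(reply_pool)
        samples.append((query, positive, 1))  # label 1: observed reply
        samples.append((query, negative, 0))  # label 0: random negative
    return samples


def majority_label(votes):
    """Aggregate binary annotator votes (1 appropriate, 0 inappropriate).

    With an odd number of annotators, such as 7, no tie can occur.
    """
    return 1 if 2 * sum(votes) > len(votes) else 0


# Hypothetical toy data, for illustration only.
rng = random.Random(0)
training = add_negative_samples(
    [("how are you", "fine, thanks"), ("any plans today", "going hiking")],
    ["fine, thanks", "going hiking", "see you later"],
    rng,
)
label = majority_label([1, 1, 0, 1, 0, 1, 0])
```

Seeding the random generator keeps the negative samples reproducible across runs, which is convenient when comparing models on the same training set.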
Results
  • Given the ranking lists for test queries, the authors evaluated performance in terms of the following metrics: precision@1 (p@1), mean average precision (MAP) [31, 43], and normalized discounted cumulative gain (nDCG) [8, 41].
  • The authors evaluated the top-k ranking lists for the test queries using nDCG and MAP, which test the potential of a system to provide more than one appropriate response as candidates.
  • $\mathrm{nDCG@}k = \frac{1}{|T|} \sum_{q \in T} \frac{1}{Z} \sum_{i=1}^{k} \frac{2^{r_i} - 1}{\log(1 + i)}$, where T indicates the testing query set, k denotes the top-k position in the ranking list, and Z is a normalization factor obtained from a perfect ranking. r_i is the relevance score for the i-th candidate reply in the ranking list (i.e., 1: appropriate, 0: inappropriate).
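
As a rough sketch of how the ranking metrics named in this summary can be computed, the Python below takes each ranking as a list of binary relevance scores r_i in rank order. Several details are assumptions rather than facts from the paper: the gain term 2^r − 1 (the common nDCG form, which equals r for binary labels), the natural logarithm (the base cancels in the nDCG ratio), average precision normalized by the number of appropriate replies present in the list, and all function names.

```python
import math


def precision_at_1(ranked):
    """1.0 if the top-ranked reply is appropriate, else 0.0."""
    return float(ranked[0]) if ranked else 0.0


def average_precision(ranked):
    """Mean of precision values at each rank holding an appropriate reply."""
    hits, total = 0, 0.0
    for i, r in enumerate(ranked, start=1):
        if r:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def reciprocal_rank(ranked):
    """1 / rank of the first appropriate reply, or 0.0 if there is none."""
    for i, r in enumerate(ranked, start=1):
        if r:
            return 1.0 / i
    return 0.0


def dcg_at_k(ranked, k):
    """Discounted cumulative gain over the top-k positions."""
    return sum((2 ** r - 1) / math.log(1 + i)
               for i, r in enumerate(ranked[:k], start=1))


def ndcg_at_k(ranked, k):
    """DCG divided by Z, the DCG of a perfect ordering of the same list."""
    z = dcg_at_k(sorted(ranked, reverse=True), k)
    return dcg_at_k(ranked, k) / z if z > 0 else 0.0


def mean_over_queries(metric, rankings):
    """Average a per-query metric over the test query set T."""
    return sum(metric(r) for r in rankings) / len(rankings)


# Hypothetical relevance lists for two test queries.
rankings = [[0, 1, 1], [1, 0, 0]]
mrr = mean_over_queries(reciprocal_rank, rankings)
mean_ap = mean_over_queries(average_precision, rankings)
ndcg3 = mean_over_queries(lambda r: ndcg_at_k(r, 3), rankings)
```

Averaging each per-query score over the test set gives MRR, MAP and nDCG@k respectively; p@1 is averaged the same way.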
Conclusion
  • The authors propose to establish an automatic conversation system between humans and computers.
  • There are 3 major contributions in this work: 1) the authors propose a contextual query reformulation framework with ranking fusions for the conversation task; 2) the authors integrate multiple dimensions of ranking evidence, i.e., queries, postings, replies and contexts; 3) the authors establish a deep neural network architecture built on the above strategies and components.
  • The authors compare the proposed DL2R model against several baselines on a series of evaluation metrics.
  • The authors could further incorporate additional features and more conversation-oriented formulations, such as dialogue acts, conversational logic, and discourse structures.
Tables
  • Table1: An example of the original microblog posting and the associated replies. Each posting might have more than one reply, e.g., Reply1 and Reply2. To create our database of conversation data, we separate different replies to the same post, and obtain ⟨post-reply⟩ pairs. We store two Posting-Reply pairs in the conversational dataset, i.e., ⟨Posting-Reply1⟩ and ⟨Posting-Reply2⟩. User accounts are anonymized
  • Table2: Part (I) indicates a real human (denoted by A) - computer (denoted by B) conversation scenario, while Part (II) indicates our proposed task modeling and formulations. A2 is the current user-issued query. We have contexts and reformulated queries as listed. ‘ ’ is the literal concatenation action. Note that the selected response Reply1 is associated with a Posting in the conversational database shown in Table 1
  • Table3: Symbols and annotations for problem formulation
  • Table4: Data statistics. Postings and replies are all unique
  • Table5: Retrieval performance against baselines with our proposed adaptation of contextual reformulation. ‘⋆’ indicates that we accept the improvement hypothesis of DL2R over the best baseline by Wilcoxon test at a significance level of 0.01. Performance of both generative methods and retrieval methods is reported. Generative methods generate one response for each query, so p@1 in fact refers to accuracy and the other metrics are not applicable
  • Table6: Performance evaluations of different contextual query reformulation strategies
  • Table7: Performance evaluations of different components with multiple dimensions of ranking evidence
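
The data construction described in the Table 1 caption, where each reply to a posting is stored as its own ⟨post-reply⟩ pair, could look roughly like the following. It is a minimal sketch: the dictionary layout and the name `to_post_reply_pairs` are assumptions for illustration, not the authors' storage format.

```python
def to_post_reply_pairs(threads):
    """Flatten {posting: [reply, ...]} into a list of (posting, reply) pairs."""
    pairs = []
    for posting, replies in threads.items():
        for reply in replies:
            pairs.append((posting, reply))  # one pair per reply
    return pairs


# Hypothetical thread with two replies to the same posting.
threads = {"Posting": ["Reply1", "Reply2"]}
pairs = to_post_reply_pairs(threads)
```

A posting with two replies therefore contributes two entries to the retrieval database, one per reply.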
Related work
  • 2.1 Conversation Systems

    Early work on conversation systems is generally based on rules or templates and is designed for specific domains [33, 36]. These rule-based approaches require little or no training data, but they demand considerable manual effort to build the model or to handcraft rules, which is usually very costly. Conversation structure and status tracking are more feasible to learn and infer in vertical domains [44]. However, the coverage of such systems is far from satisfactory. Later, researchers began to pay more attention to automatic conversation systems in open domains [31, 6].

    Moving from specific domains to the open domain substantially increases the amount of data needed to build a conversation system. As information retrieval techniques develop rapidly, researchers have obtained promising results in (deep) question answering systems. An alternative approach is therefore to build a conversation system on a knowledge base consisting of question-answer pairs. Leuski et al. build systems that select the most suitable response to the current message from the question-answer pairs using a statistical language model from cross-lingual information retrieval [12], but the creation of the knowledge base (i.e., the question-answer pairs) is a major bottleneck [13]. Researchers have proposed augmenting the knowledge base with question-answer pairs derived from plain texts [24, 3]. The number of resource pairs can be expanded to some extent, but it remains relatively small, and performance is not very stable.
Funding
  • This work is supported by the National Basic Research Program of China (No. 2014CB340505).
References
  • Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, 2009.
  • F. Bessho, T. Harada, and Y. Kuniyoshi. Dialog system using real-time crowdsourcing and Twitter large-scale corpus. In SIGDIAL, pages 227–231, 2012.
  • G. Cong, L. Wang, C.-Y. Lin, Y.-I. Song, and Y. Sun. Finding question-answer pairs from online forums. In SIGIR, pages 467–474.
  • A. Graves, A.-r. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Proc. Acoustics, Speech and Signal Processing, pages 6645–6649, 2013.
  • H. He, K. Gimpel, and J. Lin. Multi-perspective sentence similarity modeling with convolutional neural networks. In EMNLP, pages 1576–1586, 2015.
  • R. Higashinaka, K. Imamura, T. Meguro, C. Miyazaki, N. Kobayashi, H. Sugiyama, T. Hirano, T. Makino, and Y. Matsuo. Towards an open domain conversational system fully based on natural language processing. In COLING, 2014.
  • B. Hu, Z. Lu, H. Li, and Q. Chen. Convolutional neural network architectures for matching natural language sentences. In NIPS, pages 2042–2050, 2014.
  • K. Järvelin and J. Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst., 20(4):422–446, 2002.
  • Z. Ji, Z. Lu, and H. Li. An information retrieval approach to short text conversation. CoRR, abs/1408.6988, 2014.
  • N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188, 2014.
  • C.-J. Lee, Q. Ai, W. B. Croft, and D. Sheldon. An optimization framework for merging multiple result lists. In CIKM ’15, pages 303–312, 2015.
  • A. Leuski, R. Patel, D. Traum, and B. Kennedy. Building effective question answering characters. In SIGDIAL, pages 18–27, 2009.
  • A. Leuski and D. Traum. NPCEditor: Creating virtual human dialogue using information retrieval techniques. AI Magazine, 32(2):42–56, 2011.
  • H. Li and J. Xu. Semantic matching in search. Foundations and Trends in Information Retrieval, 8:89, 2014.
  • J. Li, M. Galley, C. Brockett, J. Gao, and B. Dolan. A diversity-promoting objective function for neural conversation models. arXiv preprint arXiv:1510.03055, 2015.
  • X. Li, L. Mou, R. Yan, and M. Zhang. Stalematebreaker: A proactive content-introducing approach to automatic human-computer conversation. In IJCAI, 2016.
  • Z. Lu and H. Li. A deep architecture for matching short texts. In NIPS, pages 1367–1375, 2013.
  • C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
  • T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv:1301.3781, 2013.
  • L. Mou, G. Li, L. Zhang, T. Wang, and Z. Jin. Convolutional neural networks over tree structures for programming language processing. In AAAI, pages 1287–1292, 2016.
  • L. Mou, H. Peng, G. Li, Y. Xu, L. Zhang, and Z. Jin. Discriminative neural sentence modeling by tree-based convolution. In EMNLP, pages 2315–2325, 2015.
  • L. Mou, M. Rui, G. Li, Y. Xu, L. Zhang, R. Yan, and Z. Jin. Recognizing entailment and contradiction by tree-based convolution. arXiv preprint arXiv:1512.08422, 2015.
  • M. Nakano, N. Miyazaki, N. Yasuda, A. Sugiyama, J.-i. Hirasawa, K. Dohsaka, and K. Aikawa. WIT: A toolkit for building robust and real-time spoken dialogue systems. In SIGDIAL, pages 150–159.
  • E. Nouri, R. Artstein, A. Leuski, and D. R. Traum. Augmenting conversational characters with generated question-answer pairs. In AAAI Fall Symposium: Question Generation, 2011.
  • H. Palangi, L. Deng, Y. Shen, J. Gao, X. He, J. Chen, X. Song, and R. Ward. Deep sentence embedding using the long short term memory network: Analysis and application to information retrieval. arXiv preprint arXiv:1502.06922, 2015.
  • A. Ritter, C. Cherry, and W. B. Dolan. Data-driven response generation in social media. In EMNLP, pages 583–593, 2011.
  • T. Rocktäschel, E. Grefenstette, K. M. Hermann, T. Kocisky, and P. Blunsom. Reasoning about entailment with neural attention. arXiv preprint arXiv:1509.06664, 2015.
  • A. Severyn and A. Moschitti. Learning to rank short text pairs with convolutional deep neural networks. In SIGIR ’15, pages 373–382.
  • L. Shang, Z. Lu, and H. Li. Neural responding machine for short-text conversation. In ACL-IJCNLP, pages 1577–1586, 2015.
  • R. Socher, J. Pennington, E. H. Huang, A. Y. Ng, and C. D. Manning. Semi-supervised recursive autoencoders for predicting sentiment distributions. In EMNLP, pages 151–161, 2011.
  • H. Sugiyama, T. Meguro, R. Higashinaka, and Y. Minami. Open-domain utterance generation for conversational dialogue systems using Web-scale dependency structures. In SIGDIAL, pages 334–338, 2013.
  • I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, pages 3104–3112, 2014.
  • M. A. Walker, R. Passonneau, and J. E. Boland. Quantitative and qualitative evaluation of darpa communicator spoken dialogue systems. In ACL, pages 515–522, 2001.
  • R. S. Wallace. The Anatomy of ALICE. Springer, 2009.
  • H. Wang, Z. Lu, H. Li, and E. Chen. A dataset for research on short-text conversations. In EMNLP, pages 935–945, 2013.
  • J. Williams, A. Raux, D. Ramachandran, and A. Black. The dialog state tracking challenge. In SIGDIAL, pages 404–413, 2013.
  • Y. Xu, R. Jia, L. Mou, G. Li, Y. Chen, Y. Lu, and Z. Jin. Improved relation classification by deep recurrent neural networks with data augmentation. arXiv preprint arXiv:1601.03651, 2016.
  • Y. Xu, L. Mou, G. Li, Y. Chen, H. Peng, and Z. Jin. Classifying relations via long short term memory networks along shortest dependency paths. In EMNLP, 2015.
  • R. Yan. i, poet: Automatic poetry composition through recurrent neural networks with iterative polishing schema. In IJCAI, 2016.
  • R. Yan, M. Lapata, and X. Li. Tweet recommendation with graph co-ranking. In ACL, pages 516–525, 2012.
  • R. Yan, C.-T. Li, H.-P. Hsieh, P. Hu, X. Hu, and T. He. Socialized language model smoothing via bi-directional influence propagation on social networks. In WWW ’16, pages 1395–1405, 2016.
  • R. Yan, X. Wan, J. Otterbacher, L. Kong, X. Li, and Y. Zhang. Evolutionary timeline summarization: A balanced optimization framework via iterative substitution. In SIGIR ’11, pages 745–754, 2011.
  • R. Yan, I. E. Yen, C.-T. Li, S. Zhao, and X. Hu. Tackling the achilles heel of social networks: Influence propagation based language model smoothing. In WWW ’15, pages 1318–1328, 2015.
  • K. Zhai and D. J. Williams. Discovering latent structure in task-oriented dialogues. In ACL, pages 36–46, 2014.
  • B. Zhang, J. Su, D. Xiong, Y. Lu, H. Duan, and J. Yao. Shallow convolutional neural network for implicit discourse relation recognition. In EMNLP, pages 2230–2235, 2015.