Co-GAT: A Co-Interactive Graph Attention Network for Joint Dialog Act Recognition and Sentiment Classification

Keywords:
graph attention network, mutual interaction information, act recognition, contextual information, bidirectional LSTM
Weibo:
Our framework outperforms the state-of-the-art dialog act recognition and sentiment classification models trained on the two tasks separately, on all metrics across two datasets.

Abstract:

In a dialog system, dialog act recognition and sentiment classification are two correlative tasks to capture speakers' intentions, where dialog act and sentiment indicate the explicit and the implicit intentions, respectively. The dialog context information (contextual information) and the mutual interaction information are two key factors…
Introduction
  • Dialog act recognition (DAR) and sentiment classification (SC) are two correlative tasks for correctly understanding speakers' utterances in a dialog system (Cerisara et al. 2018; Lin, Xu, and Zhang 2020; Qin et al. 2020a).
  • DAR captures speakers' explicit intentions, while SC detects the sentiments in utterances, which helps to capture speakers' implicit intentions.
  • Two key factors contribute to dialog act recognition and sentiment prediction.
  • One is the mutual interaction information across the two tasks, and the other is the contextual information across utterances in a dialog.
Highlights
  • Dialog act recognition (DAR) and sentiment classification (SC) are two correlative tasks for correctly understanding speakers' utterances in a dialog system (Cerisara et al. 2018; Lin, Xu, and Zhang 2020; Qin et al. 2020a).
  • Following Kim and Kim (2018), Cerisara et al. (2018), and Qin et al. (2020a), we adopt macro-averaged Precision, Recall, and F1 for both sentiment classification and dialog act recognition on the DailyDialog dataset, and the average of the dialog-act-specific F1 scores weighted by the prevalence of each dialog act on the Mastodon dataset.
  • The first block of the table reports separate models for the dialog act recognition task, while the second block reports separate models for the sentiment classification task.
  • Our framework outperforms the state-of-the-art dialog act recognition and sentiment classification models trained on the two tasks separately, on all metrics across both datasets.
  • We find that integrating the Co-Interactive Graph Attention Network (Co-GAT) with RoBERTa/XLNet further improves performance, demonstrating that the contributions of the two are complementary.
  • On the Mastodon dataset, our model gains 3.0% and 1.9% improvement in F1 score on the SC and DAR tasks, respectively.
  • We propose a co-interactive graph framework in which a cross-utterance connection and a cross-task connection are constructed and iteratively updated with each other, simultaneously modeling the contextual information and the mutual interaction information in a unified architecture; a minimal sketch follows below.
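
To make the co-interactive update concrete, here is a minimal PyTorch sketch under the standard GAT formulation (Velickovic et al. 2017): each utterance contributes one act node and one sentiment node, and both are refreshed by attending over the union of the cross-utterance and cross-task edges. The class and argument names (MaskedGATLayer, CoInteractiveLayer, cross_utt, cross_task) are illustrative assumptions, not the authors' released implementation.

```python
# A minimal, illustrative sketch of the co-interactive update;
# names and shapes are assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedGATLayer(nn.Module):
    """Single-head graph attention restricted to a boolean adjacency mask."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (n_nodes, dim); adj: (n_nodes, n_nodes) bool, self-loops assumed.
        z = self.proj(h)
        n = z.size(0)
        zi = z.unsqueeze(1).expand(n, n, -1)          # z_i repeated over columns
        zj = z.unsqueeze(0).expand(n, n, -1)          # z_j repeated over rows
        e = F.leaky_relu(self.attn(torch.cat([zi, zj], dim=-1)).squeeze(-1))
        e = e.masked_fill(~adj, float("-inf"))        # attend only along edges
        return F.elu(torch.softmax(e, dim=-1) @ z)

class CoInteractiveLayer(nn.Module):
    """One update step over the union of cross-utterance and cross-task edges."""
    def __init__(self, dim: int):
        super().__init__()
        self.gat = MaskedGATLayer(dim)

    def forward(self, h_act, h_sent, cross_utt, cross_task):
        # h_act, h_sent: (T, dim) act / sentiment node states for T utterances;
        # cross_utt, cross_task: (2T, 2T) bool masks over the stacked nodes.
        h = torch.cat([h_act, h_sent], dim=0)         # (2T, dim)
        h = self.gat(h, cross_utt | cross_task)       # joint attention update
        t = h_act.size(0)
        return h[:t], h[t:]                           # refreshed act / sentiment
```

Stacking the two layers several times corresponds to the iterative mutual update described above.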
Methods
  • Datasets: The authors conduct experiments on two benchmarks, DailyDialog (Li et al. 2017) and Mastodon (Cerisara et al. 2018); DailyDialog contains 11,118 dialogues.
  • Graph edges: Since the authors aim to model the speaker information in a dialog explicitly, vertex i and vertex j are connected if they belong to the same speaker; a worked example follows below.
  • Baselines: HEC (Kumar et al. 2018), CRF-ASN (Chen et al. 2018), CASA (Raheja and Tetreault 2019), DialogueRNN (Majumder et al. 2019), DialogueGCN (Ghosal et al. 2019), JointDAS (Cerisara et al. 2018), IIIM (Kim and Kim 2018), and DCR-Net + Co-Attention (Qin et al. 2020a).
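
As a worked example of this edge rule, the hypothetical helper below builds the same-speaker adjacency mask from a list of speaker ids (the name build_speaker_adjacency and the boolean-tensor representation are assumptions):

```python
import torch

def build_speaker_adjacency(speakers):
    """Connect vertex i and vertex j iff utterances i and j share a speaker.
    Self-loops (i == j) are kept so each node can attend to itself."""
    n = len(speakers)
    adj = torch.zeros(n, n, dtype=torch.bool)
    for i in range(n):
        for j in range(n):
            adj[i, j] = speakers[i] == speakers[j]
    return adj

# A two-party dialog with turns A, B, A, B:
print(build_speaker_adjacency(["A", "B", "A", "B"]).int())
# tensor([[1, 0, 1, 0],
#         [0, 1, 0, 1],
#         [1, 0, 1, 0],
#         [0, 1, 0, 1]])
```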
Results
  • Following Kim and Kim (2018), Cerisara et al. (2018), and Qin et al. (2020a), the authors adopt macro-averaged Precision, Recall, and F1 for both sentiment classification and dialog act recognition on the DailyDialog dataset, and the average of the dialog-act-specific F1 scores weighted by the prevalence of each dialog act on the Mastodon dataset (both metrics are sketched after this list).

    The main comparison results are shown in Table 1.
  • The authors' framework outperforms the state-of-the-art dialog act recognition and sentiment classification models trained on the two tasks separately, on all metrics across both datasets.
  • This shows that the proposed graph interaction model captures the mutual interaction information between the two tasks, which can be effectively utilized to promote the performance of both.
  • The authors find that integrating Co-GAT with RoBERTa/XLNet further improves performance, demonstrating that the contributions of the two are complementary.
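
For reference, both evaluation schemes can be approximated with scikit-learn. Note that pairing sklearn's "weighted" average with the Mastodon protocol is an assumption; the paper's exact label handling (e.g., whether a default label is excluded) is not specified here.

```python
from sklearn.metrics import f1_score, precision_recall_fscore_support

def dailydialog_metrics(y_true, y_pred):
    # Macro average: every label contributes equally, regardless of frequency.
    p, r, f, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return p, r, f

def mastodon_f1(y_true, y_pred):
    # Per-label F1 averaged with weights proportional to label prevalence.
    return f1_score(y_true, y_pred, average="weighted", zero_division=0)
```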
Conclusion
    The authors propose a co-interactive graph framework in which a cross-utterance connection and a cross-task connection are constructed and iteratively updated with each other, simultaneously modeling the contextual information and the mutual interaction information in a unified architecture.
  • Experiments on two datasets show the effectiveness of the proposed model, and the model achieves state-of-the-art performance.
  • The authors analyze the effect of incorporating strong pre-trained models into the joint model and find that the framework remains beneficial when combined with pre-trained models (BERT, RoBERTa, XLNet); a sketch of this combination follows.
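
A minimal sketch of that combination, assuming the pre-trained model simply replaces the utterance encoder and the resulting vectors initialize the graph nodes (the paper's exact integration may differ):

```python
# Sketch: swap the BiLSTM utterance encoder for a pre-trained encoder.
# Assumes HuggingFace transformers; integration details are illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

@torch.no_grad()
def encode_utterances(utterances):
    """Return one vector per utterance, usable as initial graph node states."""
    batch = tokenizer(utterances, padding=True, truncation=True,
                      return_tensors="pt")
    out = encoder(**batch)
    # First-token (<s>) hidden state as the utterance representation.
    return out.last_hidden_state[:, 0]    # (n_utterances, hidden_dim)
```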
Tables
  • Table 1: Comparison of our model with baselines on the Mastodon and DailyDialog datasets. SC represents Sentiment Classification and DAR represents Dialog Act Recognition. Numbers with * indicate that the improvement of our model over all baselines is statistically significant with p < 0.05 under a t-test.
  • Table 2: Ablation study on the Mastodon and DailyDialog test sets.
  • Table 3: Results with the pre-trained models.
Related work
  • Dialog Act Recognition

    Kalchbrenner and Blunsom (2013) propose a hierarchical CNN to model the context information for DAR. Lee and Dernoncourt (2016) propose a model that combines the advantages of CNNs and RNNs and incorporates the previous utterance as context to classify the current one for DAR. Ji, Haffari, and Eisenstein (2016) use a hybrid architecture, combining an RNN language model with a latent variable model. Furthermore, many works (Liu et al. 2017; Kumar et al. 2018; Chen et al. 2018) explore different architectures to better incorporate the context information for DAR. Raheja and Tetreault (2019) propose a context-aware self-attention mechanism for DAR and achieve promising performance.

    Sentiment Classification
Funding
  • This work was supported by the National Key R&D Program of China via grant 2020AAA0106501 and the National Natural Science Foundation of China (NSFC) via grants 61976072 and 61772153.
References
  • Cerisara, C.; Jafaritazehjani, S.; Oluokun, A.; and Le, H. T. 2018. Multi-task dialog act and sentiment recognition on Mastodon. In Proceedings of the 27th International Conference on Computational Linguistics, 745–754. Santa Fe, New Mexico, USA: Association for Computational Linguistics. URL https://www.aclweb.org/anthology/C18-1063.
  • Chai, Z.; and Wan, X. 2020. Learning to Ask More: Semi-Autoregressive Sequential Question Generation under Dual-Graph Interaction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 225–237. Online: Association for Computational Linguistics. doi:10.18653/v1/2020.acl-main.21. URL https://www.aclweb.org/anthology/2020.acl-main.21.
  • Chen, Z.; Yang, R.; Zhao, Z.; Cai, D.; and He, X. 2018. Dialogue act recognition via CRF-attentive structured network. In Proc. of SIGIR.
  • Conneau, A.; Schwenk, H.; Barrault, L.; and Lecun, Y. 2017. Very Deep Convolutional Networks for Text Classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, 1107–1116. Valencia, Spain: Association for Computational Linguistics. URL https://www.aclweb.org/anthology/E17-1104.
  • Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proc. of NAACL.
  • Ghosal, D.; Majumder, N.; Poria, S.; Chhaya, N.; and Gelbukh, A. 2019. DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 154–164. Hong Kong, China: Association for Computational Linguistics. doi:10.18653/v1/D19-1015. URL https://www.aclweb.org/anthology/D19-1015.
  • Hochreiter, S.; and Schmidhuber, J. 1997. Long short-term memory. Neural Computation.
  • Ji, Y.; Haffari, G.; and Eisenstein, J. 2016. A latent variable recurrent neural network for discourse relation language models. arXiv preprint arXiv:1603.01913.
  • Johnson, R.; and Zhang, T. 2017. Deep Pyramid Convolutional Neural Networks for Text Categorization. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 562–570. Vancouver, Canada: Association for Computational Linguistics. doi:10.18653/v1/P17-1052. URL https://www.aclweb.org/anthology/P17-1052.
  • Kalchbrenner, N.; and Blunsom, P. 2013. Recurrent convolutional neural networks for discourse compositionality. arXiv preprint arXiv:1306.3584.
  • Kim, M.; and Kim, H. 2018. Integrated neural network model for identifying speech acts, predicators, and sentiments of dialogue utterances. Pattern Recognition Letters.
  • Kingma, D. P.; and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Kumar, H.; Agarwal, A.; Dasgupta, R.; and Joshi, S. 2018. Dialogue act sequence labeling using hierarchical encoder with CRF. In Proc. of AAAI.
  • Lee, J. Y.; and Dernoncourt, F. 2016. Sequential short-text classification with recurrent and convolutional neural networks. arXiv preprint arXiv:1603.03827.
  • Li, Y.; Su, H.; Shen, X.; Li, W.; Cao, Z.; and Niu, S. 2017. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 986–995. Taipei, Taiwan: Asian Federation of Natural Language Processing. URL https://www.aclweb.org/anthology/I17-1099.
  • Lin, T.-E.; Xu, H.; and Zhang, H. 2020. Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement. In Thirty-Fourth AAAI Conference on Artificial Intelligence.
  • Liu, Y.; Han, K.; Tan, Z.; and Lei, Y. 2017. Using Context Information for Dialog Act Classification in DNN Framework. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2170–2178. Copenhagen, Denmark: Association for Computational Linguistics. doi:10.18653/v1/D17-1231. URL https://www.aclweb.org/anthology/D17-1231.
  • Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; and Stoyanov, V. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  • Lu, Y.-J.; and Li, C.-T. 2020. GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 505–514. Online: Association for Computational Linguistics. doi:10.18653/v1/2020.acl-main.48. URL https://www.aclweb.org/anthology/2020.acl-main.48.
  • Majumder, N.; Poria, S.; Hazarika, D.; Mihalcea, R.; Gelbukh, A.; and Cambria, E. 2019. DialogueRNN: An attentive RNN for emotion detection in conversations. In Proceedings of the AAAI Conference on Artificial Intelligence.
  • Qin, L.; Che, W.; Li, Y.; Ni, M.; and Liu, T. 2020a. DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification. In Proceedings of the AAAI Conference on Artificial Intelligence.
  • Qin, L.; Che, W.; Li, Y.; Wen, H.; and Liu, T. 2019. A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding. In Proc. of EMNLP.
  • Qin, L.; Ni, M.; Zhang, Y.; Che, W.; Li, Y.; and Liu, T. 2020b. Multi-Domain Spoken Language Understanding Using Domain- and Task-Aware Parameterization. arXiv preprint arXiv:2004.14871.
  • Qin, L.; Xu, X.; Che, W.; and Liu, T. 2020c. AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling. In Findings of the Association for Computational Linguistics: EMNLP 2020, 1807–1816. Online: Association for Computational Linguistics. doi:10.18653/v1/2020.findings-emnlp.163. URL https://www.aclweb.org/anthology/2020.findings-emnlp.163.
  • Raheja, V.; and Tetreault, J. 2019. Dialogue Act Classification with Context-Aware Self-Attention. In Proc. of NAACL.
  • Scarselli, F.; Gori, M.; Tsoi, A. C.; Hagenbuchner, M.; and Monfardini, G. 2009. The graph neural network model. IEEE Transactions on Neural Networks 20(1): 61–80.
  • Shi, Y.; Yao, K.; Tian, L.; and Jiang, D. 2016. Deep LSTM based Feature Mapping for Query Classification. In Proc. of NAACL.
  • Tang, D.; Qin, B.; and Liu, T. 2015. Document Modeling with Gated Recurrent Neural Network for Sentiment Classification. In Proc. of ACL.
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention is All you Need. In Proc. of NIPS. Curran Associates, Inc.
  • Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; and Bengio, Y. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903.
  • Wang, B. 2018. Disconnected Recurrent Neural Networks for Text Categorization. In Proc. of ACL.
  • Xiao, Y.; and Cho, K. 2016. Efficient character-level document classification by combining convolution and recurrent layers. arXiv preprint arXiv:1602.00367.
  • Xu, J.; Chen, D.; Qiu, X.; and Huang, X. 2016. Cached Long Short-Term Memory Neural Networks for Document-Level Sentiment Classification. In Proc. of EMNLP.
  • Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R. R.; and Le, Q. V. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems, 5753–5763.
  • Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; and Hovy, E. 2016. Hierarchical Attention Networks for Document Classification. In Proc. of NAACL.
  • Zhang, X.; Zhao, J.; and LeCun, Y. 2015. Character-level convolutional networks for text classification. In Proc. of NIPS, 649–657.