Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework

ICLR, 2020.

Keywords:
Cross-lingual Representation

Abstract:

Learning multilingual representations of text has proven a successful method for many cross-lingual transfer learning tasks. There are two main paradigms for learning such representations: (1) alignment, which maps different independently trained monolingual representations into a shared space, and (2) joint training, which directly learns…
Introduction
Highlights
  • Continuous word representations (Mikolov et al, 2013a; Pennington et al, 2014; Bojanowski et al, 2017) have become ubiquitous across a wide range of NLP tasks
  • As shown in Table 1 and Table 3, we find that alignment methods outperform the joint training approach by a large margin in all language pairs, for both bilingual lexicon induction and named entity recognition
  • The unsupervised joint training method is superior to its alignment counterpart on the unsupervised machine translation task, as demonstrated in 2(c)
  • While these results demonstrate that their relative performance is task-dependent, we conduct further analysis to reveal three limitations, as discussed in Sec. 2.3. The poor performance of unsupervised joint training on the bilingual lexicon induction and named entity recognition tasks shows that it fails to generate high-quality alignments, due to the lack of a fine-grained seed dictionary (its limitation 2)
  • We find that unsupervised joint training achieves extremely low scores, which shows that the embeddings of non-shared words are poorly aligned, consistent with the PCA visualization shown in Figure 1 (a small PCA sketch follows this list)
  • To further improve the state of the art of cross-lingual word embeddings, we propose a simple hybrid framework which combines the strengths of both worlds and achieves significantly better performance on the bilingual lexicon induction, machine translation and named entity recognition tasks
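The PCA check mentioned above can be reproduced in a few lines. The sketch below is illustrative only: the random placeholder arrays stand in for the two languages' embeddings in the shared space, and the code is not the paper's own.

    # Project two languages' embeddings onto their top-2 principal components
    # to eyeball whether they occupy the same region of the shared space
    # (in the spirit of Figure 1). The arrays below are random placeholders.
    import numpy as np
    import matplotlib.pyplot as plt

    def pca_2d(X):
        # Rows of X projected onto the top-2 principal components.
        Xc = X - X.mean(axis=0, keepdims=True)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:2].T

    emb_src = np.random.randn(500, 300)  # placeholder source-language embeddings
    emb_tgt = np.random.randn(500, 300)  # placeholder target-language embeddings
    proj = pca_2d(np.vstack([emb_src, emb_tgt]))
    plt.scatter(proj[:500, 0], proj[:500, 1], s=4, label="source")
    plt.scatter(proj[500:, 0], proj[500:, 1], s=4, label="target")
    plt.legend()
    plt.show()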
Methods
  • Step 3: Learn a projection matrix W ∈ R^{d×d} based on D, resulting in a final embedding set (a minimal sketch of this step follows the list below).
  • Note that the purpose here is to directly compare with similar studies in Lample et al (2018b); the authors therefore follow their settings, consider two language pairs, English-French and English-German, and evaluate on the widely used WMT’14 en-fr and WMT’16 en-de benchmarks
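For concreteness, a common way to realize Step 3 is the closed-form solution of the orthogonal Procrustes problem, the standard technique used by many alignment methods (e.g., Xing et al, 2015; Conneau et al, 2018a). The sketch below assumes X and Y are matrices of source- and target-side embeddings for the word pairs in D; it illustrates that standard technique, not necessarily the authors' exact implementation.

    # Orthogonal Procrustes: W = argmin ||X W - Y||_F over orthogonal W.
    # X, Y: (|D|, d) embeddings of the dictionary's source/target words (placeholders here).
    import numpy as np

    def learn_projection(X, Y):
        U, _, Vt = np.linalg.svd(X.T @ Y)  # SVD of the d x d cross-covariance
        return U @ Vt                      # orthogonal d x d projection

    d = 300
    X = np.random.randn(5000, d)           # placeholder source-side vectors
    Y = np.random.randn(5000, d)           # placeholder target-side vectors
    W = learn_projection(X, Y)
    mapped = X @ W                          # source vectors mapped into the target space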
Results
  • While these results demonstrate that their relative performance is task-dependent, the authors conduct further analysis to reveal three limitations, as discussed in Sec. 2.3
  • The poor performance of unsupervised joint training on the BLI and NER tasks shows that it fails to generate high-quality alignments, due to the lack of a fine-grained seed dictionary (its limitation 2).
  • The authors find that unsupervised joint training achieves extremely low scores, which shows that the embeddings of non-shared words are poorly aligned, consistent with the PCA visualization shown in Figure 1
Conclusion
  • While alignment methods have had great success, there are still some critical downsides, among which the authors stress the following points:
  • While recent studies in unsupervised joint training have suggested the potential benefits of word sharing, alignment methods rely on two disjoint sets of embeddings and therefore cannot exploit it.
  • Notably, Ormazabal et al (2019) suggest that this limitation results from the fact that the two sets of monolingual embeddings are independently trained. In this paper, the authors systematically compare the alignment and joint training methods for CLWE.
  • To further improve the state of the art of CLWE, the authors propose a simple hybrid framework which combines the strengths of both worlds and achieves significantly better performance on the BLI, MT and NER tasks.
  • An interesting direction for future work is to find a more effective word sharing strategy (a toy illustration of one such strategy follows this list)
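As a toy illustration of what a word sharing strategy involves, the sketch below assigns each surface form to a shared or language-specific vocabulary based on its corpus frequencies, loosely in the spirit of the vocabulary reallocation ("VR") used in the hybrid framework. The frequency-ratio rule and all names here are hypothetical choices made for this sketch, not the criterion reported in the paper.

    # Decide which surface forms to share between two languages' vocabularies.
    # The ratio-based rule is a hypothetical stand-in for a real sharing criterion.
    from collections import Counter

    def split_vocabulary(freq_src, freq_tgt, ratio=5.0):
        assignment = {}
        for w in set(freq_src) | set(freq_tgt):
            fs, ft = freq_src.get(w, 0), freq_tgt.get(w, 0)
            if fs > 0 and ft > 0 and max(fs, ft) <= ratio * min(fs, ft):
                assignment[w] = "shared"      # comparable frequency in both corpora
            elif fs >= ft:
                assignment[w] = "src-only"    # reallocated to the source language
            else:
                assignment[w] = "tgt-only"    # reallocated to the target language
        return assignment

    freq_en = Counter({"bank": 900, "taxi": 120, "dog": 800})
    freq_de = Counter({"bank": 850, "taxi": 110, "hund": 700})
    print(split_vocabulary(freq_en, freq_de))
    # e.g. {'bank': 'shared', 'taxi': 'shared', 'dog': 'src-only', 'hund': 'tgt-only'}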
Summary
  • Introduction:

    Continuous word representations (Mikolov et al, 2013a; Pennington et al, 2014; Bojanowski et al, 2017) have become ubiquitous across a wide range of NLP tasks.
  • Methods for crosslingual word embeddings (CLWE) have proven a powerful tool for cross-lingual transfer for downstream tasks, such as text classification (Klementiev et al, 2012), dependency parsing (Ahmad et al, 2019), named entity recognition (NER) (Xie et al, 2018; Chen et al, 2019), natural language inference (Conneau et al, 2018b), language modeling (Adams et al, 2017), and machine translation (MT) (Zou et al, 2013; Lample et al, 2018a; Artetxe et al, 2018b; Lample et al, 2018b)
  • The goal of these CLWE methods is to learn embeddings in a shared vector space for two or more languages.
  • These vector representations should have similar values for tokens with similar meanings or syntactic properties, so they can better facilitate cross-lingual transfer (a toy illustration follows below)
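To make the shared-space intuition concrete, the sketch below translates by nearest-neighbour search under cosine similarity and scores Precision@1 against a gold dictionary, which is how BLI numbers such as those in the tables are typically computed. The two-dimensional vectors and word lists are made up purely for illustration.

    # Bilingual lexicon induction by nearest neighbour in a shared space,
    # scored with Precision@1. All vectors below are toy placeholders.
    import numpy as np

    def precision_at_1(src_vecs, tgt_vecs, gold_pairs):
        tgt_words = list(tgt_vecs)
        T = np.stack([tgt_vecs[w] for w in tgt_words])
        T = T / np.linalg.norm(T, axis=1, keepdims=True)    # unit-length rows
        hits = 0
        for s, t in gold_pairs:
            v = src_vecs[s] / np.linalg.norm(src_vecs[s])
            pred = tgt_words[int(np.argmax(T @ v))]         # closest target word
            hits += int(pred == t)
        return hits / len(gold_pairs)

    src = {"chat": np.array([0.9, 0.1]), "chien": np.array([0.1, 0.9])}
    tgt = {"cat": np.array([1.0, 0.0]), "dog": np.array([0.0, 1.0])}
    print(precision_at_1(src, tgt, [("chat", "cat"), ("chien", "dog")]))  # 1.0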
Tables
  • Table1: Precision@1 for the BLI task on the MUSE dataset. Within each category, unsupervised methods are listed at the top while supervised methods are at the bottom. The best result for unsupervised methods is underlined while bold signifies the overall best. “IN” refers to iterative normalization proposed in Zhang et al (2019) (a minimal sketch follows this list), “AR” refers to alignment refinement and “VR” refers to vocabulary reallocation
  • Table2: Precision@1 for the BLI task on the MUSE dataset with test pairs of the same surface form removed. The best result for unsupervised methods is underlined while bold signifies the overall best
  • Table3: F1 score for the crosslingual NER task. “Adv” refers to adversarial training. ‡ denotes results that are not directly comparable due to different resources and architectures used. ∗ denotes supervised XLM model trained with MLM and TLM objectives. Its Dutch (nl) result is blank because the model is not pretrained on it. Bold signifies state-of-the-art results. We report the average of 5 runs
  • Table4: Precision@1 for the BLI task on the MUSE dataset using test set produced by vocabulary reallocation. Within each category, unsupervised methods are listed at the top while supervised methods are at the bottom. Bold signifies the overall best results. “AR” refers to alignment refinement and “VR” refers to vocabulary reallocation
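For reference, iterative normalization (Zhang et al, 2019), the "IN" preprocessing in Table 1, alternates length normalization and mean centering of each language's embeddings before alignment. The sketch below is a minimal rendering of that procedure on a placeholder matrix, not the authors' released code.

    # Iterative normalization: repeatedly make every vector unit-length and
    # every dimension zero-mean. X is a placeholder (n_words, d) matrix.
    import numpy as np

    def iterative_normalization(X, n_iter=5):
        X = X.astype(float).copy()
        for _ in range(n_iter):
            X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit length per word
            X -= X.mean(axis=0, keepdims=True)              # zero mean per dimension
        return X

    X_norm = iterative_normalization(np.random.randn(1000, 300))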
Funding
  • This research was sponsored by Defense Advanced Research Projects Agency Information Innovation Office (I2O) under the Low Resource Languages for Emergent Incidents (LORELEI) program, issued by DARPA/I2O under Contract No HR0011-15-C0114
References
  • Oliver Adams, Adam Makarucha, Graham Neubig, Steven Bird, and Trevor Cohn. Cross-lingual word embeddings for low-resource language modeling. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pp. 937–947, Valencia, Spain, April 2017. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/E17-1088.
  • Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, and Nanyun Peng. On difficulties of cross-lingual transfer with order differences: A case study on dependency parsing. In Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, USA, June 2019. URL https://arxiv.org/abs/1811.00570.
  • Hanan Aldarmaki and Mona Diab. Context-aware cross-lingual mapping. In Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, USA, June 2019. URL https://arxiv.org/abs/1903.03243.
  • Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, and Noah A Smith. Massively multilingual word embeddings. arXiv preprint arXiv:1602.01925, 2016.
  • Mikel Artetxe and Holger Schwenk. Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond. Transactions of the Association for Computational Linguistics, 7:597–610, 2019.
  • Mikel Artetxe, Gorka Labaka, and Eneko Agirre. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 2289–2294, 2016.
  • Mikel Artetxe, Gorka Labaka, and Eneko Agirre. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 789–798, 2018a.
  • Mikel Artetxe, Gorka Labaka, Eneko Agirre, and Kyunghyun Cho. Unsupervised neural machine translation. In International Conference on Learning Representations, 2018b. URL https://openreview.net/forum?id=Sy2ogebAW.
  • Mikel Artetxe, Gorka Labaka, and Eneko Agirre. Bilingual lexicon induction through unsupervised machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5002–5007, 2019.
  • Antonio Valerio Miceli Barone. Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders. In Proceedings of the 1st Workshop on Representation Learning for NLP, pp. 121–126, 2016.
  • Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.
  • Hailong Cao, Tiejun Zhao, Shu ZHANG, and Yao Meng. A distribution-based model to learn bilingual word embeddings. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 1818–1827, Osaka, Japan, December 2016. The COLING 2016 Organizing Committee. URL https://www.aclweb.org/anthology/C16-1171.
  • Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, and Claire Cardie. Multisource cross-lingual model transfer: Learning what to share. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3098–3112, Florence, Italy, July 2019. Association for Computational Linguistics. doi: 10.18653/v1/P19-1299. URL https://www.aclweb.org/anthology/P19-1299.
  • Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, and Herve Jegou. Word translation without parallel data. In International Conference on Learning Representations (ICLR), 2018a.
  • Alexis Conneau, Ruty Rinott, Guillaume Lample, Adina Williams, Samuel R. Bowman, Holger Schwenk, and Veselin Stoyanov. Xnli: Evaluating cross-lingual sentence representations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018b.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186, 2019.
  • Zi-Yi Dou, Zhi-Hao Zhou, and Shujian Huang. Unsupervised bilingual lexicon induction via latent variable models. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 621–626, 2018.
  • Long Duong, Hiroshi Kanayama, Tengfei Ma, Steven Bird, and Trevor Cohn. Learning crosslingual word embeddings without bilingual corpora. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1285–1295, 2016.
  • Chris Dyer, Victor Chahuneau, and Noah A. Smith. A simple, fast, and effective reparameterization of IBM model 2. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 644–648, Atlanta, Georgia, June 2013. Association for Computational Linguistics.
  • Manaal Faruqui and Chris Dyer. Improving vector space word representations using multilingual correlation. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp. 462–471, 2014.
  • Goran Glavas, Robert Litschko, Sebastian Ruder, and Ivan Vulic. How to (properly) evaluate crosslingual word embeddings: On strong baselines, comparative analyses, and some misconceptions. arXiv preprint arXiv:1902.00508, 2019.
  • Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.
  • Stephan Gouws and Anders Søgaard. Simple task-specific bilingual word embeddings. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1386–1390, 2015.
  • Stephan Gouws, Yoshua Bengio, and Greg Corrado. Bilbowa: Fast bilingual distributed representations without word alignments. In International Conference on Machine Learning, pp. 748–756, 2015.
  • Karl Moritz Hermann and Phil Blunsom. Multilingual models for compositional distributed semantics. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 58–68, 2014.
  • Pratik Jawanpuria, Arjun Balgovind, Anoop Kunchukuttan, and Bamdev Mishra. Learning multilingual word embeddings in latent metric space: a geometric approach. Transactions of the Association for Computational Linguistics, 7:107–120, 2019.
  • Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Herve Jegou, and Edouard Grave. Loss in translation: Learning bilingual word mapping with a retrieval criterion. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2979–2984, Brussels, Belgium, October-November 2018a. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/D18-1330.
  • Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Herve Jegou, and Edouard Grave. Loss in translation: Learning bilingual word mapping with a retrieval criterion. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2979–2984, 2018b.
  • Phillip Keung, Yichao Lu, and Vikas Bhardwaj. Adversarial learning with contextual embeddings for zero-resource cross-lingual classification and ner. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), November 2019. URL https://arxiv.org/abs/1909.00153.
  • Alexandre Klementiev, Ivan Titov, and Binod Bhattarai. Inducing crosslingual distributed representations of words. In Proceedings of COLING 2012, pp. 1459–1474, Mumbai, India, December 2012. The COLING 2012 Organizing Committee. URL https://www.aclweb.org/anthology/C12-1089.
  • Tomas Kocisky, Karl Moritz Hermann, and Phil Blunsom. Learning bilingual word representations by marginalizing alignments. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 224–229, 2014.
  • Guillaume Lample and Alexis Conneau. Cross-lingual language model pretraining. In Proceedings of NeurIPS, 2019.
  • Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. Neural architectures for named entity recognition. In Kevin Knight, Ani Nenkova, and Owen Rambow (eds.), NAACL, pp. 260–270. The Association for Computational Linguistics, 2016. ISBN 978-1-941643-91-4. URL http://aclweb.org/anthology/N/N16/N16-1030.pdf.
  • Guillaume Lample, Alexis Conneau, Ludovic Denoyer, and Marc’Aurelio Ranzato. Unsupervised machine translation using monolingual corpora only. In International Conference on Learning Representations, 2018a. URL https://openreview.net/forum?id=rkYTTf-AZ.
  • Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, et al. Phrase-based & neural unsupervised machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 5039–5049, 2018b.
  • Thang Luong, Hieu Pham, and Christopher D Manning. Bilingual word representations with monolingual quality in mind. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp. 151–159, 2015.
  • Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. ICLR, 2013a.
  • Tomas Mikolov, Quoc V Le, and Ilya Sutskever. Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168, 2013b.
  • Aitor Ormazabal, Mikel Artetxe, Gorka Labaka, Aitor Soroa, and Eneko Agirre. Analyzing the limitations of cross-lingual word embedding mappings. arXiv preprint arXiv:1906.05407, 2019.
  • Barun Patra, Joel Ruben Antony Moniz, Sarthak Garg, Matthew R. Gormley, and Graham Neubig. Bilingual lexicon induction with semi-supervision in non-isometric embedding spaces. In The 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy, July 2019. URL https://www.aclweb.org/anthology/P19-1018.
  • Jeffrey Pennington, Richard Socher, and Christopher Manning. Glove: Global vectors for word representation. In EMNLP, pp. 1532–1543, 2014.
  • Telmo Pires, Eva Schlinger, and Dan Garrette. How multilingual is multilingual bert? In The 57th Annual Meeting of the Association for Computational Linguistics (ACL), July 2019. URL https://arxiv.org/abs/1906.01502.
  • Sebastian Ruder, Ivan Vulic, and Anders Søgaard. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, 65:569–631, 2019.
  • Tal Schuster, Ori Ram, Regina Barzilay, and Amir Globerson. Cross-lingual alignment of contextual word embeddings, with applications to zero-shot dependency parsing. In Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, USA, June 2019. URL https://arxiv.org/abs/1902.09492.
  • Samuel L. Smith, David H. P. Turban, Steven Hamblin, and Nils Y. Hammerla. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017. URL https://openreview.net/forum?id=r1Aab85gg.
  • Anders Søgaard, Zeljko Agic, Hector Martínez Alonso, Barbara Plank, Bernd Bohnet, and Anders Johannsen. Inverted indexing for cross-lingual NLP. In The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2015), 2015.
  • Anders Søgaard, Sebastian Ruder, and Ivan Vulic. On the limitations of unsupervised bilingual dictionary induction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 778–788, Melbourne, Australia, July 2018. Association for Computational Linguistics. doi: 10.18653/v1/P18-1072. URL https://www.aclweb.org/anthology/P18-1072.
  • Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. Mass: Masked sequence to sequence pre-training for language generation. In International Conference on Machine Learning, pp. 5926–5936, 2019.
  • Erik F. Tjong Kim Sang. Introduction to the CoNLL-2002 shared task: Language-independent named entity recognition. In CoNLL, pp. 1–4, 2002. doi: 10.3115/1118853.1118877. URL https://doi.org/10.3115/1118853.1118877.
  • Erik F Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In CoNLL, pp. 142–147, 2003.
  • Ivan Vulic and Marie-Francine Moens. Bilingual distributed word representations from document-aligned comparable data. Journal of Artificial Intelligence Research, 55:953–994, 2016.
  • Zirui Wang, Zihang Dai, Barnabas Poczos, and Jaime Carbonell. Characterizing and avoiding negative transfer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11293–11302, 2019.
  • Shijie Wu and Mark Dredze. Beto, bentz, becas: The surprising cross-lingual effectiveness of BERT. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), November 2019. URL https://arxiv.org/abs/1904.09077.
  • Min Xiao and Yuhong Guo. Distributed word representation learning for cross-lingual dependency parsing. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning, pp. 119–129, 2014.
  • Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A Smith, and Jaime Carbonell. Neural crosslingual named entity recognition with minimal resources. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 369–379, 2018.
  • Chao Xing, Dong Wang, Chao Liu, and Yiye Lin. Normalized word embedding and orthogonal transform for bilingual word translation. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1006–1011, 2015.
  • Ruochen Xu, Yiming Yang, Naoki Otani, and Yuexin Wu. Unsupervised cross-lingual transfer of word embedding spaces. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2465–2474, 2018.
  • Meng Zhang, Yang Liu, Huanbo Luan, and Maosong Sun. Adversarial training for unsupervised bilingual lexicon induction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pp. 1959–1970, 2017a.
  • Meng Zhang, Yang Liu, Huanbo Luan, and Maosong Sun. Earth mover’s distance minimization for unsupervised bilingual lexicon induction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1934–1945, 2017b.
  • Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, and Jordan Boyd-Graber. Are girls neko or shōjo? Cross-lingual alignment of non-isomorphic embeddings with iterative normalization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3180–3189, 2019.
  • Chunting Zhou, Xuezhe Ma, Di Wang, and Graham Neubig. Density matching for bilingual word embedding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 1588–1598, 2019.
  • Will Y. Zou, Richard Socher, Daniel Cer, and Christopher D. Manning. Bilingual word embeddings for phrase-based machine translation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1393–1398, Seattle, Washington, USA, October 2013. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/D13-1141.