Text Pre-training
ACL, pp.7871-7880, (2020)
We present a new scheme for machine translation where a BART model is stacked above a few additional transformer layers
Cited by 193 · Views 615
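The translation scheme above admits a compact sketch: a small, randomly initialised Transformer encoder is trained to map source-language tokens into representations that a frozen, pre-trained BART then decodes into English. The code below is my own illustration using PyTorch and Hugging Face Transformers; the layer sizes and the choice of facebook/bart-base are assumptions, not the paper's configuration.

```python
# Minimal sketch, assuming facebook/bart-base and illustrative layer sizes.
import torch
import torch.nn as nn
from transformers import BartForConditionalGeneration

class SourceEncoder(nn.Module):
    """A few randomly initialised Transformer layers placed under BART."""
    def __init__(self, src_vocab_size, d_model=768, n_layers=2, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(src_vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, src_ids):
        return self.encoder(self.embed(src_ids))

class BartForTranslation(nn.Module):
    def __init__(self, src_vocab_size):
        super().__init__()
        self.source_encoder = SourceEncoder(src_vocab_size)
        self.bart = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
        for p in self.bart.parameters():      # first stage: train only the new layers
            p.requires_grad = False

    def forward(self, src_ids, labels):
        # The new encoder's output replaces BART's own token embeddings.
        source_repr = self.source_encoder(src_ids)
        return self.bart(inputs_embeds=source_repr, labels=labels)

model = BartForTranslation(src_vocab_size=32000)
src = torch.randint(0, 32000, (1, 12))        # fake foreign-language token ids
tgt = torch.randint(0, model.bart.config.vocab_size, (1, 12))
print(model(src, tgt).loss)
```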
ICLR, (2020)
We developed VL-BERT, a new pre-trainable generic representation for visual-linguistic tasks
Cited by 49 · Views 183
ICLR, (2020)
We find that BERT's performance is slightly harmed by the pre-train/fine-tune mismatch from masked tokens, as Replace masked language modeling slightly outperforms BERT
Cited by 44 · Views 582
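To make the pre-train/fine-tune mismatch concrete, the toy sketch below (my own, not from the paper) contrasts BERT-style corruption, which writes a literal [MASK] symbol that never occurs downstream, with replace-style corruption, which substitutes a plausible real token at the same positions.

```python
# Toy illustration of the two corruption schemes; the sampler is a stand-in
# for a learned generator of plausible replacement tokens.
import random

def bert_style_mask(tokens, positions):
    return [tok if i not in positions else "[MASK]" for i, tok in enumerate(tokens)]

def replace_mlm(tokens, positions, sampler):
    return [tok if i not in positions else sampler(tok) for i, tok in enumerate(tokens)]

tokens = ["the", "chef", "cooked", "the", "meal"]
positions = {2}
naive_sampler = lambda _tok: random.choice(["ate", "cooked", "served"])

print(bert_style_mask(tokens, positions))             # ['the', 'chef', '[MASK]', 'the', 'meal']
print(replace_mlm(tokens, positions, naive_sampler))  # e.g. ['the', 'chef', 'ate', 'the', 'meal']
```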
Gen Li, Nan Duan, Yuejian Fang, Ming Gong, Daxin Jiang
AAAI, pp.11336-11344, (2020)
Pretraining Unicoder-VL only slightly improves the performance. This might be because the pre-training task of image captioning is at the perceptual level, while the visual commonsense reasoning task is at the cognitive understanding level
Cited by 40 · Views 123
ACL, pp.270-278, (2020)
Neural response generation is a subcategory of text generation that shares the objective of generating natural-looking text that is relevant to the prompt
Cited by 36 · Views 175
AAAI, pp.13041-13049, (2020)
This paper presents a unified Vision-Language Pre-training model that can be fine-tuned for both vision-language generation and understanding tasks
Cited by 27 · Views 159
KDD, pp.1192-1200, (2020)
We evaluate the LayoutLM model on three tasks: form understanding, receipt understanding and scanned document image classification
Cited by 4 · Views 222
EMNLP, (2020)
We present POINTER, a simple yet novel insertion-based approach for hard-constrained text generation
Cited by 2 · Views 157
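The insertion-based generation loop can be sketched as follows. This is a schematic illustration of the idea, not the released POINTER implementation: generation starts from the constraint keywords and repeatedly offers the model a chance to insert a token into every gap, stopping once no gap accepts an insertion. `propose_insertion` is a hypothetical stand-in for the learned insertion model.

```python
from typing import Callable, List, Optional

def insertion_generate(keywords: List[str],
                       propose_insertion: Callable[[str, str], Optional[str]],
                       max_rounds: int = 10) -> List[str]:
    tokens = list(keywords)                    # hard constraints: never removed
    for _ in range(max_rounds):
        new_tokens, inserted = [], False
        lefts = ["<s>"] + tokens               # left context of every gap
        rights = tokens + ["</s>"]             # right context of every gap
        for left, right in zip(lefts, rights):
            proposal = propose_insertion(left, right)   # None = leave gap empty
            if proposal is not None:
                new_tokens.append(proposal)
                inserted = True
            if right != "</s>":
                new_tokens.append(right)
        tokens = new_tokens
        if not inserted:                       # converged: no gap accepted a token
            break
    return tokens

# toy stand-in for the learned insertion model
demo = lambda left, right: "quickly" if (left, right) == ("fox", "jumps") else None
print(insertion_generate(["fox", "jumps"], demo))   # ['fox', 'quickly', 'jumps']
```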
NeurIPS, (2020)
We introduced a new approach to pre-training models for natural language understanding and generation, by using retrieved documents to reconstruct the original document
Cited by 0 · Views 132
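One way to picture the objective: the encoder sees only retrieved evidence documents, and the decoder is trained to reconstruct the original target document from them. The snippet below is an illustrative simplification that borrows BART as a stand-in sequence-to-sequence model; it is not the paper's architecture or retrieval mechanism.

```python
# Sketch of a reconstruction step under the assumption that related documents
# have already been retrieved; facebook/bart-base is a stand-in model choice.
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

target = "The Eiffel Tower is a wrought-iron lattice tower in Paris."
retrieved = [
    "The Eiffel Tower was completed in 1889 in Paris, France.",
    "It is a lattice tower made of wrought iron.",
]

# Encoder sees only the retrieved evidence; decoder is trained to emit the target.
inputs = tok(" </s> ".join(retrieved), return_tensors="pt", truncation=True)
labels = tok(target, return_tensors="pt", truncation=True).input_ids

loss = model(**inputs, labels=labels).loss     # reconstruction loss to minimise
loss.backward()
print(float(loss))
```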
Kale Mihir
In this study we evaluated pre-training in the form of T5 for the data-to-text task
Cited by 0 · Views 89
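In the text-to-text framing, data-to-text reduces to serialising the structured record into a flat string and letting the pre-trained model generate the description. The snippet below is a minimal sketch with Hugging Face T5; the record, field names, and task prefix are illustrative assumptions rather than the paper's exact serialisation.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# hypothetical record; real data-to-text corpora define their own schemas
record = {"name": "Aromi", "eat_type": "coffee shop", "area": "city centre"}
linearised = " ".join(f"{k} : {v}" for k, v in record.items())

input_ids = tok("translate data to text: " + linearised, return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_length=40)
print(tok.decode(output_ids[0], skip_special_tokens=True))
```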
Lv Shangwen, Wang Yuechen, Guo Daya, Tang Duyu, Duan Nan, Zhu Fuqing, Gong Ming, Shou Linjun, Ma Ryan, Jiang Daxin, Cao Guihong, Zhou Ming
We introduce a learning algorithm that regards the pre-training of text representations as model-agnostic meta-learning
Cited by 0 · Views 118
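Reading pre-training as model-agnostic meta-learning means the shared initialisation is optimised so that a few gradient steps on any single pre-training task already help. The sketch below is a first-order toy version of that loop with a placeholder linear model and synthetic data; it illustrates the inner/outer-loop structure only, not the paper's algorithm.

```python
# First-order MAML-style sketch; model, data, and hyperparameters are placeholders.
import copy
import torch
import torch.nn as nn

model = nn.Linear(8, 2)                       # shared initialisation (meta-parameters)
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def sample_task_batch():
    # stand-in for a batch drawn from one pre-training task
    return torch.randn(16, 8), torch.randint(0, 2, (16,))

for meta_step in range(100):
    meta_opt.zero_grad()
    for _task in range(4):                    # a few tasks per meta-update
        fast = copy.deepcopy(model)           # inner-loop copy of the initialisation
        inner_opt = torch.optim.SGD(fast.parameters(), lr=0.1)
        for _ in range(3):                    # inner adaptation steps on the task
            x, y = sample_task_batch()
            inner_opt.zero_grad()
            loss_fn(fast(x), y).backward()
            inner_opt.step()
        # first-order approximation: gradients of the adapted copy's query loss
        # are accumulated onto the shared initialisation
        x, y = sample_task_batch()
        fast.zero_grad()
        loss_fn(fast(x), y).backward()
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()
```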
Kale Mihir, Roy Scott
In this work we investigated neural machine translation based transfer learning for data-to-text generation in non-English languages
Cited by 0 · Views 78
Xia Qiaolin, Huang Haoyang, Duan Nan, Zhang Dongdong, Ji Lei, Sui Zhifang, Cui Edward, Bharti Taroon, Zhou Ming
Compared to our baseline model that only uses text pre-training, cross-modal pre-training tasks improve the performance on all metrics, which validates the importance of image-text pre-training for generation tasks
Cited by 0 · Views 100
We present several alternate ways of viewing the Retrieval-Augmented Language Model that connect it to a broader set of ideas beyond open-domain question answering, such as language modeling with the corpus as context: language representation models have been incorporating contexts of increasingl...
Cited by 0 · Views 118
Luo Huaishao, Ji Lei, Shi Botian, Huang Haoyang, Duan Nan, Li Tianrui, Chen Xilin, Zhou Ming
We find that 1) our pre-trained model can improve the performance to a large extent over the baseline models and achieve state-of-the-art results on two typical multimodal tasks; 2) the pre-trained decoder can benefit generation tasks such as captioning
Cited by 0 · Views 89
We demonstrate that multilingual denoising pre-training significantly improves both supervised and unsupervised machine translation at both the sentence level and the document level
Cited by 0 · Views 121
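The denoising objective can be illustrated with a small noising function: spans are blanked with a mask token and sentence order is permuted, and the model is trained to emit the original document. The sketch below is my simplification (a fixed span length rather than Poisson-sampled lengths) and is not the released implementation.

```python
# Simplified document-noising sketch for denoising pre-training.
import random

def noise_document(sentences, mask_token="<mask>", span=3):
    noised = []
    for sent in random.sample(sentences, len(sentences)):   # permute sentence order
        tokens = sent.split()
        if len(tokens) > span:
            start = random.randrange(0, len(tokens) - span)
            # blank a contiguous span with a single mask token
            tokens = tokens[:start] + [mask_token] + tokens[start + span:]
        noised.append(" ".join(tokens))
    return " ".join(noised)        # model input; the training target is the original

doc = ["the committee met on tuesday", "it approved the new budget proposal"]
print(noise_document(doc))
```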
Qi Di, Su Lin, Song Jia, Cui Edward, Bharti Taroon, Sachet Arun
We introduce a new vision-language pre-trained model -- ImageBERT -- for image-text joint embedding
Cited by 0 · Views 58
Raffel Colin, Shazeer Noam, Roberts Adam, Lee Katherine, Narang Sharan, Matena Michael, Zhou Yanqi, Li Wei, Liu Peter J.
While many modern approaches to transfer learning for natural language processing use a Transformer architecture consisting of only a single “stack”, we found that using a standard encoder-decoder structure achieved good results on both generative and classification tasks
Cited by 396 · Views 104
Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, Jaewoo Kang
Bioinformatics (Oxford, England), no. 4 (2019): 1234-1240
Compared with most previous biomedical text mining models that are mainly focused on a single task such as named entity recognition or question answering, our model BioBERT achieves state-of-the-art performance on various biomedical text mining tasks, while requiring only minimal...
Cited by 237 · Views 128
ICML, (2019)
We have proposed MASS, masked sequence-to-sequence pre-training for language generation tasks, which reconstructs a sentence fragment given the remaining part of the sentence in the encoder-decoder framework
Cited by 145 · Views 106
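Constructing a masked sequence-to-sequence training example is simple to sketch: a contiguous fragment of the sentence is masked on the encoder side, and the decoder's target is exactly that fragment. The toy function below is my own illustration, not the authors' pipeline; the 50% fragment length is an assumed setting.

```python
import random

def make_mass_example(tokens, mask_token="[MASK]", fragment_frac=0.5):
    span_len = max(1, int(len(tokens) * fragment_frac))
    start = random.randrange(0, len(tokens) - span_len + 1)
    end = start + span_len
    encoder_input = tokens[:start] + [mask_token] * span_len + tokens[end:]
    decoder_target = tokens[start:end]         # only the masked fragment is predicted
    return encoder_input, decoder_target

sentence = "masked sequence to sequence pre training helps language generation".split()
enc_in, dec_out = make_mass_example(sentence)
print(enc_in)    # the sentence with a contiguous fragment replaced by [MASK] tokens
print(dec_out)   # the fragment the decoder must reconstruct
```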
Keywords
Generic Representation, Natural Language Processing, Natural Language Understanding, Pre-training, Representation Learning, Visual-Linguistic
Authors
Ming Zhou (5 papers)
Nan Duan (4 papers)
Luke Zettlemoyer (4 papers)
Zhe Gan (4 papers)
Yizhe Zhang (4 papers)
Jianfeng Gao (3 papers)
Kenton C.T. Lee (3 papers)
Mingwei Chang (3 papers)
Mike Lewis (3 papers)