Keynote Talk 2: Training Large Language Models: Challenges and Opportunities

IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2022

Abstract
Language models with a large number of parameters trained on massive datasets can achieve state-of-the-art accuracy in a range of natural language processing applications, including summarization, automatic dialogue generation, translation, semantic search, and code autocompletion. However, training such models is challenging: they no longer fit in the memory of even the largest GPU, and they can require a very long training time. Numerous innovations and breakthroughs are therefore required across datasets, algorithms, software, and hardware to make training these models a reality. In this talk, I present our efforts to train the Megatron-Turing Natural Language Generation model (MT-NLG), the largest and most powerful monolithic transformer language model trained to date, with 530 billion parameters. I will also showcase several applications of MT-NLG and discuss future research directions and the numerous opportunities that this model presents.
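To make the "no longer fit in GPU memory" claim concrete, here is a back-of-envelope sketch. The byte counts per parameter are common mixed-precision training assumptions (ZeRO-style accounting: fp16 weights and gradients plus fp32 master weights and Adam moments), not figures taken from the talk itself, and the 80 GB figure assumes an NVIDIA A100 80 GB GPU.

```python
# Back-of-envelope memory estimate for MT-NLG's 530B parameters.
# All byte counts below are assumed, standard mixed-precision values,
# not numbers from the talk.

PARAMS = 530e9       # MT-NLG parameter count (from the abstract)
WEIGHT_BYTES = 2     # fp16 weights only (inference lower bound)
TRAIN_BYTES = 16     # fp16 weights + grads, fp32 master copy + Adam moments
GPU_MEM_GB = 80      # one A100 80 GB, the largest single GPU at the time

weights_gb = PARAMS * WEIGHT_BYTES / 1e9
train_gb = PARAMS * TRAIN_BYTES / 1e9

print(f"fp16 weights alone:  {weights_gb:8,.0f} GB "
      f"(~{weights_gb / GPU_MEM_GB:.0f}x one 80 GB GPU)")
print(f"full training state: {train_gb:8,.0f} GB "
      f"(~{train_gb / GPU_MEM_GB:.0f}x one 80 GB GPU)")
```

Even the fp16 weights alone (~1,060 GB) exceed a single GPU by more than an order of magnitude, and the full training state (~8.5 TB, before activations) is larger still, which is why training at this scale requires distributing the model itself, not just the data, across many GPUs.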
Keywords
automatic dialogue generation,powerful monolithic transformer language model,natural language processing applications,megatron-turing natural language generation model