How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task.

WAT @ ACL-IJCNLP 2021

Abstract
This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours, and get within 4-5 BLEU points of the top submission on the WAT 2021 leaderboard. We also benchmark standard baselines on the PMI corpus and rediscover well-known shortcomings of current translation metrics.
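
The abstract does not specify which memory and compute optimizations the authors used. As one illustration of the general approach to fitting large-model training onto a single GPU, the sketch below combines mixed-precision training (torch.cuda.amp) with gradient accumulation; the toy model, vocabulary size, and hyperparameters are hypothetical placeholders, not the paper's actual setup.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vocab_size = 32000  # hypothetical joint multilingual vocabulary

# Hypothetical stand-in for a large multilingual translation model:
# a Transformer encoder-decoder with a projection to the vocabulary.
class ToyNMT(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 512)
        self.transformer = nn.Transformer(d_model=512, nhead=8,
                                          batch_first=True)
        self.proj = nn.Linear(512, vocab_size)

    def forward(self, src, tgt):
        h = self.transformer(self.embed(src), self.embed(tgt))
        return self.proj(h)

model = ToyNMT().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")
criterion = nn.CrossEntropyLoss()
accum_steps = 8  # accumulate gradients to simulate a larger batch

def train_on(batches):
    """Run one pass over an iterable of (src, tgt) token-id batches."""
    optimizer.zero_grad()
    for i, (src, tgt) in enumerate(batches):
        src, tgt = src.to(device), tgt.to(device)
        # fp16 forward pass halves activation memory on GPU
        with torch.cuda.amp.autocast(enabled=device.type == "cuda"):
            logits = model(src, tgt[:, :-1])
            loss = criterion(logits.reshape(-1, vocab_size),
                             tgt[:, 1:].reshape(-1)) / accum_steps
        scaler.scale(loss).backward()  # loss-scaled backward for fp16
        if (i + 1) % accum_steps == 0:
            scaler.step(optimizer)     # unscales grads, then steps
            scaler.update()
            optimizer.zero_grad()
```

Gradient accumulation trades wall-clock time for memory: the effective batch size is accum_steps times the per-step batch, which is what makes large-batch training feasible within a one-GPU, 100-hour budget.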