How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task.

WAT @ ACL-IJCNLP 2021

Abstract
This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours, and get within 4-5 BLEU points of the top submission on the WAT 2021 leaderboard. We also benchmark standard baselines on the PMI corpus and rediscover well-known shortcomings of current translation metrics.
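
The abstract does not specify which memory and compute optimizations the authors used. As one illustration of the general approach to fitting large-model training onto a single GPU, the sketch below combines mixed-precision training (torch.cuda.amp) with gradient accumulation; the toy model, vocabulary size, and hyperparameters are hypothetical placeholders, not the paper's actual setup.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vocab_size = 32000  # hypothetical joint multilingual vocabulary

# Hypothetical stand-in for a large multilingual translation model:
# a Transformer encoder-decoder with a projection to the vocabulary.
class ToyNMT(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 512)
        self.transformer = nn.Transformer(d_model=512, nhead=8,
                                          batch_first=True)
        self.proj = nn.Linear(512, vocab_size)

    def forward(self, src, tgt):
        h = self.transformer(self.embed(src), self.embed(tgt))
        return self.proj(h)

model = ToyNMT().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")
criterion = nn.CrossEntropyLoss()
accum_steps = 8  # accumulate gradients to simulate a larger batch

def train_on(batches):
    """Run one pass over an iterable of (src, tgt) token-id batches."""
    optimizer.zero_grad()
    for i, (src, tgt) in enumerate(batches):
        src, tgt = src.to(device), tgt.to(device)
        # fp16 forward pass halves activation memory on GPU
        with torch.cuda.amp.autocast(enabled=device.type == "cuda"):
            logits = model(src, tgt[:, :-1])
            loss = criterion(logits.reshape(-1, vocab_size),
                             tgt[:, 1:].reshape(-1)) / accum_steps
        scaler.scale(loss).backward()  # loss-scaled backward for fp16
        if (i + 1) % accum_steps == 0:
            scaler.step(optimizer)     # unscales grads, then steps
            scaler.update()
            optimizer.zero_grad()
```

Gradient accumulation trades wall-clock time for memory: the effective batch size is accum_steps times the per-step batch, which is what makes large-batch training feasible within a one-GPU, 100-hour budget.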