Fast Vocabulary Transfer for Language Model Compression
EMNLP (Industry Track)(2024)
摘要
Real-world business applications require a trade-off between language model
performance and size. We propose a new method for model compression that relies
on vocabulary transfer. We evaluate the method on various vertical domains and
downstream tasks. Our results indicate that vocabulary transfer can be
effectively used in combination with other compression techniques, yielding a
significant reduction in model size and inference time while marginally
compromising on performance.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要