Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
CoRR (2024)
Abstract
State-of-the-art end-to-end Optical Music Recognition (OMR) has, to date,
primarily been carried out using monophonic transcription techniques to handle
complex score layouts, such as polyphony, often by resorting to simplifications
or specific adaptations. Despite their efficacy, these approaches pose
scalability challenges and have inherent limitations. This paper presents the
Sheet Music Transformer, the first end-to-end OMR model designed to transcribe
complex musical scores without relying solely on monophonic strategies. Our
model employs a Transformer-based image-to-sequence framework that predicts
score transcriptions in a standard digital music encoding format from input
images. Our model has been tested on two polyphonic music datasets and has
proven capable of handling these intricate music structures effectively. The
experimental results not only indicate the competence of the model, but also
show that it outperforms state-of-the-art methods, thus contributing to
advancements in end-to-end OMR transcription.