A Multi-Scale Bimodal Fusion Network for Robust and Accurate Online Handwriting Recognition

Zhen Xu, Ziqiang Chen, Yaqiang Wu, Hui Li, Wanjun Lv, Lianwen Jin, Qianying Wang

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Online handwriting recognition based on sensor trajectory information faces two unresolved challenges: 1) sensor signals lack sufficient global spatial context; and 2) different recognition tasks place inconsistent requirements on feature receptive fields, because input sequences vary in scale and language units vary in semantic complexity. In this paper, we propose an online handwritten text recognition method based on multi-scale bimodal feature fusion to address these challenges. First, we generate pseudo-images from the input sequences to supply the missing two-dimensional spatial information, and then extract multi-scale features from both trajectories and images simultaneously. Next, our bimodal embedding learning module jointly learns feature embeddings for trajectories and images at different scales. These embeddings are then fed into a novel position-aware multi-scale fusion module that extracts features for text prediction. The proposed modules effectively mitigate scale and semantic misalignment. Experimental results demonstrate significant performance improvements on various handwriting recognition datasets.
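The abstract does not specify how the pseudo-images are generated or how the scales are formed, so the following is only a minimal sketch of the two input-side ideas under stated assumptions. It assumes a trajectory stored as rows of `(x, y, pen_down)`, rasterizes it into a small binary image, and builds coarser views of the sequence by average pooling. The function names `render_pseudo_image` and `average_pool_sequence`, the image size, and the pooling scheme are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def render_pseudo_image(points, size=32, samples_per_segment=64):
    """Rasterize a pen trajectory into a binary (size x size) image.

    points: array of shape (N, 3) with columns (x, y, pen_down), where
    pen_down is 1 while the pen touches the surface. This layout is an
    assumption made for the sketch.
    """
    points = np.asarray(points, dtype=float)
    xy = points[:, :2]
    pen = points[:, 2] > 0.5

    # Scale coordinates into [0, size - 1] while preserving aspect ratio.
    mins = xy.min(axis=0)
    span = max((xy.max(axis=0) - mins).max(), 1e-8)
    scaled = (xy - mins) / span * (size - 1)

    image = np.zeros((size, size), dtype=np.float32)
    t = np.linspace(0.0, 1.0, samples_per_segment)[:, None]
    for i in range(len(scaled) - 1):
        # Draw a segment only when the pen is down at both endpoints.
        if pen[i] and pen[i + 1]:
            seg = scaled[i] * (1 - t) + scaled[i + 1] * t
            cols = np.rint(seg[:, 0]).astype(int)
            rows = np.rint(seg[:, 1]).astype(int)
            image[rows, cols] = 1.0
    return image


def average_pool_sequence(features, factor):
    """Downsample an (N, D) sequence by averaging non-overlapping windows."""
    features = np.asarray(features, dtype=float)
    n = (len(features) // factor) * factor  # drop the incomplete tail window
    return features[:n].reshape(-1, factor, features.shape[1]).mean(axis=1)


# A single diagonal stroke, viewed as an image and at two coarser scales.
stroke = np.array([[0, 0, 1], [5, 5, 1], [10, 10, 1], [15, 15, 1]])
image = render_pseudo_image(stroke)
scales = [stroke[:, :2].astype(float),
          average_pool_sequence(stroke[:, :2], 2)]
```

In this sketch the image supplies the global two-dimensional layout that the raw point sequence lacks, while the pooled sequences give views with different effective receptive fields. How the paper's embedding and fusion modules combine these views is not described in the abstract and is not modeled here.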
Key words
Online Handwriting Recognition, Multi-scale Bimodal Fusion, Position-aware Multi-scale Fusion