Multi-scale Spatio-temporal Transformer-based Imbalanced Longitudinal Learning for Glaucoma Forecasting from Irregular Time Series Images
CoRR (2024)
Abstract
Glaucoma is one of the major eye diseases that leads to progressive optic
nerve fiber damage and irreversible blindness, afflicting millions of
individuals. Glaucoma forecasting supports early screening of and
intervention for at-risk patients, which helps prevent further
deterioration of the disease. It uses a series of historical fundus images
of an eye to predict the likelihood of glaucoma occurring in the future.
However, the irregular sampling nature and the imbalanced class distribution
are two challenges in the development of disease forecasting approaches. To
this end, we introduce the Multi-scale Spatio-temporal Transformer Network
(MST-former) based on the transformer architecture tailored for sequential
image inputs, which can effectively learn representative semantic information
from sequential images on both temporal and spatial dimensions. Specifically,
we employ a multi-scale structure to extract features at various resolutions,
which can largely exploit rich spatial information encoded in each image.
Besides, we design a time distance matrix to scale time attention in a
non-linear manner, which could effectively deal with the irregularly sampled
data. Furthermore, we introduce a temperature-controlled Balanced Softmax
Cross-entropy loss to address the class imbalance issue. Extensive experiments
on the Sequential fundus Images for Glaucoma Forecast (SIGF) dataset
demonstrate the superiority of the proposed MST-former method, achieving an AUC
of 98.6%. The method also shows
generalization capability on the Alzheimer's Disease Neuroimaging Initiative
(ADNI) MRI dataset, with an accuracy of 90.3% for
Alzheimer's disease prediction, outperforming the compared method by a large
margin.
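
To illustrate the idea of scaling temporal attention with a time distance matrix, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the non-linear scaling function, so the exponential decay exp(-|t_i - t_j| / tau), the parameter name `tau`, and the choice to apply the decay after the softmax and then renormalize are all assumptions made for illustration.

```python
import math


def time_distance_matrix(times):
    """Pairwise absolute time gaps between visits (same unit as `times`)."""
    return [[abs(ti - tj) for tj in times] for ti in times]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def time_scaled_attention(scores, times, tau=1.0):
    """Down-weight attention between visits that are far apart in time.

    Assumption: an exponential decay exp(-d / tau) stands in for the
    paper's unspecified non-linear scaling; `tau` is a hypothetical knob.
    """
    dist = time_distance_matrix(times)
    attention = []
    for score_row, dist_row in zip(scores, dist):
        weights = softmax(score_row)
        decayed = [w * math.exp(-d / tau) for w, d in zip(weights, dist_row)]
        norm = sum(decayed)  # > 0 because the diagonal gap is always 0
        attention.append([x / norm for x in decayed])
    return attention


# Three visits at irregular intervals (for example, years since first scan).
visit_times = [0.0, 1.0, 5.0]
uniform_scores = [[0.0] * 3 for _ in range(3)]
attn = time_scaled_attention(uniform_scores, visit_times, tau=2.0)
```

With uniform raw scores, each row of `attn` still sums to 1, but the first visit attends more strongly to the visit one unit away than to the visit five units away, which is the behavior an irregularly sampled series calls for.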
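
The class-imbalance loss mentioned above can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: it follows the general Balanced Softmax idea of adding the log class prior to each logit before the softmax, and it assumes the temperature divides the raw logits before that adjustment. The function name, the `temperature` parameter placement, and the example class counts are assumptions.

```python
import math


def balanced_softmax_ce(logits, label, class_counts, temperature=1.0):
    """Cross-entropy on prior-adjusted logits for a single sample.

    Assumption: logits are divided by `temperature`, then shifted by the
    log of each class's training frequency; the paper may place the
    temperature differently.
    """
    total = sum(class_counts)
    adjusted = [
        z / temperature + math.log(n / total)
        for z, n in zip(logits, class_counts)
    ]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


# Hypothetical imbalanced training set: 90 negatives, 10 positives.
counts = [90, 10]
loss_minority = balanced_softmax_ce([0.0, 0.0], label=1, class_counts=counts)
loss_majority = balanced_softmax_ce([0.0, 0.0], label=0, class_counts=counts)
loss_balanced = balanced_softmax_ce([0.0, 0.0], label=0, class_counts=[1, 1])
```

With equal class counts the prior term shifts every logit by the same amount and the loss reduces to ordinary softmax cross-entropy. With imbalanced counts, an uninformative prediction is penalized more heavily on the minority class, so the model has to produce a larger minority-class logit to reach the same loss.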