Multi-modal Medical Data Exploration Based on Data Lake.

Tao Zhao, Nan Hai, Wenyao Li, Wenkui Zheng, Yong Zhang, Xin Li, Gao Fei

HIS (2023)

Abstract
In the field of medicine, the rapid increase in medical devices has generated a substantial volume of multi-modal data, encompassing structured, semi-structured, and unstructured formats. Data fusion and exploration are crucial to enable medical professionals to integrate and locate specific datasets efficiently. However, traditional relational databases exhibit poor query performance when dealing with large-scale data, and data warehouse platforms struggle to integrate diverse, multi-source heterogeneous medical data while maintaining efficiency. This paper proposes a distributed computing and storage strategy based on data lake technology to address these challenges. The data lake platform offers a solution for storing and integrating multi-modal data from various sources, along with powerful data exploration capabilities that facilitate the rapid identification of desired datasets. Experimental results indicate that data lake technology outperforms traditional relational databases, significantly reducing storage space requirements and query processing time. As a result, it enables the rapid processing of large-scale medical data, demonstrating excellent performance in medical data management and analysis tasks.
Keywords
data lake, exploration, multi-modal