Graph Embeddings for Non-IID Data Feature Representation Learning.

AusDM (2022)

Abstract
Most machine learning models, such as Random Forest (RF) and Support Vector Machine (SVM), assume that the features in a dataset are independent and identically distributed (IID). However, many real-world datasets contain structural dependencies, so neither the data observations nor the features satisfy this IID assumption. In this paper, we propose to incorporate the latent structural information in the data and learn the best embeddings for downstream classification tasks. Specifically, we build traffic knowledge graphs for a traffic-related dataset and apply node2vec and TransE to learn the graph embeddings, which are then fed into three machine learning algorithms, namely SVM, RF, and kNN, to evaluate their performance on various classification tasks. We compare the performance of these three classification models under two different representations of the same dataset: the first representation is based on traffic speed, volume, and speed limit; the second representation is the graph embeddings learned from the traffic knowledge graph. Our experimental results show that the road network information captured in the knowledge graphs is crucial for predicting traffic risk levels. Through our empirical analysis, we demonstrate that knowledge graphs can be effectively used to capture the structural information in non-IID datasets.
Keywords
Graph embedding, Knowledge graph, Non-IID
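
The abstract names TransE as one of the two embedding methods applied to the traffic knowledge graph. As a rough illustration only, the sketch below shows the standard TransE idea: a triple (head, relation, tail) is scored by the distance between head + relation and tail, and a margin ranking loss pushes true triples to score lower than corrupted ones. The entity names, the relation name, and the two-dimensional vectors are hypothetical and hand-set for readability; they are not taken from the paper's data, and the paper's actual training setup is not described in the abstract.

```python
import math

def transe_score(h, r, t):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss for one true triple and one corrupted triple."""
    return max(0.0, margin + transe_score(*pos) - transe_score(*neg))

# Hypothetical, hand-set embeddings for three road segments and one relation.
entity_emb = {
    "road_a": [0.0, 0.0],
    "road_b": [1.0, 0.0],
    "road_c": [0.0, 3.0],
}
relation_emb = {"connects_to": [1.0, 0.0]}

# Assumed true triple: road_a connects_to road_b.
true_triple = (entity_emb["road_a"], relation_emb["connects_to"], entity_emb["road_b"])
# Corrupted triple: the tail is swapped for road_c.
corrupted_triple = (entity_emb["road_a"], relation_emb["connects_to"], entity_emb["road_c"])

true_score = transe_score(*true_triple)
corrupted_score = transe_score(*corrupted_triple)
loss = margin_loss(true_triple, corrupted_triple)
```

In a pipeline like the one the abstract describes, the learned entity vectors (here `entity_emb`) would play the role of the feature representation passed to classifiers such as SVM, RF, or kNN, in place of the raw speed, volume, and speed-limit features.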