Delay-sensitive approaches for anonymizing numerical streaming data

International Journal of Information Security (2013)

Abstract
Streaming data are widely used in today’s world. Data come from different sources in streams and must be processed online and with minimum delay. These data streams can contain confidential data, such as customers’ purchase information, and need to be mined in order to reveal other useful information, like customers’ purchase patterns. Privacy preservation throughout these processes plays a crucial role. K-anonymity is a well-known technique for preserving privacy. The principal issues in k-anonymity are information loss and running time. Although some of the existing k-anonymity techniques are able to generate anonymized data with acceptable information loss, their main drawback is that they are very time-consuming and are not applicable in a streaming context, since streaming data are usually very sensitive to delay and need to be processed quickly. In [32], we proposed a cluster-based k-anonymity algorithm called fast anonymizing algorithm for numerical streaming data (FAANST), which can anonymize numerical streaming data quickly while providing an admissible information loss. The main drawback of FAANST is that some tuples may remain in the system for a long time and are output only when they might already be considered to have expired. In this paper, we propose two extensions for FAANST: passive and proactive solutions. These two solutions put a soft deadline, called delay, on the time each tuple can stay in the system, and if a tuple passes this deadline, these algorithms force the tuple to be output. The proactive solution goes one step further and uses a simple heuristic function to predict when a tuple in the system may expire, outputting the tuple if it will expire in the next round of the algorithm’s execution.
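The difference between the two deadline policies can be illustrated with a minimal sketch. This is not the paper's implementation: the names `PendingTuple`, `passive_expired`, and `proactive_expiring` are hypothetical, time is assumed to be counted in whole algorithm rounds, and the proactive heuristic is assumed to be the simplest possible one (a tuple will have aged by exactly one more round at the next execution). The clustering and anonymization steps of FAANST are omitted entirely.

```python
from dataclasses import dataclass


@dataclass
class PendingTuple:
    """A numerical tuple waiting in the buffer (hypothetical structure)."""
    value: float
    arrival_round: int  # round in which the tuple entered the system


def passive_expired(buffer, current_round, delay):
    """Passive policy: select tuples that have already passed the deadline."""
    return [t for t in buffer if current_round - t.arrival_round > delay]


def proactive_expiring(buffer, current_round, delay):
    """Proactive policy: select tuples that would pass the deadline by the
    next round. Assumption: the heuristic is simply 'age plus one round'."""
    next_round = current_round + 1
    return [t for t in buffer if next_round - t.arrival_round > delay]


# Example buffer: three tuples that arrived in rounds 0, 3, and 5.
buffer = [PendingTuple(1.0, 0), PendingTuple(2.0, 3), PendingTuple(3.0, 5)]
passive_values = [t.value for t in passive_expired(buffer, 5, 2)]
proactive_values = [t.value for t in proactive_expiring(buffer, 5, 2)]
```

Under these assumptions, with a delay of 2 rounds at round 5, the passive check selects only the tuple that has already waited too long, while the proactive check also selects the tuple that would exceed the deadline at round 6.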
Keywords
K-anonymity,Privacy-preserving data mining,Streaming data