Design Domain Specific Neural Network via Symbolic Testing

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)

Abstract
Deep sequence networks such as multi-head self-attention networks provide a promising way to extract effective representations from raw sequence data in an end-to-end fashion, and have shown great success in domains such as natural language processing and computer vision. However, in domains such as financial risk management and anti-fraud, where expert-derived features are heavily relied on, deep sequence models have struggled to dominate. In this paper, we introduce a simple framework called symbolic testing to verify the learnability of certain expert-derived features over sequence data. A systematic investigation over simulated data reveals that the self-attention architecture fails to learn some standard symbolic expressions, such as the count-distinct operation. To overcome this deficiency, we propose a novel architecture named SHORING, which contains two components: an event network and a sequence network. The event network efficiently learns arbitrary high-order event-level conditional embeddings via a reparameterization trick, while the sequence network integrates domain-specific aggregations into the sequence-level representation, thereby providing richer inductive biases compared to standard sequence architectures like self-attention. We conduct comprehensive experiments and ablation studies on synthetic datasets that mimic sequence data commonly seen in the anti-fraud domain, and on three real-world datasets. The results show that SHORING learns commonly used symbolic features well, and experimentally outperforms the state-of-the-art methods by a significant margin on real-world online transaction datasets. The symbolic testing framework and SHORING have been applied in anti-fraud model development at Alipay and have improved the performance of models for real-time fraud detection.
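The symbolic-testing idea described in the abstract can be sketched roughly as follows: simulate event sequences, label each one with a known symbolic expression (here, count distinct), and then check whether a candidate model can recover that label. This is only a minimal illustration under stated assumptions, not the paper's implementation. The function names `count_distinct` and `make_symbolic_dataset`, the uniformly random integer event IDs, and all parameters are hypothetical choices made for this example.

```python
import random


def count_distinct(seq):
    """Symbolic target: number of distinct event IDs in a sequence."""
    return len(set(seq))


def make_symbolic_dataset(n_samples, seq_len, vocab_size, seed=0):
    """Build (sequence, label) pairs for a symbolic test.

    Assumption for illustration: events are integer IDs drawn uniformly
    at random from range(vocab_size); the paper's simulator may differ.
    """
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_samples):
        seq = [rng.randrange(vocab_size) for _ in range(seq_len)]
        dataset.append((seq, count_distinct(seq)))
    return dataset


if __name__ == "__main__":
    # Print a few simulated sequences with their count-distinct labels.
    for seq, label in make_symbolic_dataset(3, seq_len=8, vocab_size=5):
        print(seq, "->", label)
```

A model under test would be trained on such pairs and evaluated on held-out sequences; a large remaining error on a target this simple would suggest the architecture lacks the inductive bias to express it.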
Keywords
neural network, symbolic, testing