Improving stock trend prediction with pretrain multi-granularity denoising contrastive learning

Knowledge and Information Systems(2023)

引用 0|浏览9
暂无评分
摘要
Stock trend prediction (STP) aims to predict price fluctuation, which is critical in financial trading. The existing STP approaches only use market data with the same granularity (e.g., as daily market data). However, in the actual financial investment, there are a large number of more detailed investment signals contained in finer-grained data (e.g., high-frequency data). This motivates us to research how to leverage multi-granularity market data to capture more useful information and improve the accuracy in the task of STP. However, the effective utilization of multi-granularity data presents a major challenge. Firstly, the iteration of multi-granularity data with time will lead to more complex noise, which is difficult to extract signals. Secondly, the difference in granularity may lead to opposite target trends in the same time interval. Thirdly, the target trends of stocks with similar features can be quite different, and different sizes of granularity will aggravate this gap. In order to address these challenges, we present a self-supervised framework of multi-granularity denoising contrastive learning (MDC). Specifically, we construct a dynamic dictionary of memory, which can obtain clear and unified representations by filtering noise and aligning multi-granularity data. Moreover, we design two contrast learning modules during the fine-tuning stage to solve the differences in trends by constructing additional self-supervised signals. Besides, in the pre-training stage, we design the granularity domain adaptation module (GDA) to address the issues of temporal inconsistency and data imbalance associated with different granularity in financial data, alongside the memory self-distillation module (MSD) to tackle the challenge posed by a low signal-to-noise ratio. The GDA alleviates these complications by replacing a portion of the coarse-grained data with the preceding time step’s fine-grained data, while the MSD seeks to filter out intrinsic noise by aligning the fine-grained representations with the coarse-grained representations’ distribution using a self-distillation mechanism. Extensive experiments on the CSI 300 and CSI 100 datasets show that our framework stands out from the existing top-level systems and has excellent profitability in real investing scenarios.
更多
查看译文
关键词
Contrastive learning,Multi-granularity data,Memory,Denoising,Stock trend prediction,Pre-training
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要