Multimodal Generative Models for Bankruptcy Prediction Using Textual Data
arXiv (2022)
Abstract
Textual data from financial filings, e.g., the Management's Discussion and
Analysis (MDA) section in Form 10-K, has been used to improve the prediction
accuracy of bankruptcy models. In practice, however, we cannot obtain the MDA
section for all public companies, which limits the use of MDA data in
traditional bankruptcy models, as they need complete data to make predictions.
The two main reasons for the lack of MDA are: (i) not all companies are obliged
to submit the MDA and (ii) technical problems arise when crawling and scraping
the MDA section. To solve this limitation, this research introduces the
Conditional Multimodal Discriminative (CMMD) model that learns multimodal
representations that embed information from accounting, market, and textual
data modalities. The CMMD model needs a sample with all data modalities for
model training. At test time, the CMMD model only needs access to accounting
and market modalities to generate multimodal representations, which are further
used to make bankruptcy predictions and to generate words from the missing MDA
modality. With this novel methodology, it is realistic to use textual data in
bankruptcy prediction models, since accounting and market data are available
for all companies, unlike textual data. The empirical results of this research
show that if financial regulators, or investors, were to use traditional models
using MDA data, they would only be able to make predictions for 60% of the
companies. Furthermore, the classification performance of our proposed
methodology is superior to that of a large number of traditional classifier
models, taking into account all the companies in our sample.
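
The coverage limitation described above can be illustrated with a minimal sketch. It is not the CMMD model and not the authors' code. It only contrasts how many companies a model that needs complete data can score with how many a model that needs just accounting and market data at test time can score. The `CompanyRecord` class, the helper functions, and the five toy records are hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class CompanyRecord:
    """Hypothetical container for one company observation (illustration only)."""
    accounting: Sequence[float]
    market: Sequence[float]
    mda_text: Optional[str] = None  # None when the MDA section could not be obtained


def has_all_modalities(record: CompanyRecord) -> bool:
    # A model that needs complete data can only score records that include the MDA text.
    return record.mda_text is not None


def has_accounting_and_market(record: CompanyRecord) -> bool:
    # A model that needs only accounting and market data at test time.
    return len(record.accounting) > 0 and len(record.market) > 0


def coverage(records: List[CompanyRecord],
             usable: Callable[[CompanyRecord], bool]) -> float:
    """Fraction of records for which a prediction can be made."""
    if not records:
        return 0.0
    return sum(1 for r in records if usable(r)) / len(records)


# Made-up toy data: five companies, two of them without an MDA section.
records = [
    CompanyRecord([0.12, 1.5], [0.30], "Liquidity remained adequate."),
    CompanyRecord([0.05, 0.9], [0.55], None),
    CompanyRecord([0.20, 2.1], [0.25], "Revenue declined during the year."),
    CompanyRecord([0.01, 0.4], [0.80], None),
    CompanyRecord([0.15, 1.8], [0.35], "Debt covenants were renegotiated."),
]

complete_data_coverage = coverage(records, has_all_modalities)
two_modality_coverage = coverage(records, has_accounting_and_market)

print(f"Needs MDA at test time: {complete_data_coverage:.0%} of companies scored")
print(f"Needs accounting and market only: {two_modality_coverage:.0%} of companies scored")
```

In this toy sample the first predicate drops every record without text, while the second keeps all records, which mirrors the argument that accounting and market data are available for all companies.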