Remix.run Logo
Bostonian 18 hours ago

Quoting the paper:

Our approach consists of two stages: First, we input a large amount of unstructured news text into GraphRAG, where during the clustering and retrieval process, global statistics are dynamically updated to accurately cluster relationships between entities. This results in a directed knowledge graph of companies, with edges representing both direct and indirect relationships between companies. In the second stage, inspired by Ibrahim et al. (2022) and Barigozzi & Brownlees (2019), we transform the knowledge graph into a mask matrix using the Leiden algorithm Traag et al. (2019). This matrix is then used as a regularizer in the multivariate time series model, combined with relevant covariates. This method effectively enhances the generalization ability of models on scarce company data and significantly improves prediction performance, as illustrated in Figure 1. The specific algorithm details are provided in Algorithms 1 and 2.