We study the estimation of grouping structures and factor loadings in asset pricing models, where traditional regressions struggle because the data are sparse and noisy. Existing approaches, such as those based on fused penalties and multi-task learning, often enforce coefficient homogeneity across cross-sectional units, which reduces flexibility. Clustering methods (e.g., spectral clustering, Lloyd's algorithm) achieve consistent recovery under specific conditions but typically rely on a single data source. To address these limitations, we introduce the Panel Coupled Matrix-Tensor Clustering (PMTC) model, which simultaneously leverages a characteristics tensor and a return matrix to identify latent asset groups. By integrating these two data sources, we develop computationally efficient tensor clustering algorithms that improve both clustering accuracy and factor loading estimation. Simulations show that our methods outperform single-source alternatives on both measures, particularly under moderate signal-to-noise conditions. An empirical application to U.S. equities illustrates the practical value of PMTC, yielding higher out-of-sample total $R^2$ and economically interpretable variation in factor exposures.
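To make the idea of clustering from two coupled data sources concrete, the following is a minimal sketch and not the PMTC algorithm itself, whose details are not given in this abstract. It assumes a characteristics tensor of shape (assets, periods, characteristics) and a return matrix of shape (assets, periods), unfolds the tensor along the asset mode, concatenates it with the returns, and applies a generic spectral step followed by Lloyd iterations. The function names `simulate_panel` and `coupled_cluster`, the `weight` parameter, the block normalization, and the farthest-point initialization are all illustrative assumptions.

```python
import numpy as np


def simulate_panel(n_per_group=20, T=40, seed=0):
    """Toy panel with 3 latent asset groups (hypothetical data, for illustration only)."""
    rng = np.random.default_rng(seed)
    k, P = 3, 3
    truth = np.repeat(np.arange(k), n_per_group)   # true group of each asset
    mu = 2.0 * np.eye(k, P)                        # group-specific characteristic means
    beta = np.array([-1.0, 0.0, 1.0])              # group-specific factor loadings
    f = rng.normal(size=T)                         # a single common factor
    n = truth.size
    X = mu[truth][:, None, :] + 0.1 * rng.normal(size=(n, T, P))
    R = beta[truth][:, None] * f[None, :] + 0.1 * rng.normal(size=(n, T))
    return X, R, truth


def coupled_cluster(X, R, k, weight=1.0, n_iter=50):
    """Cluster assets from a characteristics tensor X (N, T, P) and returns R (N, T).

    Assumption: a generic spectral embedding plus Lloyd's algorithm on the
    concatenated asset-mode unfolding, not the method proposed in the paper.
    """
    N = X.shape[0]
    X1 = X.reshape(N, -1)                          # asset-mode unfolding: one row per asset
    X1 = X1 / np.linalg.norm(X1)                   # put both blocks on a comparable scale
    R1 = R / np.linalg.norm(R)
    Z = np.hstack([X1, np.sqrt(weight) * R1])      # couple the two data sources

    # Spectral step: project each asset onto the leading k singular directions.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    E = U[:, :k] * s[:k]

    # Deterministic farthest-point initialization of the k centers.
    centers = [E[0]]
    for _ in range(1, k):
        d = np.min([((E - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(E[d.argmax()])
    centers = np.array(centers)

    # Lloyd iterations: assign each asset to its nearest center, then update centers.
    for _ in range(n_iter):
        dist = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            E[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


if __name__ == "__main__":
    X, R, truth = simulate_panel()
    labels = coupled_cluster(X, R, k=3)
    print(labels)
```

In this sketch, `weight` controls how much the return block contributes relative to the characteristics block. Setting it near zero reduces the procedure to clustering on characteristics alone, which gives a simple single-source baseline for comparison.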