Maximum entropy methods, based on the inverse Ising/Potts problem from statistical mechanics, are essential for modeling interactions between pairs of variables in data-driven problems across disciplines such as bioinformatics, ecology, and neuroscience. Despite their considerable success, these methods typically fail to capture the higher-order interactions that are often crucial for understanding complex systems. Conversely, modern machine learning methods capture these complex interactions, but the computational cost of interpretable frameworks makes them impractical for real-world applications. Restricted Boltzmann Machines (RBMs) provide a computationally efficient way to capture statistical correlations using hidden nodes in a bipartite neural network. In this study, we introduce a new method that maps RBMs to generalized Potts models, allowing for the extraction of interactions up to any specified order. This method uses large-$N$ approximations, enabled by the RBM's simple structure, to extract effective many-body couplings with minimal computational effort. Furthermore, we propose a robust framework for extracting higher-order interactions in more complex probabilistic models and a simple gauge-fixing method within the effective many-body Potts model. Our validation on synthetic datasets confirms the method's ability to recover two- and three-body interactions accurately. When applied to protein sequence data, the framework reconstructs protein contact maps and performs comparably to the best inverse Potts models. These findings confirm that RBMs are an effective and streamlined tool for exploring higher-order interactions within complex systems.