Rapid discovery of new reactions and molecules in recent years has been facilitated by advances in high-throughput screening, access to a far more complex chemical design space, and the development of accurate molecular modeling frameworks. A holistic study of the growing chemistry literature is therefore needed, one that focuses on understanding recent trends and extrapolating them into possible future trajectories. To this end, several network-theory-based studies have been reported that use a directed graph representation of chemical reactions. Here, we instead represent chemical reactions as hypergraphs, in which hyperedges represent reactions and nodes represent the participating molecules. We use a standard reaction dataset to construct a hypernetwork and report its statistics: degree distributions, average path length, assortativity (degree correlations), PageRank centrality, and graph-based clusters (communities). We also compute each statistic for an equivalent directed graph representation of the reactions to draw parallels and highlight differences between the two. To demonstrate the applicability of the hypergraph reaction representation to AI tasks, we generate dense hypergraph embeddings and use them for reaction classification. We conclude that the hypernetwork representation is flexible, preserves reaction context, and reveals insights that are not apparent in a traditional directed graph representation of chemical reactions.
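The contrast between the two representations can be illustrated with a minimal sketch. It assumes three toy reactions with placeholder molecule names (A to F) and takes the directed graph to be the pairwise reactant-to-product projection of each reaction; the dataset, the function names, and the choice of projection are illustrative assumptions, not the construction used in the study.

```python
from collections import defaultdict
from itertools import product

# Toy reactions (hypothetical, for illustration only): reactants -> products.
reactions = {
    "r1": ({"A", "B"}, {"C"}),
    "r2": ({"C", "D"}, {"E", "F"}),
    "r3": ({"A"}, {"F"}),
}

def hypergraph_degrees(rxns):
    """Number of hyperedges (reactions) each molecule participates in."""
    deg = defaultdict(int)
    for reactants, products in rxns.values():
        for mol in reactants | products:
            deg[mol] += 1
    return dict(deg)

def directed_projection(rxns):
    """Pairwise reactant -> product edges; co-reactant context is dropped."""
    edges = set()
    for reactants, products in rxns.values():
        edges.update(product(reactants, products))
    return edges

def digraph_degrees(edges):
    """Total (in + out) degree of each molecule in the directed graph."""
    deg = defaultdict(int)
    for src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return dict(deg)

hyper_deg = hypergraph_degrees(reactions)
edges = directed_projection(reactions)
graph_deg = digraph_degrees(edges)

print("hypergraph degrees:", hyper_deg)
print("directed-graph degrees:", graph_deg)
```

In this toy case, molecule C takes part in two reactions, so its hypergraph degree is 2, while its degree in the projected directed graph is 4. The edge A -> C also no longer records that B was a co-reactant, which is the kind of reaction context a hyperedge keeps.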