争取全面了解与相互抵触的预培训前的数学问题 (Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training)

Understanding mathematical questions effectively is a crucial task, which can benefit many applications, such as difficulty estimation. Researchers have drawn much attention to designing pre-training models for question representations due to the scarcity of human annotations (e.g., labeling difficulty). However, unlike general free-format texts (e.g., user comments), mathematical questions are generally designed with explicit purposes and mathematical logic, and usually consist of more complex content, such as formulas, and related mathematical knowledge (e.g., Function). Therefore, the problem of holistically representing mathematical questions remains underexplored. To this end, in this paper, we propose a novel contrastive pre-training approach for mathematical question representations, namely QuesCo, which attempts to bring questions with more similar purposes closer. Specifically, we first design two-level question augmentations, including content-level and structure-level, which generate literally diverse question pairs with similar purposes. Then, to fully exploit hierarchical information of knowledge concepts, we propose a knowledge hierarchy-aware rank strategy (KHAR), which ranks the similarities between questions in a fine-grained manner. Next, we adopt a ranking contrastive learning task to optimize our model based on the augmented and ranked questions. We conduct extensive experiments on two real-world mathematical datasets. The experimental results demonstrate the effectiveness of our model.

翻译：有效地理解数学问题是一项至关重要的任务,它能够有益于许多应用,例如困难估计等。研究人员已经非常注意设计培训前模型,以便因缺少人类说明(例如标签困难)而进行问题陈述。然而,与一般的自由格式文本(例如用户评论)不同,数学问题一般设计有明确的目的和数学逻辑,通常包含更为复杂的内容,如公式和相关的数学知识(例如功能)等。因此,整体上代表数学问题的问题仍未得到充分探讨。为此,我们在本文件中提议对数学问题表述采取新的对比性培训前方法,即Quesco,试图使问题更接近相似的目的。具体地说,我们首先设计两个层次的问题增强,包括内容层次和结构层次,产生具有类似目的的完全不同的问题配对。然后,为了充分利用知识概念的等级信息,我们建议一种知识等级-认知级战略(KHAR),以精细的方式排列问题之间的相似之处。我们随后采用了一种比较式的预选前培训前方法,即试图使问题更加接近。我们首先设计出两个层次的问题,包括内容层次和结构层面,从而产生非常不同的问题。然后,我们真正的实验性地实验性地试验了我们的数据等级,以最优化的模型。