A new approach called the Neural Attention Forest (NAF) is proposed for solving regression and classification tasks on tabular data. The main idea is to introduce the attention mechanism into the random forest: attention weights computed by neural networks of a specific form are assigned to the data in the leaves of the decision trees and to the random forest itself, within the framework of Nadaraya-Watson kernel regression. In contrast to existing models such as the attention-based random forest, the attention weights and the Nadaraya-Watson regression are represented as neural networks whose weights are trainable parameters. The first part of the network, with weights shared across all trees, computes the attention weights of the data in the leaves. The second part aggregates the outputs of the tree networks and is trained to minimize the difference between the random forest prediction and the true target value from the training set. The whole network is trained end to end. Together, the random forest and the neural networks implementing the attention mechanism form a transformer that enhances the forest predictions. Numerical experiments with real datasets illustrate the proposed method. The code implementing the approach is publicly available.
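The two-level Nadaraya-Watson attention described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a fixed softmax kernel over negative squared distances, with a hypothetical temperature `tau`, stands in for the trainable neural networks, and the names `nw_attention` and `forest_attention_predict` are assumptions made for illustration. The leaves are passed in directly as arrays, one per tree, rather than taken from a fitted forest.

```python
import numpy as np


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    shifted = scores - np.max(scores)
    w = np.exp(shifted)
    return w / w.sum()


def nw_attention(query, keys, values, tau=1.0):
    """Nadaraya-Watson estimate: attention-weighted average of the values.

    The weights are a softmax over negative squared distances between the
    query and each key (an assumed fixed kernel, not a trained network).
    """
    scores = -np.sum((keys - query) ** 2, axis=1) / tau
    weights = softmax(scores)
    return weights @ values, weights


def forest_attention_predict(x, leaves_X, leaves_y, tau=1.0):
    """Two-level attention over leaves, then over trees.

    leaves_X[k], leaves_y[k] hold the training features and targets in the
    leaf of tree k that x falls into (supplied by the caller here).
    """
    tree_keys, tree_values = [], []
    for X_leaf, y_leaf in zip(leaves_X, leaves_y):
        # Level 1: attention over the training points inside the leaf.
        y_k, w = nw_attention(x, X_leaf, y_leaf, tau)
        tree_keys.append(w @ X_leaf)   # attention-weighted leaf key
        tree_values.append(y_k)        # attention-weighted leaf prediction
    # Level 2: attention over the trees.
    pred, _ = nw_attention(x, np.array(tree_keys), np.array(tree_values), tau)
    return pred
```

In the proposed model, both levels would instead be computed by neural networks with trainable weights, with the first-level weights shared across trees, and the whole pipeline would be trained end to end against the training targets.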