Enabling Machine Learning Algorithms for Credit Scoring -- Explainable Artificial Intelligence (XAI) methods for clear understanding complex predictive models

April 14, 2021

Przemysław Biecek, Marcin Chlebus, Janusz Gajda, Alicja Gosiewska, Anna Kozak, Dominik Ogonowski, Jakub Sztachelski, Piotr Wojewnik

The rapid development of advanced modelling techniques makes it possible to build increasingly accurate tools. As usual, however, everything comes at a price: here, the price of gaining accuracy and precision is losing the interpretability of the model. For managers who must control and effectively manage credit risk, and for regulators who must be convinced of model quality, that price is too high. In this paper, we show how to take credit scoring analytics to the next level. We compare various predictive models (logistic regression, logistic regression with weight of evidence transformations, and modern artificial intelligence algorithms) and show that advanced tree-based models give the best results in predicting client default. More importantly, we also show how to augment advanced models with techniques that make them interpretable and more accessible to credit risk practitioners, removing a crucial obstacle to the widespread deployment of more complex 'black box' models such as random forests, gradient boosted trees, or extreme gradient boosted trees. All of this is demonstrated on a large dataset obtained from the Polish Credit Bureau, to which all banks and most lending companies in the country report their credit files; the data used in this paper come from lending companies. The paper thus compares state-of-the-art best practices in credit risk modelling with advanced modern statistical tools, supported by the latest developments in the interpretability and explainability of artificial intelligence algorithms. We believe this is a valuable contribution as a presentation of different modelling tools and, more importantly, as a demonstration of which methods can be used to gain insight into and understanding of AI methods in the credit risk context.
