Money laundering is a profound global problem. Nonetheless, there is little scientific literature on statistical and machine learning methods for anti-money laundering. In this paper, we focus on anti-money laundering in banks and provide an introduction to and review of the literature. We propose a unifying terminology with two central elements: (i) client risk profiling and (ii) suspicious behavior flagging. We find that client risk profiling is characterized by diagnostics, i.e., efforts to find and explain risk factors. Suspicious behavior flagging, on the other hand, is characterized by non-disclosed features and hand-crafted risk indices. Finally, we discuss directions for future research. One major challenge is the need for more public data sets, which may be addressed by synthetic data generation. Other possible research directions include semi-supervised and deep learning, interpretability, and fairness of the results.