Academic Paper

Scalable Semi-Supervised Graph Learning Techniques for Anti Money Laundering
Document Type
Periodical
Source
IEEE Access, vol. 12, pp. 50012-50029, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Topology
Reliability
Fraud
Data models
Vectors
Synthetic data
Semisupervised learning
Anti-money laundering
machine learning on graphs
graph embedding
Language
ISSN
2169-3536
Abstract
Money laundering is the process by which criminals move large sums of illicit money to hidden locations and integrate them as legal funds through existing financial services. The United Nations (UN) estimates that 2 to 5% of global GDP, which is approximately 0.8 to 2.0 trillion dollars, is laundered globally every year. Therefore, accurately identifying such globally alarming activities is crucial for enforcing anti-money laundering (AML) measures. Numerous techniques have been proposed to detect money laundering from transaction graphs of money transfers between bank accounts by analysing the structural and behavioural dynamics of their corresponding dense subgraphs. However, these techniques often do not consider that money laundering usually involves high-volume flows of funds through chains of bank accounts. Moreover, most AML approaches either result in lower detection accuracy or incur higher computational costs, making them less reliable and less suitable for real financial systems. Consequently, only a fraction of money laundering activities can be detected and prevented. In this paper, we propose an efficient approach to AML by employing semi-supervised graph learning techniques on a large-scale financial transaction graph in both pipeline settings (i.e., graph embedding models are first trained to generate node embeddings that are combined with additional topological graph features to train binary classifiers) and end-to-end settings (i.e., node classification is performed by training SkipGCN, FastGCN, and EvolveGCN without requiring separate classifiers) to identify nodes involved in potential money laundering activities. We evaluate our approach on four datasets: AMLSim, Elliptic, IBM AML, and SynthAML, with a view to scalability and practicality for real financial systems.
Further, we provide local (e.g., how money is laundered between nodes) and global (e.g., what factors contribute to money laundering) explanations of the predictions by highlighting the predominant factors in money laundering cases and elucidating the mechanisms of illicit fund transfers between nodes to enhance the interpretability and transparency of the AML models. Experimental results suggest that our approach is scalable and effective at detecting money laundering from real and synthetic transaction graphs.