Academic Paper

Analyzing Data Sets for ML-driven Fraud Detection in SAP Systems
Document Type
Conference
Source
2023 IEEE International Conference on Big Data (BigData), pp. 3314-3324, Dec. 2023
Subject
Bioengineering
Computing and Processing
Geoscience
Robotics and Control Systems
Signal Processing and Analysis
Soft sensors
Companies
Feature extraction
Enterprise resource planning
Fraud
Stakeholders
Decision trees
Fraud Detection
Forensic Data Analysis
SAP
Machine Learning
Gradient Boosting Decision Trees
Abstract
Enterprise Resource Planning (ERP) systems are used by companies to support and automate business processes. Users must be granted the permissions necessary to perform their work. Following the principle of least privilege, these permissions should restrict access to only the information and resources required to complete the tasks involved. However, even with a well-tuned authorization concept, some users may still misuse the ERP system to enrich themselves. Besides reduced profit, companies suffer a loss of reputation and of trust among their stakeholders. Furthermore, they may face lawsuits from aggrieved customers or suppliers. Such occupational fraud must therefore be traced and tracked down. Since all business operations are recorded within an ERP system, a variety of data sources is available. This paper explores the wealth of data sources found in SAP ERP, the most widespread ERP system from SAP, the world’s leading vendor of ERP systems, and examines to what extent these data sources are suited for fraud detection. A survey of previous literature shows that the data used in earlier work is of limited applicability, or that no suitable data sources are available for developing and evaluating fraud detection techniques. Therefore, this paper proposes three data sets extracted from SAP ERP. The data sets are made available via GitHub, and machine learning techniques are applied to evaluate their adequacy for fraud detection. In addition, feature importance is examined to increase the transparency of fraud detection.