Academic Paper

Data-Driven Anomaly Detection and Early Warning Issues
Document Type
Conference
Source
2022 2nd Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS), pp. 423-430, Feb. 2022
Subject
Computing and Processing
Production systems
Time-frequency analysis
Fluctuations
Computational modeling
Time series analysis
Predictive models
Data models
Risk anomaly detection
box plot
TOPSIS model
entropy weight method
gray prediction model
Language
Abstract
This paper studies data-driven anomaly detection and early warning. Box plots are used to screen fluctuations in the time-series data recorded by instruments and equipment in the production area of a manufacturing company between 00:00:00 and 22:59:59 on a single day (the data have been desensitized). The fluctuations are classified as risk anomalies or non-risk anomalies. The main possible causes of the fluctuations are weighted, the abnormality level of the risk anomalies is evaluated quantitatively, early warnings of risk anomalies are issued with a gray prediction model, and a TOPSIS model is used to evaluate the safety of the company's entire production system. First, the sensor time-series data are preprocessed to identify the factors behind the abnormal readings. Sensor numbers whose data do not change over time are removed, and scatter plots and box plots of the remaining time-varying data are drawn to find outliers. The outliers are deduplicated and traced back to their time indices. Next, a table of strong correlations is built and the rate of change of each indicator is calculated. Once risk and non-risk anomalies have been distinguished, the abnormal points are indexed and located. Second, the entropy weight method is applied twice to weight the indicators. From the known data, an evaluation matrix is formed with the $\mathrm{m}$ ($\mathrm{m}=100$) sensor numbers as evaluation objects and the $\mathrm{n}$ ($\mathrm{n}=5519$) time points as evaluation indicators; the matrix is standardized, the information-entropy redundancy is calculated, and the weight of each indicator at each time is obtained.
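As an illustration of the entropy weight step described above, the following is a minimal sketch, not the authors' implementation. It assumes min-max standardization and treats rows as evaluation objects and columns as indicators; the function name `entropy_weights` and the small random demo matrix are hypothetical.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method: rows are evaluation objects, columns are indicators.

    Hypothetical helper; min-max standardization is an assumption.
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape

    # Min-max standardization of each indicator (column).
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    Z = (X - col_min) / np.where(col_range > 0, col_range, 1.0)

    # Share of each object within an indicator. A constant column carries
    # no information, so it is given uniform shares (entropy 1, weight 0).
    col_sum = Z.sum(axis=0)
    P = np.where(col_sum > 0, Z / np.where(col_sum > 0, col_sum, 1.0), 1.0 / m)

    # Information entropy per indicator, with 0 * log(0) taken as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        logs = np.where(P > 0, np.log(P), 0.0)
    E = -(P * logs).sum(axis=0) / np.log(m)

    # Redundancy (degree of divergence) and normalized weights.
    D = np.clip(1.0 - E, 0.0, None)
    if D.sum() == 0:
        return np.full(n, 1.0 / n)  # every indicator is constant
    return D / D.sum()

# Demo on a small synthetic matrix (not the paper's data).
rng = np.random.default_rng(0)
demo = rng.random((10, 4))
demo_weights = entropy_weights(demo)
print(demo_weights)
```

Indicators whose values vary more across objects receive lower entropy and therefore a larger weight; the weights always sum to 1.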
Based on the weighted data obtained at different times, the degree of abnormality of the risk anomalies is converted into a score on a percentage (100-point) scale and ranked in Excel, which yields a quantitative evaluation of the risk anomalies. Third, the degree of abnormality of the risk anomalies is taken as the indicator for a model that predicts the degree of risk abnormality in the production process. Historical data on the degree of risk abnormality are substituted into the risk-anomaly evaluation system of Question 2 to establish a preliminary gray prediction model. Cubic spline interpolation is then used to improve the model's accuracy, and the data with the largest deviation in each time period, together with the corresponding abnormal sensor number, are calculated. Finally, from the model results of Questions 2 and 3, the frequency of each degree of deviation is obtained for every half hour between 00:00:00 and 23:59:59. The entropy weight model of Question 2 gives the relative weights of the corresponding indicators; with non-risk anomalies also taken into account, a TOPSIS comprehensive evaluation model is established and a safety score for the company's production system is obtained. The trend of the comprehensive score is used to describe the state of the entire production system and to observe how the indicators change relative to one another. The conclusions obtained are verified to be accurate.
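The TOPSIS scoring mentioned above can be sketched as follows. This is a generic textbook formulation under stated assumptions, not the paper's exact model: it uses vector normalization, externally supplied weights (for example from an entropy weight step), and a `benefit` flag per indicator. The function name `topsis` and the toy data are hypothetical.

```python
import numpy as np

def topsis(X, weights, benefit):
    """Return a closeness score in [0, 1] for each row (evaluation object).

    X       : objects x indicators matrix
    weights : one weight per indicator
    benefit : True where larger is better, False where smaller is better
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalize each indicator, then apply the weights.
    norm = np.sqrt((X ** 2).sum(axis=0))
    V = X / np.where(norm > 0, norm, 1.0) * w

    # Positive and negative ideal solutions per indicator.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

    # Euclidean distance of each object to both ideals.
    d_pos = np.sqrt(((V - ideal) ** 2).sum(axis=1))
    d_neg = np.sqrt(((V - anti) ** 2).sum(axis=1))

    # Relative closeness; identical objects get a neutral 0.5.
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.full_like(d_neg, 0.5), where=denom > 0)

# Toy example: three periods, two indicators (made-up numbers).
# Column 0 is treated as a benefit indicator, column 1 as a cost indicator.
toy = [[1, 9], [5, 5], [9, 1]]
toy_scores = topsis(toy, weights=[0.5, 0.5], benefit=[True, False])
print(toy_scores)
```

A score close to 1 means the object lies near the positive ideal solution. In a setting like the one in the abstract, rows could be half-hour periods and columns the frequencies of each degree of deviation, which would normally be treated as cost indicators.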