Academic Paper

Data Streams Management: Multidimensional Summary with Big Data Tools
Document Type
Conference
Source
2022 5th International Conference on Computing and Big Data (ICCBD), pp. 50-55, Dec. 2022
Subject
Computing and Processing
Geography
Sentiment analysis
Memory
Machine learning
Big Data
Data models
Windows
data streams
summary
nosql
big data
multidimensional
Abstract
Data streams are large, fast, varied, and sometimes multidimensional. These characteristics make their processing and storage a real challenge and reduce the possibilities of querying them a posteriori. It therefore becomes necessary to build summaries of these data streams so that data already unloaded from the system can still be analyzed. Many multidimensional summaries of data streams have been proposed, but with some limitations. This article presents a new multidimensional summary approach for big data streams, called StreamCubeCascadeMode, together with functions that create, load, and update the summary when new events arrive in the system. The proposal is realized by constructing data storage structures called cubes, which are calculated at the expiration of a time window taken from a tilted-time window. The solution is implemented with Big Data tools, which allow it to optimize storage and processing resources through the ability to scale.
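The abstract does not give the internals of StreamCubeCascadeMode, so the following is only a minimal sketch of the general idea it names: events are aggregated into a cube, the cube is sealed when its time window expires, and older cubes are merged into coarser ones in the manner of a tilted-time window. The class name `TiltedCubeSummary`, the `capacities` parameter, the methods `add_event` and `close_window`, and the choice of a sum as the aggregate are all assumptions made for illustration. Plain in-memory dictionaries stand in for the Big Data and NoSQL tools used in the paper.

```python
from collections import defaultdict


class TiltedCubeSummary:
    """Hypothetical toy summary: fine-grained cubes cascade into coarser ones."""

    def __init__(self, capacities=(4, 3)):
        # capacities[i] is the number of cubes held at level i before they are
        # merged into one cube at level i + 1. The last level is unbounded.
        self.capacities = capacities
        self.levels = [[] for _ in range(len(capacities) + 1)]
        self.current = defaultdict(float)  # cube of the window that is still open

    def add_event(self, dims, measure):
        # dims is a tuple of dimension values, e.g. ("Paris", "temperature").
        self.current[dims] += measure

    def close_window(self):
        # Call this when the current time window expires: seal the cube
        # and push it into the finest level of the cascade.
        cube = dict(self.current)
        self.current = defaultdict(float)
        self._push(0, cube)

    def _push(self, level, cube):
        self.levels[level].append(cube)
        is_bounded = level < len(self.capacities)
        if is_bounded and len(self.levels[level]) == self.capacities[level]:
            # The level is full: merge its cubes cell by cell (assumed sum
            # aggregate) and cascade the result to the next, coarser level.
            merged = defaultdict(float)
            for stored in self.levels[level]:
                for cell, value in stored.items():
                    merged[cell] += value
            self.levels[level] = []
            self._push(level + 1, dict(merged))
```

In this sketch, recent windows stay available at fine granularity while older ones survive only as merged cubes, which is the storage trade-off a tilted-time window is meant to provide. A real deployment would persist the cubes in a distributed store rather than in Python lists.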