Academic Paper

Management, Storage, and Retrieval of Complex Data Comprising Multiple Formats Collected from Different Sources: A Systems Engineering Approach
Document Type
Conference
Source
2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE), pp. 1419-1424, Jul 2023
Subject
Computing and Processing
Training
Machine learning algorithms
Target recognition
High performance computing
Memory
Computer architecture
Machine learning
big data
data lake
infrared imagery
ATR
machine learning
iRODS
Language
Abstract
Physical modeling procedures, with intermediate data, are being developed for the large-scale generation of synthetic imagery for automated target recognition (ATR) machine learning (ML) algorithms. This imagery is typically combined with collected data to generate robust training sets. Managing and retrieving this data requires large-scale storage and a means to query data of different types. Queries must support selection at any granularity, from whole data sets down to a single file. The goal of this study is the creation of a managed system for storing and retrieving this information using high-performance computing resources with the Integrated Rule-Oriented Data System (iRODS). Search-oriented metadata tags are created for queries based on locality, time of day, and other factors. When possible, metadata generation will be automated based on information in the data file. Use cases for the import and query operations are created. Simple, scalable problems processed with this data set procedure are presented, along with the proposed architecture. This data storage and retrieval system will provide locality-specific data for ATR ML data sets from a large set of collected and synthetic imagery, together with the processes used to create that imagery.
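
The import and query use cases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not call iRODS: it uses a plain in-memory list as a stand-in catalog. The names `Record`, `import_file`, `query`, and `time_of_day`, the day/night hour boundaries, the tag names, and the example paths are all hypothetical and chosen only for illustration.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Record:
    """One stored file plus its search-oriented metadata tags (hypothetical)."""
    path: str
    metadata: dict = field(default_factory=dict)


def time_of_day(captured_at: datetime) -> str:
    # Assumed binning for illustration: 06:00-17:59 is "day", otherwise "night".
    return "day" if 6 <= captured_at.hour < 18 else "night"


def import_file(catalog: list, path: str, locality: str,
                captured_at: datetime, source: str) -> Record:
    """Import use case: register a file and derive tags automatically where possible."""
    metadata = {
        "locality": locality,
        "source": source,  # e.g. "collected" or "synthetic"
        # Derived from the capture timestamp rather than entered by hand.
        "time_of_day": time_of_day(captured_at),
        # Derived from the file name extension.
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
    }
    record = Record(path, metadata)
    catalog.append(record)
    return record


def query(catalog: list, **criteria) -> list:
    """Query use case: return paths whose tags match every given criterion."""
    return [
        record.path
        for record in catalog
        if all(record.metadata.get(key) == value for key, value in criteria.items())
    ]


catalog = []
import_file(catalog, "/zone/atr/site_a/img_0001.tiff", "site_a",
            datetime(2023, 1, 5, 14, 30), "collected")
import_file(catalog, "/zone/atr/site_a/img_0002.png", "site_a",
            datetime(2023, 1, 5, 23, 10), "synthetic")
import_file(catalog, "/zone/atr/site_b/img_0003.tiff", "site_b",
            datetime(2023, 1, 6, 9, 0), "synthetic")

# Broad selection: every file tagged with one locality.
site_a_files = query(catalog, locality="site_a")
# Narrow selection: adding criteria can reduce the result to a single file.
site_a_night = query(catalog, locality="site_a", time_of_day="night")
```

Adding criteria narrows the result from a whole data set toward a single file, which mirrors the range of query granularity the abstract calls for. In a real deployment the tags would presumably be stored as iRODS metadata on each data object instead of in a Python list.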