Academic Paper

Statistical Data Analysis using GPT3: An Overview
Document Type
Conference
Source
2022 IEEE Bombay Section Signature Conference (IBSSC), pp. 1-6, Dec. 2022
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Analytical models
Data analysis
Correlation
Statistical analysis
Null value
Companies
Predictive models
GPT3
prompts
Data Insights
Data Analysis using GPT-3
Language
Abstract
Although automated statistics has started to gain momentum in data analysis, it is not unified and is very slow on large datasets. Because of computing limitations or a lack of domain-specific knowledge, general statistics have been used most commonly. Research advisors are now drawn to a machine learning-based approach to the statistical analysis of datasets, which may help bridge the gap between traditional tools such as correlation matrices and p-values and new models such as GPT3. This paper proposes a novel approach to the analysis of large datasets that uses GPT3 to predict insights from statistics calculated on the data. The research addresses the limitations of existing methods and proposes a framework for analyzing large statistical datasets that solves many computationally challenging problems efficiently. The proposed method builds on GPT3's ability to predict individual words from the parts of the dataset passed to it as prompts (cumulative sums, means, etc.), which makes it possible to analyze extremely large datasets such as telecom churn or census data. Traditional methods, statistical analysis, and machine learning approaches are compared with GPT3. The pros and cons of using GPT3 for this research are also discussed in terms of performance, accuracy, and reliability.
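
The abstract describes passing calculated statistics, rather than raw rows, to the model as a prompt. The following is a minimal sketch of that idea and not the authors' implementation. The function names (`pearson`, `build_prompt`), the prompt wording, and the toy data are all assumptions made for illustration. The call to a language model is deliberately left out; the sketch only shows how a small text summary might be assembled from a dataset.

```python
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.

    Assumes both lists have non-zero variance (otherwise this divides by zero).
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def build_prompt(columns):
    """Summarize numeric columns and format the summary as a text prompt.

    `columns` maps a column name to a list of numbers. The prompt layout
    is hypothetical; the paper's actual prompt format is not given here.
    """
    lines = ["Summary statistics of a dataset:"]
    for name, values in columns.items():
        lines.append(
            f"- {name}: n={len(values)}, mean={mean(values):.2f}, "
            f"stdev={stdev(values):.2f}, sum={sum(values):.2f}"
        )
    # Pairwise correlations stand in for a correlation matrix.
    names = list(columns)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            lines.append(f"- correlation({a}, {b}) = {r:.2f}")
    lines.append("Describe the main insights in plain English:")
    return "\n".join(lines)


# Hypothetical toy data, loosely in the spirit of a telecom churn table.
data = {
    "monthly_charges": [20.0, 35.5, 50.0, 80.0, 95.5],
    "support_calls": [0, 1, 1, 3, 4],
}
prompt = build_prompt(data)
print(prompt)
```

Because only the summary lines enter the prompt, its length depends on the number of columns and not on the number of rows, which is one plausible reading of how such an approach could scale to large datasets.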