Academic Article

LF-Transformer: Latent Factorizer Transformer for Tabular Learning
Document Type
Periodical
Author
Source
IEEE Access. 12:10690-10698, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Transformers
Data models
Deep learning
Motion pictures
Boosting
Tuning
Recommender systems
Tabular dataset
tabular learning
self-attention
row-wise attention
column-wise attention
matrix factorization
Language
English
ISSN
2169-3536
Abstract
Deep learning for tabular datasets has made significant strides in recent years. Gradient boosting and decision tree algorithms were previously the go-to options for such datasets because of their superior performance, but deep learning has now matured to the point where it can compete with them on equal footing. Accordingly, we propose the latent factorizer transformer (LF-Transformer). The proposed method applies the transformer architecture to both the columns and the rows of a given dataset to identify an attention-based latent factor matrix, which is then used for prediction. The process is akin to matrix factorization, which decomposes the original matrix into latent factors and then reconstructs it. Our experimental results indicate that the LF-Transformer outperforms general feature embedding methods, providing superior feature representations. It has also demonstrated superior performance on regression and classification across the various datasets we tested. In conclusion, the LF-Transformer presents a promising direction for deep learning on tabular datasets; its ability to identify latent factors and deliver strong regression and classification performance makes it a compelling alternative to traditional algorithms.
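
The following is a minimal, hypothetical sketch of the row-wise/column-wise attention idea summarized in the abstract, written with PyTorch's nn.MultiheadAttention. It is not the authors' implementation: the layer layout, embedding dimensions, and the simple prediction head are illustrative assumptions intended only to show how attention can be applied along both axes of an embedded table.

# Hypothetical sketch of alternating column-wise and row-wise self-attention
# over a table of feature embeddings (not the LF-Transformer implementation).
import torch
import torch.nn as nn


class RowColumnAttentionBlock(nn.Module):
    """Applies self-attention across columns (features), then across rows (samples)."""

    def __init__(self, embed_dim: int = 32, num_heads: int = 4):
        super().__init__()
        self.col_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.row_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (rows, columns, embed_dim) -- one embedded table.
        # Column-wise attention: each row attends over its own features.
        h, _ = self.col_attn(x, x, x)
        x = self.norm1(x + h)
        # Row-wise attention: transpose so each feature attends over samples.
        xt = x.transpose(0, 1)                   # (columns, rows, embed_dim)
        h, _ = self.row_attn(xt, xt, xt)
        return self.norm2(xt + h).transpose(0, 1)  # back to (rows, columns, embed_dim)


if __name__ == "__main__":
    rows, cols, dim = 16, 8, 32
    table = torch.randn(rows, cols, dim)     # embedded tabular batch (assumed)
    latent = RowColumnAttentionBlock(dim)(table)
    head = nn.Linear(cols * dim, 1)          # simple regression head (assumed)
    preds = head(latent.reshape(rows, -1))
    print(preds.shape)                       # torch.Size([16, 1])

The transpose between the two attention calls is what switches the attention axis from features to samples, loosely mirroring the matrix-factorization analogy described in the abstract.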