Academic Paper

Simultaneous prediction of multiple outcomes using revised stacking algorithms
Document Type
Working Paper
Source
Subject
Quantitative Biology - Quantitative Methods
Computer Science - Machine Learning
Statistics - Machine Learning
Language
Abstract
Motivation: HIV is difficult to treat because the virus mutates at a high rate and mutant strains readily develop resistance to existing drugs. If the relationships between mutations and drug resistance can be determined from historical data, patients can be given personalized treatment based on the mutations their own virus carries. The HIV Drug Resistance Database was built to investigate these relationships. Our goal is to use the data in this database to build a model that simultaneously predicts resistance to multiple drugs from the mutation information in the viral sequences of any new patient. Results: We propose two variations of a stacking algorithm that borrow information across multiple prediction tasks to improve multivariate prediction performance. The most attractive feature of the proposed methods is their flexibility: complex multivariate prediction models can be constructed from any univariate prediction models. Using cross-validation studies, we show that the proposed methods outperform other popular multivariate prediction methods. Availability: An R package will be made available.
Comment: 15 pages, 5 figures