Academic Paper

High-dimension, low-sample size perspectives in constrained statistical inference: the SARSCoV RNA genome in illustration
Document Type
Academic Journal
Source
Journal of the American Statistical Association. June, 2007, Vol. 102 Issue 478, p686, 9 p.
Subject
Functional equations -- Analysis
Functions -- Analysis
Genomes -- Health aspects
Genomics -- Health aspects
Language
English
ISSN
0162-1459
Abstract
High-dimensional categorical data models, often with inadequately large sample sizes, crop up in many fields of application. The SARS epidemic, originating in southern China in 2002, had an identified single-stranded and positive-sense RNA virus with large genome size and moderate mutation rate. The present genomic study is used as a prime illustration for motivating appropriate statistical methodology for comprehending the genomic variation in such high-dimensional categorical data models. Because of underlying restraints, a pseudomarginal approach based on Hamming distance is considered in a constrained statistical inference setup. The union-intersection principle and jackknifing methods are incorporated in exploring appropriate statistical procedures.
KEY WORDS: Hamming distance; Marginal approach; Nuisance functional contours; Resampling plan; Union-intersection principle.