Abstract
High-dimensional statistical problems arise when investigating the relationship between reduced sensitivity to antiretroviral drugs in patients infected with human immunodeficiency virus (HIV) and viral genotypic patterns obtained from blood samples. This article develops a nonparametric approach for analyzing gene region heterogeneity associated with the drug-resistance phenotype. The method is based on the distribution of distances between viral genetic sequences. The distance measures used are flexible enough to allow weighting of locations within a gene region, as well as weighting of residue types within a location. The weighting may reflect covariability between locations and between residues within a location. The inferential approach presented here extends U-statistic theory to multivariate one- and two-sample cases, which leads to exact tests based on permutation theory and to their asymptotic counterparts. These methods are applied to data from a study conducted by the AIDS Clinical Trials Group that investigated altered viral susceptibility to protease inhibitor drugs.
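The general idea of a distance-based two-sample permutation test can be illustrated with a minimal sketch. It is not the authors' procedure: the location-weighted mismatch distance, the between-minus-within statistic, the function names, and the toy sequences below are all assumptions chosen for illustration, and the p-value is a Monte Carlo approximation to the exact permutation test rather than a full enumeration. Residue-type weighting within a location is omitted for brevity.

```python
import itertools
import random


def weighted_distance(seq_a, seq_b, weights):
    """Sum of location weights over positions where the residues differ."""
    return sum(w for a, b, w in zip(seq_a, seq_b, weights) if a != b)


def mean_between(group_x, group_y, weights):
    """Average distance over all pairs with one sequence from each group."""
    total = sum(weighted_distance(x, y, weights)
                for x in group_x for y in group_y)
    return total / (len(group_x) * len(group_y))


def mean_within(group, weights):
    """Average distance over all unordered pairs inside one group."""
    pairs = list(itertools.combinations(group, 2))
    return sum(weighted_distance(a, b, weights) for a, b in pairs) / len(pairs)


def two_sample_statistic(group_x, group_y, weights):
    """Between-group mean distance minus the average within-group mean.

    Illustrative choice only; each group needs at least two sequences.
    """
    within = 0.5 * (mean_within(group_x, weights)
                    + mean_within(group_y, weights))
    return mean_between(group_x, group_y, weights) - within


def permutation_p_value(group_x, group_y, weights, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for the statistic above."""
    rng = random.Random(seed)
    observed = two_sample_statistic(group_x, group_y, weights)
    pooled = list(group_x) + list(group_y)
    n_x = len(group_x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        stat = two_sample_statistic(pooled[:n_x], pooled[n_x:], weights)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: four-residue fragments, not real viral sequences.
resistant = ["LKMV", "LKMI", "LRMV", "LKMV"]
susceptible = ["LQTV", "LQTI", "LQTV", "MQTV"]
location_weights = [1.0, 2.0, 2.0, 1.0]  # assumed weights per location

observed, p_value = permutation_p_value(resistant, susceptible,
                                        location_weights)
print(f"statistic = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

In this sketch, larger location weights make mismatches at those positions contribute more to the distance, so the test becomes more sensitive to heterogeneity at locations judged more relevant.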