Abstract
Phylogenetic tree inference using deep DNA sequencing is reshaping our understanding of rapidly evolving systems, such as the within-host battle between viruses and the immune system. Densely sampled phylogenetic trees can contain special features, including sampled ancestors, in which we sequence a genotype along with its direct descendants, and polytomies, in which multiple descendants arise simultaneously. These features become apparent once zero-length branches in the tree are identified. However, current maximum-likelihood-based approaches cannot reveal such zero-length branches. In this article, we find these zero-length branches by introducing adaptive-LASSO-type regularization estimators for the branch lengths of phylogenetic trees, deriving their properties, and showing regularization to be a practically useful approach for phylogenetics. Supplementary materials for this article are available online.
Acknowledgments
The authors would like to thank Vladimir Minin and Noah Simon for helpful discussions, and Sidney Bell for helping with the Dengue sequence data.