Abstract
Very little information has been accumulated regarding the likely accuracy of final or consensus DNA sequence data. With the large-scale efforts anticipated for the Human Genome Project, the subjective determination of the final sequence must eventually be replaced with more objective, automatic methods. This will require a much better understanding of the nature of errors in raw sequencing data and their impact on the determination of the final sequence. This paper describes a first step toward defining an error model for large-scale sequencing efforts based on random subcloning strategies.
Key Words: