Editorial

Targeted next-generation sequencing: microdroplet PCR approach for variant detection in research and clinical samples

Pages 347-349 | Published online: 09 Jan 2014

Next-generation sequencing (NGS) approaches promise to revolutionize not only the way we conduct basic science research into the genomes of various organisms but also how we approach personalized genomic medicine [1–4]. NGS is the most recent and, undoubtedly, the most significant step since the completion of the human genome on the path to affordable, accurate and approachable whole human genome sequencing. The magical, and perhaps mythical, US$1000 ‘all-inclusive’ de novo human genome sequence is still many years away, primarily owing to the personnel costs associated with analysis of the massive data sets generated by NGS. However, targeted NGS-based resequencing is immediately clinically applicable and is arguably associated with several benefits over whole-genome sequencing.

Targeted resequencing is a hypothesis-, or even differential diagnosis-, driven application of NGS. A selected set of genomic loci is singled out for analysis from the rest of the genome, based on the phenotypic presentation of the disease, using any one of several enrichment strategies [5]. This molecular diagnostics approach is powerful because it leverages the full scope of NGS to examine the targeted regions as comprehensively as possible. Microdroplet PCR is an example of one such enrichment approach that was pioneered by RainDance Technologies, Inc. (RDT; MA, USA). This technology utilizes picoliter-sized droplets as individual reaction vessels to perform over 1 million unique PCR reactions per sample in less than 1 day. Other enrichment approaches make use of hybridization and can suffer from longer reaction times to achieve completion, restrictive design parameters (including problems with high GC or repetitive elements), and issues with reliably resolving gene family members that share sequence homology. The RDT approach requires just 2 µg of genomic DNA as input, an amount easily obtained from a single blood or saliva collection, and utilizes a user-specified primer library of up to 20,000 unique overlapping PCR amplicons (targeting ∼10 Mb of genomic sequence). Coupling this approach with one of the emerging ‘personal’ NGS machines yields a targeted resequencing workflow that is theoretically capable of delivering fully analyzed results in 10–14 days, with approximately half of that time devoted to data generation and the remaining half to data analysis.

There is an ongoing debate in the field regarding the longevity of targeted resequencing in the face of the looming US$1000 genome sequence. In the author’s opinion, the time for targeted resequencing is now, and for the reasons listed in the following section, this approach for diagnostics should occupy a significant position in the market for the next decade, even when whole-genome sequencing becomes affordable and commonplace.

Accuracy & sensitivity

In a diagnostic setting, the false-positive and -negative rates must be kept as low as possible. By performing targeted resequencing one is able to essentially focus the NGS machine on the regions of the genome of most importance, and therefore the average coverage per base can be several orders of magnitude higher than with a whole-genome sequencing approach. This means that the error rate associated with sequencing becomes less of a confounding factor for variant identification and that rare events (e.g., low-percentage mutant alleles from rare clones in a biopsy sample) are much more easily detected. Imagine a tumor sample in which a key clone makes up just 2% of the total tumor mass. Even at 50X genome coverage, one may sequence this variant only once if employing a standard whole-genome sequencing approach. On the other hand, that same variant might be sequenced 10–100 times or more when targeted resequencing is employed, yielding a collection of reads that can be easily leveraged to identify the variant arising from that rare clone. The depth of coverage is important for improved accuracy and sensitivity. It is possible that the 30X genome will reach US$1000 very soon, but how far away from US$1000 is the 300X genome?
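The rare-clone arithmetic above can be sketched with a simple binomial sampling model. This is only an illustration under stated assumptions: reads are treated as independent draws, sequencing error is ignored, and the variant allele fraction is taken to equal the 2% clone fraction (a heterozygous variant would halve it). The function names and the 500X and 5000X targeted depths are hypothetical values chosen for illustration, not figures from a specific platform.

```python
from math import comb


def expected_variant_reads(depth: int, allele_fraction: float) -> float:
    """Mean number of reads carrying the variant at a given depth."""
    return depth * allele_fraction


def prob_at_least(depth: int, allele_fraction: float, min_reads: int) -> float:
    """Probability of observing at least `min_reads` variant reads.

    Assumes each read independently carries the variant with
    probability `allele_fraction` (binomial model, no sequencing error).
    """
    p_fewer = sum(
        comb(depth, i)
        * allele_fraction ** i
        * (1 - allele_fraction) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    clone_fraction = 0.02  # the 2% clone from the text
    # 50X stands in for whole-genome coverage; 500X and 5000X are
    # hypothetical targeted-resequencing depths.
    for depth in (50, 500, 5000):
        mean_reads = expected_variant_reads(depth, clone_fraction)
        p_five = prob_at_least(depth, clone_fraction, 5)
        print(f"{depth}X: mean variant reads = {mean_reads:.0f}, "
              f"P(at least 5 variant reads) = {p_five:.3f}")
```

Under this model the mean number of variant reads at 50X is one, so a variant caller that requires several supporting reads would usually miss the clone, whereas at the deeper targeted coverages the same threshold is met almost every time.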

Speed of data generation & analysis

Speed is also important in the diagnostic setting, where getting patients onto the best medicine as quickly as possible is the goal. Targeted resequencing also presents some benefits in this area. Because less sequencer space is needed per sample to achieve high coverage, multiplexing approaches can be utilized to run hundreds of samples in parallel on a high-yield NGS machine, or a smaller number of samples can be run on the emerging ‘personal’ NGS machines, which yield less data but can complete a single run in less than 1 day. In addition, since RDT utilizes a PCR-based approach for loci enrichment, one could imagine modifying the reaction primers to streamline library preparation in ways that other enrichment approaches cannot achieve. This has the potential to reduce the data-generation time dramatically further by shortening the wet laboratory sample preparation.
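The relationship between target size, coverage and multiplexing capacity can be sketched as a back-of-the-envelope calculation. The ∼10 Mb target size comes from the text; the run yield, the desired depth and the on-target fraction below are hypothetical placeholders rather than specifications of any particular instrument or enrichment kit, and the function name is illustrative only.

```python
import math


def samples_per_run(run_yield_bases: float,
                    target_bases: float,
                    mean_depth: float,
                    on_target_fraction: float = 1.0) -> int:
    """Rough number of samples that fit in one sequencing run.

    Each sample needs target_bases * mean_depth on-target bases; dividing
    by on_target_fraction accounts for reads that fall outside the
    targeted loci (an assumed efficiency, not a measured one).
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    bases_per_sample = target_bases * mean_depth / on_target_fraction
    return math.floor(run_yield_bases / bases_per_sample)


if __name__ == "__main__":
    target = 10_000_000            # ~10 Mb panel, as described in the text
    hypothetical_yield = 300e9     # assumed yield of a high-output run
    n = samples_per_run(hypothetical_yield, target,
                        mean_depth=300, on_target_fraction=0.5)
    print(f"Samples per run at 300X over a 10 Mb panel: {n}")
```

The same function shows why smaller panels or lower depth requirements translate directly into more samples per run, or into use of a lower-yield instrument with a shorter run time.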

Analysis timeframes can also be shorter with targeted resequencing. Short read alignments can be conducted against just the targeted regions, thereby reducing computational burdens. The additional analysis approaches (such as variant identification and annotation) can run much more quickly owing to the smaller regions being investigated. Perhaps most importantly, the putative variant lists are smaller and the investigation of the top variants of interest by manual inspection is easier and can therefore be more comprehensive.

Cost

Admittedly, the cost per unique base pair interrogated by targeted resequencing is much higher when compared with whole-genome sequencing. However, the cost per sample is very similar to that of the tests on the market today, and once one moves to greater numbers of targeted loci, that cost comparison again begins to swing in favor of targeted resequencing. For similar costs per sample one can collect much more information. For example, instead of gleaning information one gene at a time using PCR- and capillary-based sequencing approaches, one can target all of the known genes and perhaps other family and pathway members for approximately the same cost as the traditional single-gene test.

Companion product

One argument against targeted resequencing is that, because it is hypothesis-based by design, it is not set up to identify mutations in novel genes. It is designed only to scan through as many of the hypothesized or previously associated loci as possible, and perhaps other family or pathway members. On the other hand, whole-genome sequencing yields an all-inclusive look at the ‘sequenceable’ genome. Perhaps the best approach for any one sample is multipronged, including targeted resequencing, exome sequencing and whole-genome sequencing. By taking this path, one can help ensure that at the key regions of interest, the sequencing coverage will spike to a level that can help empower the analysis of rarer variants. The targeted sequencing panel is then simply selected for each sample type, such as a cancer panel for a tumor sample, a ‘channelopathy’ panel for an epileptic patient, and so on. Each sample is also processed using one of the commercially available exome products and whole-genome sequencing. If one could add RNA sequencing to this mix then the resulting molecular profile would be highly comprehensive. This potential function of targeted sequencing as a companion product would be important for the longevity of the approach. Of course, combining whole-genome sequencing with exome and targeted panels increases the cost and analytical complexity of the dataset.

Why don’t the exome products address the same needs as the targeted sequencing panel? The answer relates back to the need for adequate depth of coverage at as many of the base pairs as possible in the key genes. The current exome products have known ‘holes’ in their coverage of the expressed sequences in the human genome. Missing even a few exons in a key disease-related gene presents a significant false-negative problem that the whole-genome coverage will probably not be able to address (see the earlier discussion). However, including the targeted resequencing panel will help to ensure that the needed bases are covered in the loci with prior association to the disease, leaving the exome product to increase coverage at the exons it targets genome-wide. One could imagine the data resulting from this multipronged approach as spikes in coverage amid a typical whole-genome coverage background, with each spike corresponding to a gene included in a targeted resequencing or exome panel.
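As an illustration of what a coverage ‘hole’ means in practice, the following minimal sketch scans a list of per-base read depths for one region and reports the stretches that fall below a chosen minimum depth. The input format, the function name and the 10X threshold are assumptions made for this example; they do not describe the output of any particular exome product or analysis pipeline.

```python
from typing import List, Sequence, Tuple


def find_coverage_holes(depths: Sequence[int],
                        min_depth: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index intervals where depth < min_depth.

    `depths` holds one read-depth value per base of the region of interest
    (for example, the exons of a key disease-related gene).
    """
    holes: List[Tuple[int, int]] = []
    start = None
    for pos, depth in enumerate(depths):
        if depth < min_depth:
            if start is None:
                start = pos          # a new low-coverage stretch begins
        elif start is not None:
            holes.append((start, pos))  # stretch ended at the previous base
            start = None
    if start is not None:            # region ends inside a hole
        holes.append((start, len(depths)))
    return holes


if __name__ == "__main__":
    # Toy per-base depths; real values would come from an alignment file.
    toy_depths = [30, 30, 2, 0, 1, 40, 3]
    print(find_coverage_holes(toy_depths, min_depth=10))
```

Running such a check on exome data alone and again on the combined exome plus targeted-panel data would show which holes in the key genes the panel fills.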

The maturation of NGS technology and expertise has ushered in an exciting era of personalized genomic medicine. There are still many hurdles to be overcome with these approaches, yet the initial successes are already tantalizing enough to invigorate the entire field with optimism for the future. Affordable whole-genome sequencing will be available to clinical practice in the coming years; however, it is clear that no single approach for NGS data generation is comprehensive enough to stand alone. For the reasons discussed earlier, targeted resequencing is ready, and is in fact already being deployed in the diagnostic setting. RDT-based loci enrichment is a leading approach in the field. Coupling targeted sequencing approaches (loci- and exome-based) with whole-genome sequencing helps to ensure adequate coverage across the regions of interest to identify even the rarest of variants, and opens up the possibility of identifying novel changes if the targeted sequencing panel fails to yield a clear ‘smoking gun’. NGS, be it targeted or whole genome, is already advancing genomic medicine into an exciting next phase and that should be considered good news for all of us.

Financial & competing interests disclosure

Matthew J Huentelman acknowledges support for his work in the area of targeted next-generation resequencing from the NIH-NINDS (R01-NS059873), the Arizona Alzheimer’s Consortium and the state of Arizona, USA. He discloses that his work has been supported in part through a donation of experimental reagents from RainDance Technologies, Inc. (MA, USA) but indicates that this donation did not influence experimental design, interpretation of the data or the content of this article in any fashion. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.
