Review

Using Large-Scale Genomics Data to Identify Driver Mutations in Lung Cancer: Methods and Challenges

Pages 1149-1160 | Published online: 31 Jul 2015

Figures & data

Figure 1. Summary schematic highlighting the factors leading to biases and heterogeneity of mutational data.

Where mutational rates are uniform, longer coding genes will show a higher mutational frequency unless the data are length corrected. Genes that are not expressed, and genes that replicate late in the cell cycle, will have higher mutational rates. Genes with large GC-rich regions will have inadequate sequencing coverage, so potential mutations will be missed and mutations in these genes will be underreported.

Table 1A. The 20 most frequently mutated genes in squamous lung cancer, with the number of mutated cases and the length of the longest corresponding protein.

Table 1B. The 20 most frequently mutated genes in squamous lung cancer, normalized for the length of the longest protein and ranked by length-corrected score.

Table 2. Mutation predictor data for the three pathogenic mutations discovered with a targeted siRNA screen [61].
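
The length correction referred to in Figure 1 and Table 1B can be illustrated with a minimal Python sketch. The exact score used for Table 1B is not stated here, so this assumes the simplest form: the number of mutated cases divided by the length of the longest protein, scaled per 1,000 amino acids. The gene names and counts are hypothetical placeholders, not values from the tables.

```python
from typing import List, NamedTuple, Tuple


class GeneCount(NamedTuple):
    gene: str            # hypothetical gene label
    mutated_cases: int   # number of cases carrying a mutation in the gene
    protein_length: int  # length of the longest protein, in amino acids


def length_corrected_ranking(
    genes: List[GeneCount], per: int = 1000
) -> List[Tuple[str, float]]:
    """Rank genes by mutated cases per `per` amino acids (assumed score)."""
    scored = []
    for g in genes:
        if g.protein_length <= 0:
            raise ValueError(f"protein length must be positive for {g.gene}")
        scored.append((g.gene, g.mutated_cases / g.protein_length * per))
    # Highest length-corrected score first
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder data: a very long gene with many mutated cases and a short
# gene with fewer mutated cases.
genes = [
    GeneCount("GENE_LONG", mutated_cases=90, protein_length=30000),
    GeneCount("GENE_SHORT", mutated_cases=60, protein_length=400),
]

ranking = length_corrected_ranking(genes)
for name, score in ranking:
    print(f"{name}: {score:.1f} mutated cases per 1000 aa")
```

By raw count the long gene ranks first, but after correction the short gene ranks higher, which is the bias that length normalization is meant to expose. This sketch does not account for the other factors in Figure 1 (expression, replication timing, GC content).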