Editorial

Genome network medicine: new diagnostics and predictive tools

Pages 643-646 | Published online: 09 Jan 2014

Molecular diagnostics and predictive tools are currently based on reductionist approaches. However, the molecular mechanisms driving biological systems, gene expression and cellular and tissue regulation in human physiology and disease are substantially more complex than previously thought. Our understanding of the structural and functional heterogeneity of the human genome among patients with phenotypically the same disease remains poor, and the discovery of accurate diagnostics is a major goal of emerging biomedical research. This article summarizes the potential of dramatic advances in genome, computer, network and mathematical sciences Citation[1–8], integrated here into ‘Genome Network Medicine’ (GNM). This emerging effort, described for the first time in this article, highlights myriad challenges but also an unmet need for ‘clinical innovation’ Citation[9] to achieve medical advances in diagnostics and therapeutics.

Personal next-generation genome sequencing

Assessing the enormous body of genome-wide association study data derived from linear experimentation on the genomic localization of genetic variation, and moving forward to explore the genome-scale interaction network of coding and non-coding sequences, represents the next ambitious goal of next-generation human genome projects and other large-scale genomic studies. It is now time to integrate new knowledge from genome science into clinical medicine, using emerging innovative methods in the dynamics of network biology, to improve healthcare. Indeed, the Encyclopedia of DNA Elements (ENCODE) project Citation[1], which maps the functional components of the human genome using high-throughput (HT) next-generation sequencing (NGS) coupled with array technologies, is changing our concept of biomedical research on human biology and disease. The ENCODE data provide several innovations, including the functionality of most non-coding sequence Citation[1], the shift from the gene to the transcript as the fundamental genomic unit Citation[2], the genomic localization of transcription factor (TF)-binding sites and their regulatory networks Citation[3–5] and the dynamics of 3D regulation of complex gene interactions at the chromatin and chromosome level Citation[6].

To reach the new era of molecular diagnostics, it appears essential to unravel the molecular connectivity of all these recently revealed circuits by combining novel computational algorithms and nanowire-based perturbation tools Citation[7]. These new methods can synthesize a comprehensive visualization of interacting networks Citation[7]. Global genome control through protein–protein interactions (PPIs), TF regulatory networks and the dynamics of 3D gene circuitry reflects the concept of a ‘hierarchy’ Citation[4] or ‘cloud’ network Citation[8], with interconnections and looping connectivity among the components of all these networks Citation[9].

Slow progress with reductionist-based diagnostics

The whole edifice of current diagnostic and therapeutic medicine is built on advances made under the single gene–single phenotype/disease dogma. Over the past century, the linear experimentation approach has focused on a single component of a system of interest, ignoring the multiple interactions that affect the outcome of a system such as an ecosystem, a whole organ or a human being. Half a century after the discovery of the DNA double helix, important advances in basic reductionist sciences have been translated into effective diagnostics and therapeutics in clinical medicine.

However, it appears that we have reached a critical threshold beyond which medical progress will be slow. This notion is supported by several arguments. First, progress in diagnostic accuracy and in biomarker discovery for predicting either the risk of common diseases or drug response is modest. The large number of trait- and disease-associated genetic variants identified by >1200 genome-wide association studies (GWAS) confer small effect sizes and can neither explain the missing heritability nor provide clinical utility Citation[10]. More recently, with the advent of NGS-based personal whole-genome sequencing (WGS) and whole-exome sequencing (WES), the 1000 Genomes Project has offered some promise by identifying 38 million genetic variants, some of which are rare mutations and thus may be associated with large effect sizes Citation[11]. It remains to be seen, however, whether new disease association studies based on this resource can establish novel linear genotype–phenotype relationships useful for the clinic.

Second, understanding network-based approaches to human disease Citation[12] is essential for identifying interaction-based diagnostics, which will guide the discovery of next-generation therapeutic targets. Indeed, the single-gene approach followed by the pharmaceutical industry and academia has resulted in a long list of drugs which, however, do not cure complex diseases Citation[13]. All available drugs are based on simplified ‘one-gene target’ and ‘one-size-fits-all’ concepts that ignore structural and functional genomic heterogeneity Citation[14].

Third, evidence for the crucial role of biological networks in driving the cellular response to external biochemical signals and/or internal aberrations (e.g., genetic and epigenetic alterations) has increased dramatically over the past few years Citation[12,14]. The most remarkable example is the ENCODE project. Although this consortium started in 2003 with a standard reductionist approach, the necessity of applying network principles became increasingly clear after the advent of NGS in 2005 Citation[8].

Translating network theory into medical diagnostics

Although GWAS and next-generation personal whole-genome sequencing have provided thousands Citation[10] and millions Citation[11] of disease-associated genetic variants, respectively, little is known about how differences in genome sequence affect genome function and lead to pathologic cellular behavior and disease. The latest evidence from GWAS analysis reveals that the vast majority, up to 93%, of common and rare genetic variants lie within regulatory DNA and not in protein-coding DNA Citation[3,15]. This non-coding localization of disease-associated genetic variation affects gene expression, which is controlled through TF regulatory networks Citation[4,5] and other looping biological circuits Citation[16].

Network analysis represents a biological system as a graph in which nodes can be genes, proteins, metabolites or other molecules, and links (edges) are biomolecular interactions. Several terms used in network theory, such as hubs, disease modules and degree distribution, have been described extensively elsewhere Citation[14]. Viewed globally, several classes of networks, including PPI networks, regulatory networks such as TF networks, RNA networks, gene–gene networks and metabolic networks, have been intensively evaluated over the last decade Citation[4,5,9,14–17].
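To make the graph vocabulary concrete, the following minimal Python sketch builds a small undirected interaction network and computes node degrees, the degree distribution and a simple hub list. The node labels (`geneA` to `geneE`), the edge list and the degree cut-off used to call a hub are hypothetical placeholders chosen for illustration; they are not taken from any of the cited data sets.

```python
from collections import Counter

# Hypothetical toy edge list: each pair is one undirected biomolecular interaction.
EDGES = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]

def build_graph(edge_list):
    """Return an undirected graph as {node: set of neighbouring nodes}."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degrees(graph):
    """Degree of a node = number of distinct interaction partners."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def degree_distribution(graph):
    """Fraction of nodes having each observed degree k."""
    counts = Counter(degrees(graph).values())
    total = len(graph)
    return {k: count / total for k, count in sorted(counts.items())}

def hubs(graph, min_degree):
    """Nodes whose degree reaches an (assumed, user-chosen) cut-off."""
    return sorted(node for node, d in degrees(graph).items() if d >= min_degree)

graph = build_graph(EDGES)
print(degrees(graph))
print(degree_distribution(graph))
print(hubs(graph, min_degree=3))
```

In this toy network `geneA` is the only node with three partners and is therefore the only hub at a cut-off of 3. In real interactomes the cut-off is a modeling choice rather than a fixed rule.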

Comprehensive view of biological networks

Evidence is accumulating that interacting biological networks, rather than their components alone, orchestrate gene expression, cell behavior, human biology, health and disease. A dramatic shift is therefore now under way toward understanding how molecular circuits guide health and disease. Until recently, however, this effort focused separately on the exploration of individual regulatory elements, such as miRNAs, TFs and histones, and their networks Citation[17]. Efforts are now being made to achieve a global view of how all biological systems, such as regulatory DNA, TF regulatory networks, miRNAs and long non-coding RNAs (long ncRNAs), simultaneously affect gene and cell control. For example, it is important to explore how mutations affect protein structure. But perhaps much more crucial for clinical success will be understanding how the whole intracellular signaling circuitry, including the PPI network, physical protein–DNA binding interactions and the gene–gene network, is deregulated as a whole system in disease.

A plethora of genome-scale PPI data generated with techniques such as yeast two-hybrid assays and affinity purification has been collected in several databases. The more recent PrePPI database Citation[18] has integrated predicted and experimental PPI data sets. Its Bayesian framework has dramatically increased the number of interactions, to approximately 2 million PPIs across several species and approximately 370,000 PPIs for human, the largest number of human PPIs available Citation[18].
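As a rough illustration of how a Bayesian framework can pool independent lines of evidence for a candidate interaction, the sketch below multiplies prior odds by per-evidence likelihood ratios (a naive Bayes combination). This is a generic textbook scheme under an independence assumption, not a description of the actual PrePPI pipeline. The function name, the prior and the likelihood ratios are hypothetical values chosen only for the example.

```python
from math import prod

def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence for one candidate protein pair.

    prior: assumed prior probability that a random pair interacts (0 < prior < 1).
    likelihood_ratios: one ratio per evidence source,
        P(evidence | interacting) / P(evidence | not interacting).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    # Naive Bayes: independent evidence sources multiply the odds.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical example: a rare-event prior and three supporting evidence sources.
p = posterior_probability(prior=0.001, likelihood_ratios=[10.0, 50.0, 4.0])
print(round(p, 3))
```

The design point is that no single weak evidence source needs to be decisive; several moderately informative sources can together lift a pair above a chosen confidence threshold.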

Despite advances in studying static networks, such as PPI networks analyzed with a Bayesian framework Citation[18], new methods are needed to explore the dynamics of component interactions that govern biological systems. Two of the more commonly used dynamic approaches are Boolean networks (BNs) and ordinary differential equations (ODEs). Both approaches have their own strengths and limitations. In contrast to BNs, which are limited to a qualitative description, ODEs represent a more sophisticated, quantitative, dynamic approach. Promising future directions will continue to build upon networks that integrate the ODE and BN approaches Citation[17].
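The contrast between the two dynamic approaches can be sketched on a toy circuit in which two genes, X and Y, repress each other. The Boolean version tracks only ON/OFF states, whereas the ODE version tracks concentrations over time. The circuit, the Hill-type rate law, the parameter values (`alpha`, `n`), the forward Euler integrator and the step size are all assumptions made for illustration and are not drawn from the cited work.

```python
# Toy two-gene circuit: X represses Y and Y represses X (illustrative only).

def boolean_step(state):
    """Synchronous Boolean update: a gene is ON only if its repressor is OFF."""
    x, y = state
    return (not y, not x)

def simulate_boolean(state, steps):
    """Return the visited states, starting with the initial state."""
    trajectory = [state]
    for _ in range(steps):
        state = boolean_step(state)
        trajectory.append(state)
    return trajectory

def ode_rhs(x, y, alpha=4.0, n=2):
    """Hill-type repression with first-order decay (assumed rate law)."""
    dx = alpha / (1.0 + y ** n) - x
    dy = alpha / (1.0 + x ** n) - y
    return dx, dy

def simulate_ode(x, y, dt=0.01, steps=5000):
    """Integrate the two ODEs with the forward Euler method."""
    for _ in range(steps):
        dx, dy = ode_rhs(x, y)
        x, y = x + dt * dx, y + dt * dy
    return x, y

# Qualitative answer: X ON, Y OFF is a fixed point of the Boolean model.
print(simulate_boolean((True, False), 3))
# Quantitative answer: concentrations settle near a high-X, low-Y steady state.
print(simulate_ode(2.0, 0.5))
```

With these assumed parameters the Boolean model reports only that "X ON, Y OFF" is stable, while the ODE model also gives the steady-state levels (x close to 3.73 and y close to 0.27). The price is that the ODE model needs rate constants that are often unknown in practice.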

Transcriptional circuitry

Although crucial for the regulation of gene expression and for disease, TFs and their regulatory networks are poorly understood. Characterizing transcripts and mapping TFs and the transcriptional machinery has only recently become feasible with novel techniques. Genome-scale exploration by RNA-seq, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) Citation[4] and genomic DNase I footprinting Citation[3,5] enables the recognition of TF-binding sites, giving rise to global maps of TF regulatory networks Citation[8]. The localization, molecular characterization and function of the RNA polymerase II (Pol II) complex and TFs at the transcription start site were long considered elusive. Now, using cryo-electron microscopy, He et al. have reported the structural visualization of key steps in human transcription initiation Citation[19].

The ENCODE data provide several sets of novel knowledge, including on transcription, that are changing the basis of human biology and probably the way disease will be managed in the future. A central finding is that not only the transcription of protein-coding DNA into mRNA and its translation into proteins, but also non-coding sequences, are crucial for transcription and gene regulation. Therefore, not only can inherited variants and somatic mutations falling within protein-coding sequences modify protein structure and thereby affect the PPI signaling network, but alterations within non-coding promoters and enhancers are also critical for understanding and treating disease. This new insight, which assigns functionality to 80% of the genome Citation[1] and challenges the dogma of ‘junk’ DNA, has more recently been translated into medical research revealing how genetic variants in non-coding proximal and distal regulatory elements can deregulate gene expression networks and cause disease Citation[3]. Mapping 119 human TFs and their ChIP-seq binding sites at specific genomic locations proximal and distal to the genes they regulate, Gerstein et al. proposed a ‘hierarchy’ of transcriptional regulatory networks in which top-level TFs have a stronger influence on gene expression than TFs at other levels Citation[4]. Instead of ChIP-seq, Neph et al. used genomic DNase I footprinting to map 475 sequence-specific TFs and to analyze the dynamics of their connections across 41 diverse cell and tissue types Citation[5,16].

Collectively, TFs interact with one another in three major ways: direct PPIs, DNA-binding interactions within the same cis-regulatory element, and cross-regulatory interactions that result from the binding of one TF within the regulatory DNA regions controlling another factor.

Circuitry of genes, non-coding RNA & chromatin–transcription interaction

Apart from physical interactions of proteins (PPIs, protein–DNA binding), chromosome conformation capture carbon copy (5C), which builds on chromosome conformation capture (3C), has enabled the assessment of interactions among genes themselves (the gene–gene interaction network), including spatial proximity and distal, long-range interactions between genomic elements across the genome. More than 1000 long-range interactions in each cell line have recently been reported, with promoters and distal elements forming complex networks Citation[6].

With the functionality of ncRNA, the ENCODE data have also changed the concept of the fundamental unit of genomic organization, suggesting that this unit is the transcript and not the gene Citation[2]. These data argue that genes represent a higher-order framework around which individual transcripts coalesce, creating a polyfunctional entity Citation[2]. In recognition of the critical role of RNA transcripts in pervasive transcription and disease, small ncRNAs termed miRNAs have been studied extensively but, despite their promise, their clinical implementation remains unclear Citation[20]. Long ncRNAs (lncRNAs) are now being evaluated to provide new insights into their role in gene regulation and clinical medicine. Although the annotation of thousands of lncRNAs in the human genome by GENCODE Citation[21] has started, it is based on reductionist approaches, and it remains a challenge to integrate these ncRNAs as nodes in order to construct and infer complex regulatory RNA networks.

The chromatin–transcription network revealed by the ENCODE data Citation[1,8] represents a continuum of dynamically exchanged information between chromatin and transcription, multiplying the challenges of interrogating the topology and dynamics of the global signal transduction network that regulates gene expression Citation[22].

Future of clinical GNM & conclusion

The term GNM introduced here summarizes an emerging approach that focuses on a comprehensive picture of the interacting biological networks that simultaneously regulate gene expression and cell function. Over the long course of evolution, protective mechanisms based on an increasingly complex interactome have developed to overcome mutations Citation[23]. Given the connectivity of these regulatory network components, a global circuitry view appears nearly essential for clinical success.

Therefore, an exciting scientific and technological field is now taking shape, in which in silico and imaging approaches dominate the effort to explore the whole biological network globally. Computational and mathematical strategies integrate biological networks, including PPIs, TF regulatory networks, ncRNAs, chromatin–transcription and 3D gene networks, to construct and predict this overall network collectively Citation[22]. More recently, the CLARITY method has been proposed, which can provide structural and molecular information that may support an integrative understanding of large-scale intact biological systems Citation[24]. Despite these advances, the visualization, understanding and prediction of human global circuitry are still in their infancy.

Molecular diagnostics discovery research has entered the post-ENCODE era. The emerging goal is to understand how protein-coding and non-coding sequence changes affect global gene expression, deregulating genome function and cellular signaling circuitry and leading to multifactorial, polygenic, polyfunctional and dynamic common disease. Localizing disease-associated inherited and somatic mutations appears to be a realistic goal. But discovering molecular diagnostics with clinical utility, based on how the genomic localization of causal mutations deregulates the dynamics of the whole biological network, will require innovation in both technology and science.

Financial & competing interests disclosure

The author has no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending or royalties.

No writing assistance was utilized in the production of this manuscript.

References

  • ENCODE Project Consortium, Dunham I, Kundaje A, Aldred SF et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414), 57–74 (2012).
  • Djebali S, Davis CA, Merkel A et al. Landscape of transcription in human cells. Nature 489, 101–108 (2012).
  • Maurano MT, Humbert R, Rynes E et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337(6099), 1190–1195 (2012).
  • Gerstein MB, Kundaje A, Hariharan M et al. Architecture of the human regulatory network derived from ENCODE data. Nature 489(7414), 91–100 (2012).
  • Neph S, Vierstra J, Stergachis AB et al. An expansive human regulatory lexicon encoded in transcription factor footprints. Nature 489(7414), 83–90 (2012).
  • Sanyal A, Lajoie BR, Jain G, Dekker J. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
  • Yosef N, Shalek AK, Gaublomme JT et al. Dynamic regulatory network controlling TH17 cell differentiation. Nature 496(7446), 461–468 (2013).
  • Stamatoyannopoulos JA. What does our genome encode? Genome Res. 22(9), 1602–1611 (2012).
  • Roukos DH. Disrupting cancer cells’ biocircuits with interactome-based drugs: is ‘clinical’ innovation realistic? Expert Rev. Proteomics 9(4), 349–353 (2012).
  • Li MJ, Wang P, Liu X et al. GWASdb: a database for human genetic variants identified by genome-wide association studies. Nucleic Acids Res. 40(Database issue), D1047–D1054 (2012).
  • 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491(7422), 56–65 (2012).
  • Janjić V, Pržulj N. Biological function through network topology: a survey of the human diseasome. Brief. Funct. Genomics 11(6), 522–532 (2012).
  • Rask-Andersen M, Almén MS, Schiöth HB. Trends in the exploitation of novel drug targets. Nat. Rev. Drug Discov. 10(8), 579–590 (2011).
  • Barabási AL, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12, 56–68 (2011).
  • Schaub MA, Boyle AP, Kundaje A, Batzoglou S, Snyder M. Linking disease associations with regulatory information in the human genome. Genome Res. 22(9), 1748–1759 (2012).
  • Neph S, Stergachis AB, Reynolds A, Sandstrom R, Borenstein E, Stamatoyannopoulos JA. Circuitry and dynamics of human transcription factor regulatory networks. Cell 150(6), 1274–1286 (2012).
  • Cheng TM, Gulati S, Agius R, Bates PA. Understanding cancer mechanisms through network dynamics. Brief. Funct. Genomics 11(6), 543–560 (2012).
  • Zhang QC, Petrey D, Garzón JI, Deng L, Honig B. PrePPI: a structure-informed database of protein-protein interactions. Nucleic Acids Res. 41(D1), D828–D833 (2013).
  • He Y, Fang J, Taatjes DJ, Nogales E. Structural visualization of key steps in human transcription initiation. Nature 495, 481–486 (2013).
  • Nair VS, Maeda LS, Ioannidis JP. Clinical outcome prediction by microRNAs in human cancer: a systematic review. J. Natl. Cancer Inst. 104(7), 528–540 (2012).
  • Harrow J, Frankish A, Gonzalez JM et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22(9), 1760–1774 (2012).
  • Roukos DH, Ziogas DE, Baltogiannis GG. Novel next-generation sequencing and networks-based therapeutic targets: realistic more effective drug design and discovery. Curr. Pharm. Des. (2013) (In Press).
  • Fernández A, Lynch M. Non-adaptive origins of interactome complexity. Nature 474(7352), 502–505 (2011).
  • Chung K, Wallace J, Kim SY et al. Structural and molecular interrogation of intact biological systems. Nature 497, 332–337 (2013).
