Cancer does not represent a single disease. Rather, cancer is a collection of complex diseases with as many different manifestations as there are tissues and cell types in the human body, involving innumerable endogenous or exogenous carcinogenic agents and various etiological mechanisms. What all of these disease states share are certain biological properties of the cells that compose tumors, including unregulated (clonal) cell growth, impaired cellular differentiation, invasiveness, and metastatic potential. While cancer has been known to exist in human populations since the first civilizations began to record history, the nature of the disease, its probable causes, and the molecular and cellular mechanisms through which malignant tumors develop have only recently begun to be understood in any appreciable detail. It is now recognized that cancer, in its simplest form, is a genetic disease, or more precisely, a disease of abnormal gene expression. Recognition of this basic tenet of cancer pathobiology has resulted in a vigorous research effort over the last 25 years to define the molecular basis of cancer.
Early experimental approaches utilized to address the molecular mechanisms of neoplastic transformation provided useful information but were labor-intensive and produced results slowly [1–3]. While standard techniques such as the Southern blot and northern blot are still employed in the research laboratory today, more rapid and adaptable methods based upon the polymerase chain reaction (PCR) have replaced blotting as the technical standard [4–7]. Significant advances in methodology have enabled the implementation of practical high-throughput technologies based upon cDNA or oligonucleotide microarrays for analysis of gene expression [8–11], analysis of sequence polymorphisms and mutations [12, 13], and sequencing [14]. Likewise, automation of sequencing technology led to high-volume DNA sequence analysis of the human genome and the genomes of experimental organisms [15–17]. These high-throughput and high-volume technologies rapidly produce vast amounts of information, requiring computer-based information technology for efficient data management (storage, organization, manipulation, analysis, and retrieval). The value of large-scale datasets (like that of the sequence of the human genome) extends well beyond the laboratory that generated the raw data.
To be of maximum use, these datasets must be integrated, properly maintained for accuracy and completeness [18], and accessible to cancer researchers for use in their own studies [20]. Thus, in addition to problems associated with data acquisition and management, the challenge for bioinformatics in recent years has included the development of user-friendly computer-based platforms that can be accessed and utilized by the average researcher for searching, retrieval, manipulation, and analysis of information from large-scale datasets [20, 21]. A number of computer-based resources have appeared over the years, some of which were limited in their capabilities and others of which were very versatile. One major recognized problem in the current era of computer-assisted investigations of cancer pathobiology is that a significant percentage of researchers who rely upon DNA sequence (or other) information are not proficient in their use of publicly accessible databases. Thus, there is a need for easy-to-use, computer-based tools that facilitate access to the information contained in databases, that offer practical functionality, that draw upon the technology already present in the average research laboratory, and that do not require specialized training or expertise for their use.
In this issue of Cancer Investigation, Buetow [22] describes the NCI Center for Bioinformatics (NCICB), a new endeavor of the National Cancer Institute that provides support for several of its key initiatives, including 1) the Cancer Genome Anatomy Project, 2) the Molecular Classification of Cancer, 3) mouse models, and 4) clinical trials. The major purpose of the NCICB is to provide the appropriate means to capture, integrate, and redistribute research data generated through NCI research initiatives. To accomplish this goal, the NCICB has established several repositories of information (including experimental protocols, data, data analysis, and other shared resources), internet portals for access, and computer-based tools for searching, retrieving, and manipulating the data contained in these databases. Further, the NCICB has extended these resources in the establishment of the Cancer Molecular Analysis Project, the goal of which is to identify and evaluate molecular targets in cancer. The Cancer Molecular Analysis Project draws from a number of valuable cancer research resources, including databases from both NIH and non-NIH institutions, that combine to generate a computer-based infrastructure for accessing integrated cancer information and data on molecular profiles, molecular targets, targeted agents, and trials of targeted agents. This project is of immense value to both the NCI and the cancer researcher.
Current investigations by basic scientists and clinical scientists are aimed at characterization and clarification of the numerous molecular pathways to neoplastic transformation [23], development of new methodologies for identification of molecular markers of disease [24], and exploitation of molecular markers of neoplasia for the development of molecular assays that are useful for detection, diagnosis, or prognostication of human cancers [25, 26]. The development of new internet-based bioinformatics tools for cancer researchers, coupled with increasing availability of and access to databases maintained by public and private entities, and expansion of information networks to encompass more individual laboratories and investigators, will ultimately result in the more rapid and efficient exchange of information generated by cancer researchers. The formation of the NCICB and the open availability of its databases, information networks, and web-based tools represent important steps that will significantly strengthen the computational capabilities of research laboratories worldwide. These new resources will ultimately benefit our efforts to understand cancer pathogenesis and pathobiology and to translate this understanding into tools for detection, diagnosis, and appropriate treatment of cancer.
References
- Esch R. K. Basic nucleic acid procedures. Molecular Diagnostics for the Clinical Laboratorian, W. B. Coleman, G. J. Tsongalis. Humana Press, Totowa, NJ 1997; 35–60
- Presnell S. C. Nucleic acid blotting techniques: theory and practice. Molecular Diagnostics for the Clinical Laboratorian, W. B. Coleman, G. J. Tsongalis. Humana Press, Totowa, NJ 1997; 63–87
- Tsongalis G. J., Coleman W. B. Molecular oncology: new developments in the application of molecular technologies in the diagnosis and prognosis of human cancers. Cancer Investig. 1998; 16: 485–502
- Saiki R. K., Gelfand D. H., Stoffel S., Scharf S. J., Higuchi R., Horn G. T., Mullis K. B., Erlich H. A. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 1988; 239: 487–491
- Foord O. S., Rose E. A. Long-distance PCR. Genome Res. 1994; 3: S149–161
- Heid C. A., Stevens J., Livak K. J., Williams P. M. Real time quantitative PCR. Genome Res. 1996; 6: 986–994
- Presnell S. C. Essential concepts and techniques in molecular biology. The Molecular Basis of Human Cancer, W. B. Coleman, G. J. Tsongalis. Humana Press, Totowa, NJ 2002; 25–42
- Golub T. R., Slonim D. K., Tamayo P., Huard C., Gaasenbeek M., Mesirov J. P., Coller H., Loh M. L., Downing J. R., Caligiuri M. A., Bloomfield C. D., Lander E. S. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999; 286: 531–537
- Brown P. O., Botstein D. Exploring the new world of the genome with DNA microarrays. Nat. Genet. 1999; 21: 33–37
- Garber M. E., Troyanskaya O. G., Schluens K., Petersen S., Thaesler Z., Pacyna-Gengelbach M., van de Rijn M., Rosen G. D., Perou C. M., Whyte R. I., Altman R. B., Brown P. O., Botstein D., Petersen I. Diversity of gene expression in adenocarcinoma of the lung. Proc. Natl. Acad. Sci. U. S. A. 2001; 98: 13784–13789
- Bhattacharjee A., Richards W. G., Staunton J., Li C., Monti S., Vasa P., Ladd C., Beheshti J., Bueno R., Gillette M., Loda M., Weber G., Mark E. J., Lander E. S., Wong W., Johnson B. E., Golub T. R., Sugarbaker D. J., Meyerson M. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. U. S. A. 2001; 98: 13790–13795
- Hacia J. G. Resequencing and mutational analysis using oligonucleotide microarrays. Nat. Genet. 1999; 21: 42–47
- Hacia J. G., Collins F. S. Mutational analysis using oligonucleotide microarrays. J. Med. Genet. 1999; 36: 730–736
- Drmanac R., Drmanac S. Sequencing by hybridization arrays. Methods Mol. Biol. 2001; 170: 39–51
- Waterston R. H., Lander E. S., Sulston J. E. On the sequencing of the human genome. Proc. Natl. Acad. Sci. U. S. A. 2002; 99: 3712–3716
- Lindblad-Toh K., Lander E. S., McPherson J. D., Waterston R. H., Rodgers J., Birney E. Progress in sequencing the mouse genome. Genes. J. Genet. Dev. 2001; 31: 137–141
- Goffeau A. Four years of post-genomic life with 6000 yeast genes. FEBS Lett. 2000; 480: 37–41
- Pennisi E. Keeping genome databases clean and up to date. Science 1999; 286: 447–450
- Boguski M. S. Biosequence exegesis. Science 1999; 286: 453–455
- Varmus H. Genomic empowerment: the importance of public databases. Nat. Genet. Suppl. 2002; 32: 3
- Roos D. S. Bioinformatics: trying to swim in a sea of data. Science 2001; 291: 1260–1261
- Buetow K. H. The NCI Center for Bioinformatics (NCICB): building a foundation for in-silico biomedical research. Cancer Invest 2004; 22(1), in press
- Coleman W. B., Tsongalis G. J. The role of genomic instability in the development of human cancer. The Molecular Basis of Human Cancer, W. B. Coleman, G. J. Tsongalis. Humana Press, Totowa, NJ 2002; 115–142
- Jimenez-Sanchez G., Childs B., Valle D. Human disease genes. Nature 2001; 409: 853–855
- Evans W. E., Relling M. V. Pharmacogenomics: translating functional genomics into rational therapeutics. Science 1999; 286: 487–491
- Futreal P. A., Kasprzyk A., Birney E., Mullikin J. C., Wooster R., Stratton M. R. Cancer and genomics. Nature 2001; 409: 850–852