Commentary

Pharmacogenomics and Bioinformatics: PharmGKB

Pages 501-505 | Published online: 29 Mar 2010

Abstract

The NIH initiated the PharmGKB in April 2000. Its primary mission was to create a repository of primary data, to build tools for tracking associations between genes and drugs, and to catalog the location and frequency of genetic variations known to impact drug response. Over the past 10 years, new technologies have shifted research from candidate gene pharmacogenetics to phenotype-based pharmacogenomics, with a consequent explosion of data. PharmGKB has refocused on curating knowledge rather than housing primary genotype and phenotype data, and now captures more complex relationships between genes, variants, drugs, diseases and pathways. Going forward, the challenges are to provide the tools and knowledge needed to plan and interpret genome-wide pharmacogenomics studies, to predict gene–drug relationships based on shared mechanisms and to support data-sharing consortia investigating clinical applications of pharmacogenomics.

Past

The Pharmacogenomics Knowledge Base (PharmGKB) began in 2000 as one of the first ‘post-genomic’ databases Citation[1]. At that time, there was no standard format for the description and storage of genotype and phenotype data from pharmacogenetic studies. An important challenge was to maintain the quality of data without compromising the privacy of subjects Citation[2]. As data collection methods advanced and pharmacogenomics outpaced pharmacogenetics, the PharmGKB adapted to increasing volumes of data and new ways of presenting them. Relationships were built with other resources, such as the University of California Santa Cruz (CA, USA) Genome Browser Citation[3], DrugBank Citation[4] and BioPAX Citation[5], to enhance the knowledge base by selecting, aggregating and annotating data relevant to pharmacogenomics (Table 1).

Figure 1. The PharmGKB homepage has targeted searching.

The gene search box returns links to genes associated with the search term. Example queries can be viewed and include: search by gene symbol, drug, disease, drug/disease combination, Entrez Gene ID, PubMed ID and protein name Citation[102].


Initially, the main aim was to gather highly detailed primary data from the community at large and, specifically, from the Pharmacogenetics Research Network Citation[6], which comprised more than ten groups across the USA spanning a variety of gene, drug and disease interests, from asthma to thiopurine S-methyltransferase. The PharmGKB team worked with these groups to define in detail how gene variant data should be described and how they were obtained, which formed the basis of the PharmGKB XML schema Citation[7]. This schema allowed many useful comparisons, such as comparing sequencing data from one researcher with RFLP data from another group working on the same gene, and viewing the frequencies of variants within and across sample sets.
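As a rough illustration of the last comparison, the sketch below computes allele frequencies within each sample set and across the pooled sets. It does not reflect the PharmGKB XML schema itself; the function names, the genotype representation (a pair of allele strings per subject) and all genotype values are hypothetical and chosen only for demonstration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]  # assumed representation: one allele pair per subject


def allele_frequencies(genotypes: List[Genotype]) -> Dict[str, float]:
    """Return the frequency of each allele within one sample set."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def pooled_frequencies(sample_sets: Dict[str, List[Genotype]]) -> Dict[str, float]:
    """Return allele frequencies across all sample sets combined."""
    pooled = [gt for genotypes in sample_sets.values() for gt in genotypes]
    return allele_frequencies(pooled)


if __name__ == "__main__":
    # Hypothetical genotype calls for one variant in two sample sets.
    sample_sets = {
        "set_a": [("C", "C"), ("C", "T"), ("T", "T"), ("C", "C")],
        "set_b": [("C", "C"), ("C", "T")],
    }
    for name, genotypes in sample_sets.items():
        print(name, allele_frequencies(genotypes))
    print("pooled", pooled_frequencies(sample_sets))
```

Pooling the genotype calls before counting, rather than averaging the per-set frequencies, weights each sample set by its size.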

Defining phenotypes in a computationally robust manner was another challenge. Several vocabularies and ontologies were tested for describing clinical entities for the electronic medical record, but these lacked the kind of molecular detail needed for many of the Pharmacogenetics Research Network studies. The idea was to be able to integrate data across several studies, such as combining irinotecan area under the curve data from patients in a study at the University of Chicago (IL, USA) with those from Washington University (MO, USA). The computational challenges in combining such data are substantial; data may be provided in different units and collected under different conditions. Standardization requires trained curators who understand the relevant phenotypes. It can differ for every dataset, is expensive and is not feasible for all phenotype data. Therefore, the PharmGKB adopted a two-level procedure. Curators capture and tag metadata for all submitted studies without attempting standardization by default. If a dataset is of particular importance and impact, it is curated to enable comparisons across studies, for example, international normalized ratios and genotypes across the datasets from the International Warfarin Pharmacogenetics Consortium Citation[8].
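The two-level procedure can be pictured with a minimal sketch, given here under stated assumptions rather than as a description of PharmGKB's actual curation software. The `Submission` class, the function names, the unit labels and the area under the curve values are all hypothetical. Level one only tags metadata; level two converts values to a common unit and refuses units it does not recognize, so that a curator has to intervene.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Conversion factors to a common unit (ng*h/mL). The unit labels are assumptions.
TO_NG_H_PER_ML = {
    "ng*h/mL": 1.0,
    "ug*h/L": 1.0,      # 1 ug/L equals 1 ng/mL
    "ug*h/mL": 1000.0,  # 1 ug/mL equals 1000 ng/mL
}


@dataclass
class Submission:
    """Hypothetical record for one submitted phenotype dataset."""
    study: str
    drug: str
    phenotype: str
    unit: str
    values: List[float]
    tags: Set[str] = field(default_factory=set)


def tag_metadata(sub: Submission) -> Submission:
    """Level 1: record searchable metadata without altering the values."""
    sub.tags.update({sub.drug.lower(), sub.phenotype.lower(), sub.unit})
    return sub


def standardize(sub: Submission) -> List[float]:
    """Level 2: convert values to ng*h/mL for cross-study comparison."""
    if sub.unit not in TO_NG_H_PER_ML:
        raise ValueError(f"Unit {sub.unit!r} needs manual curation")
    factor = TO_NG_H_PER_ML[sub.unit]
    return [value * factor for value in sub.values]


if __name__ == "__main__":
    # Hypothetical values, not taken from any real study.
    study_a = tag_metadata(Submission("study_a", "Irinotecan", "AUC", "ug*h/mL", [1.5, 2.0]))
    study_b = tag_metadata(Submission("study_b", "Irinotecan", "AUC", "ng*h/mL", [1800.0]))
    combined = standardize(study_a) + standardize(study_b)
    print(combined)
```

Unit conversion is only the simplest part of the problem; differences in collection conditions, which the text also mentions, cannot be resolved by a lookup table and still call for curator judgment.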

As a central knowledge-sharing site, PharmGKB noticed an opportunity to facilitate data-sharing consortia, in which investigators with complementary data create a collaboration based on a common scientific interest and the ability to combine datasets. PharmGKB then uses its curatorial staff to integrate, aggregate and annotate the contributing datasets. We have facilitated the formation of several consortia (for the pharmacogenomics of warfarin, tamoxifen and irinotecan), bringing together groups to create diverse sample sets that provide greater statistical power to detect complex associations Citation[9]. The success of these consortia relies on a trustworthy framework for collaboration, as participants are often scientific competitors in other venues. PharmGKB involvement ensures high-quality curation, including developing a standard template to capture the data, integrating and recoding the data, formatting them to allow comparison across the many groups and annotating them with metadata to allow computational searching. Most importantly, PharmGKB acts as an independent party and has developed a reputation as a dependable and scientifically neutral collaborator.

Present

Over the last decade, PharmGKB has collected and annotated pharmacogenomic data from a variety of sources. The published literature is a major source of knowledge, but the volume of papers is so vast that finding the information is cumbersome. We have developed structures to tag and describe relationships in the literature such that they can be found by search mechanisms but still be understood by readers. Gene, drug, disease and variant relationships have been identified and labeled with categories of interest (clinical outcomes; pharmacodynamics [PD]; pharmacokinetics [PK]; cellular and molecular functional assays; and genotype data) Citation[10]. The data are accessed from the related gene, drug and disease tabs on individual gene, drug and disease pages. The top gene pages include CYP2D6, ABCB1 and CYP2C9; the top drug pages visited are warfarin, amiodarone and clopidogrel; and the top diseases include Torsades de pointes, breast neoplasms and epilepsy (Table 2). We have over 4000 literature annotations (as of 17th November 2009) that link gene, drug and disease relationships. Natural language processing is used to streamline the identification of articles of interest to annotate, and we are developing tools to speed up the annotation process Citation[11].

From knowledge of the literature, PharmGKB scientists develop and maintain drug pathways with production-quality graphics and supporting scientific evidence. The PharmGKB currently has 60 curated pathways (as of 17th November 2009) illustrating PD and/or PK aspects for over 180 drugs. The top pathways viewed include the platelet aggregation PD pathway, the codeine and morphine PK pathway and the nicotine PK pathway (Table 2). The repository of relationships built from the literature annotations now allows us to generate automated networks that can be used to start new PharmGKB pathways or downloaded for users to explore with their own methods.

PharmGKB curators have not only annotated gene–drug relationships, but have also annotated specific human variations of importance to pharmacogenomics in the ‘variant annotation project’ Citation[12]. Curators summarize the findings of pharmacogenomic relevance regarding a genomic variant and associate these with the appropriate genes, drugs and diseases. Mapping the genomic variants is not a trivial task; many papers either do not include dbSNP identifiers or have them hidden within the methods sections, or, when they are used, the authors often neglect to specify which base is associated with the phenotype Citation[13]. We have built a dictionary to cross-reference the various names for variants used in the literature and databases (3000 variant annotations as of 17th November 2009). We are participating in efforts by the biocuration community to require the inclusion of standard identifiers for variants, such as dbSNP rs numbers, in publications. We also write detailed online summaries of very important pharmacogenes (41 to date) and their variants, many of which have also been published Citation[14–23].
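A cross-referencing dictionary of this kind can be sketched as a lookup from normalized variant names to a single dbSNP rs number. The sketch is illustrative only and is not PharmGKB's implementation: the function names and the normalization rule are assumptions, and the two alias entries are commonly used names included as examples rather than content drawn from the PharmGKB dictionary.

```python
import re
from typing import Dict, List, Optional

# Illustrative entries only: each rs number maps to names seen in the literature.
ALIASES: Dict[str, List[str]] = {
    "rs1799853": ["CYP2C9*2", "CYP2C9 430C>T", "CYP2C9 R144C", "CYP2C9 Arg144Cys"],
    "rs9923231": ["VKORC1 -1639G>A", "VKORC1 G-1639A"],
}


def normalize(name: str) -> str:
    """Lowercase a name and drop whitespace, colons and underscores (assumed rule)."""
    return re.sub(r"[\s:_]+", "", name).lower()


def build_index(aliases: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized alias, and the rs number itself, to its rs number."""
    index: Dict[str, str] = {}
    for rsid, names in aliases.items():
        index[normalize(rsid)] = rsid
        for name in names:
            index[normalize(name)] = rsid
    return index


def lookup(index: Dict[str, str], name: str) -> Optional[str]:
    """Return the rs number for a variant name, or None if it is unknown."""
    return index.get(normalize(name))


if __name__ == "__main__":
    index = build_index(ALIASES)
    for query in ["cyp2c9 *2", "CYP2C9:R144C", "VKORC1 -1639G>A", "CYP2C9 X999Y"]:
        print(query, "->", lookup(index, query))
```

A lookup like this can only resolve names that already have an entry. It cannot recover which base is associated with the phenotype when a paper omits it, which is one reason standard identifiers in publications matter.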

Our user interface now makes it easier for users to go directly to genes, variants, drugs and pathways of interest. For example, users can find genes related to their drug of interest by entering the drug name in the gene search box, or view annotations and frequency data for variants related to their drug of interest by searching with the variant box.

PharmGKB has received overwhelmingly positive feedback from users regarding its usefulness in research, as well as in educational programs and presentations. PharmGKB is used to introduce the concept of pharmacogenomics to students in medicine, pharmacy, genetics, toxicology and public health, as well as for the continuing education of medical professionals, including physicians, pharmacists and nurses. We have also used expertise from PharmGKB to pilot a pharmacogenomics project for high school students and teachers. DNATwist Citation[101] is an interactive website that introduces basic concepts of pharmacogenomics; it is also being adapted for use at the Tech Museum of Innovation (CA, USA) Citation[24].

Future

The field of pharmacogenomics is at a critical juncture. Some have criticized the slow pace at which pharmacogenomic interventions are entering routine clinical practice. These critics fail to appreciate the important advances made in understanding the genetic basis of drug response, an important prerequisite for using genetics to intervene. For example, genome-wide association studies focused on drug response are just now emerging, with only a handful published to date Citation[25]. The criticisms associated with genome-wide association studies for complex disease Citation[26] may be less relevant for many drug responses, where common variants may have more explanatory power Citation[27]. At the same time, our ability to sequence entire genes, exomes and even human genomes is giving us unprecedented access to rare variations, whose interpretation will be critical Citation[28]. The key underlying need is to move from the observation of an association to an understanding of the mechanism. Only with a mechanistic understanding can we find the causative common variants in genome-wide association studies, and only with mechanistic models can we determine which rare variants (and in which combinations) explain drug-response phenotypes. PharmGKB will continue to provide the platform to examine the relationships between variants and drug response, adding new tools as new data are gathered and disseminating them to researchers and educators.

Table 1. Relationships and exchange of information with other resources allow PharmGKB to focus on the subset of data relevant for pharmacogenomics.

Table 2. Top ten PharmGKB gene, drug, disease and pathway pages for 2009.

Acknowledgements

The authors would like to thank PharmGKB team members past and present without whom none of this would be possible: Dorit Berlin, John Conroy, Katrina Easton, Ray Fergerson, Li Gong, Mei Gong, Winston Gor, Joan Hebert, Tina Hernandez-Boussard, Micheal Hewett, Amy Hodge, Laura Hodges, Daniel Holbert, Mark Kiuchi, Steve Lin, Feng Liu, Xing Jian Lou, Charity Lu, Andrew MacBride, Diane Oliver, Connie Oshiro, Ryan Owen, Daniel Rubin, Katrin Sangkuhl, Farhad Shafa, Ravi Shankar, Rebecca Tang, TC Truong, Ryan Whaley, Michelle Whirl Carrillo, Mark Woon and Tina Zhou.

Financial & competing interests disclosure

This work is supported by the NIH/NIGMS (U01GM61374). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.


Bibliography

  • Klein TE, Chang JT, Cho MK et al.: Integrating genotype and phenotype information: an overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base. Pharmacogenomics J. 1(3), 167–170 (2001).
  • Lin Z, Owen AB, Altman RB: Genetics. Genomic research and human subject privacy. Science 305(5681), 183 (2004).
  • Rhead B, Karolchik D, Kuhn RM et al.: The UCSC genome browser database: update 2010. Nucleic Acids Res. 38(Database issue), D613–D619 (2009).
  • Wishart DS, Knox C, Guo AC et al.: DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34(Database issue), D668–D672 (2006).
  • Luciano JS: PAX of mind for pathway researchers. Drug Discov. Today 10(13), 937–942 (2005).
  • Giacomini KM, Brett CM, Altman RB et al.: The pharmacogenetics research network: from SNP discovery to clinical drug response. Clin. Pharmacol. Ther. 81(3), 328–345 (2007).
  • Whirl-Carrillo M, Woon M, Thorn CF, Klein TE, Altman RB: An XML-based interchange format for genotype–phenotype data. Hum. Mutat. 29(2), 212–219 (2008).
  • Klein TE, Altman RB, Eriksson N et al.: Estimation of the warfarin dose with clinical and pharmacogenetic data. N. Engl. J. Med. 360(8), 753–764 (2009).
  • Owen RP, Altman RB, Klein TE: PharmGKB and the International Warfarin Pharmacogenetics Consortium: the changing role for pharmacogenomic databases and single-drug pharmacogenetics. Hum. Mutat. 29(4), 456–460 (2008).
  • Altman RB, Flockhart DA, Sherry ST, Oliver DE, Rubin DL, Klein TE: Indexing pharmacogenetic knowledge on the World Wide Web. Pharmacogenetics 13(1), 3–5 (2003).
  • Garten Y, Altman RB: Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text. BMC Bioinformatics 10(Suppl. 2), S6 (2009).
  • Sangkuhl K, Berlin DS, Altman RB, Klein TE: PharmGKB: understanding the effects of individual genetic variants. Drug Metab. Rev. 40(4), 539–551 (2008).
  • Yu W, Ned R, Wulf A, Liu T, Khoury MJ, Gwinn M: The need for genetic variant naming standards in published abstracts of human genetic association studies. BMC Res. Notes 2, 56 (2009).
  • Owen RP, Sangkuhl K, Klein TE, Altman RB: Cytochrome P450 2D6. Pharmacogenet. Genomics 19(7), 559–562 (2009).
  • Hildebrandt M, Adjei A, Weinshilboum R et al.: Very important pharmacogene summary: sulfotransferase 1A1. Pharmacogenet. Genomics 19(6), 404–406 (2009).
  • Thorn CF, Klein TE, Altman RB: PharmGKB summary: very important pharmacogene information for angiotensin-converting enzyme. Pharmacogenet. Genomics 20(2), 143–146 (2009).
  • Hodges LM, Markova SM, Chinn LW et al.: Very important pharmacogene summary: ABCB1 (MDR1, P-glycoprotein). Pharmacogenet. Genomics (2010) (Epub ahead of print). PMID: 20216335.
  • Wang L, Pelleymounter L, Weinshilboum R et al.: Very important pharmacogene summary: thiopurine S-methyltransferase. Pharmacogenet. Genomics (2010) (Epub ahead of print). PMID: 20154640.
  • van Booven D, Marsh S, McLeod H et al.: Cytochrome P450 2C9–CYP2C9. Pharmacogenet. Genomics 20(4), 277–281 (2010). PMID: 20150829.
  • Oshiro C, Thorn CF, Roden DM, Klein TE, Altman RB: KCNH2 pharmacogenomics summary. Pharmacogenet. Genomics (2010) (Epub ahead of print). PMID: 20150828.
  • Medina MW, Sangkuhl K, Klein TE, Altman RB: PharmGKB: very important pharmacogene – HMGCR. Pharmacogenet. Genomics (2010) (Epub ahead of print). PMID: 20084049.
  • Owen RP, Gong L, Sagrieya H, Klein TE, Altman RB: VKORC1 pharmacogenomics summary. Pharmacogenet. Genomics (2009) (Epub ahead of print). PMID: 19940803.
  • Litonjua AA, Gong L, Duan QL et al.: Very important pharmacogene summary ADRB2. Pharmacogenet. Genomics 20(1), 64–69 (2010).
  • Berlin DS, Person MG, Mittal A et al.: DNATwist: a web-based tool for teaching middle and high school students about pharmacogenomics. Clin. Pharmacol. Ther. (2010) (In Press).
  • Crowley JJ, Sullivan PF, McLeod HL: Pharmacogenomic genome-wide association studies: lessons learned thus far. Pharmacogenomics 10(2), 161–163 (2009).
  • Goldstein DB: Common genetic variation and human traits. N. Engl. J. Med. 360(17), 1696–1698 (2009).
  • Nelson MR, Bacanu SA, Mosteller M et al.: Genome-wide approaches to identify pharmacogenetic contributions to adverse drug reactions. Pharmacogenomics J. 9(1), 23–33 (2009).
  • Tabone T: Mutations, structural variations, and genome-wide resequencing: where to from here in our understanding of disease and evolution? Hum. Mutat. 29(6), 886–890 (2008).
