Article

Phrank measures phenotype sets similarity to greatly improve Mendelian diagnostic disease prioritization

Genetics in Medicine (2018)

Abstract

Purpose

Exome sequencing and diagnosis are beginning to spread across the medical establishment. The most time-consuming part of genome-based diagnosis is the manual step of matching the potentially long list of patient candidate genes to patient phenotypes to identify the causative disease.

Methods

We introduce Phrank (for phenotype ranking), an information theory–inspired method that utilizes a Bayesian network to prioritize candidate diseases or genes, as a stand-alone module that can be run with any underlying knowledgebase and any variant filtering scheme.
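To make the idea of an information-based phenotype set similarity concrete, here is a minimal sketch. It is not the published Phrank implementation: the toy ontology, the gene annotations, and all function names are hypothetical, and the choice to score each shared term by -log2 P(term | parents), estimated from annotation counts, is an assumption about how a Bayesian-network-style score could be set up.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms (the root has none).
PARENTS = {
    "abnormality": [],
    "neuro": ["abnormality"],
    "seizure": ["neuro"],
    "ataxia": ["neuro"],
    "cardiac": ["abnormality"],
    "arrhythmia": ["cardiac"],
}

# Hypothetical gene -> phenotype annotations (illustrative, not real data).
GENE_TERMS = {
    "GENE_A": {"seizure"},
    "GENE_B": {"seizure", "ataxia"},
    "GENE_C": {"arrhythmia"},
    "GENE_D": {"ataxia", "arrhythmia"},
}


def closure(terms):
    """Return the given terms together with all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


GENE_CLOSURES = {gene: closure(terms) for gene, terms in GENE_TERMS.items()}


def conditional_information(term):
    """Estimate -log2 P(term | all of its parents) from annotation counts."""
    parents = PARENTS[term]
    with_parents = [c for c in GENE_CLOSURES.values()
                    if all(p in c for p in parents)]
    with_term = [c for c in with_parents if term in c]
    if not with_term:
        return 0.0  # assumption: unannotated terms contribute nothing
    return -math.log2(len(with_term) / len(with_parents))


def set_similarity(terms_a, terms_b):
    """Sum the conditional information of every term shared by both closures."""
    shared = closure(terms_a) & closure(terms_b)
    return sum(conditional_information(t) for t in shared)


def rank_genes(patient_terms, candidate_genes):
    """Order candidate genes by decreasing similarity to the patient's terms."""
    scored = [(set_similarity(patient_terms, GENE_TERMS[g]), g)
              for g in candidate_genes]
    return [g for _, g in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


print(rank_genes({"seizure", "ataxia"}, list(GENE_TERMS)))
```

Because each term is scored conditionally on its parents, a specific shared term adds information on top of its ancestors rather than recounting it. The same scoring could be applied to disease annotations instead of gene annotations.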

Results

Phrank outperforms existing methods at ranking the causative disease or gene when applied to 169 real patient exomes with Mendelian diagnoses. Phrank’s greatest improvement is in disease space, where across all 169 patients it ranks only 3 diseases on average ahead of the true diagnosis, whereas Phenomizer ranks 32 diseases ahead of the causal one.
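The "diseases ranked ahead of the true diagnosis" figure can be read as a simple per-patient count averaged over the cohort. The sketch below illustrates that reading under stated assumptions: the case data and function names are hypothetical, the true diagnosis is assumed to appear in every ranked list, and ties are ignored. How the study itself handled ties or missing diagnoses is not stated here.

```python
def diseases_ranked_ahead(ranked_diseases, true_disease):
    """Count the entries placed before the true diagnosis in a ranked list."""
    return ranked_diseases.index(true_disease)


def mean_ranked_ahead(cases):
    """Average the per-patient counts over (ranked list, true diagnosis) pairs."""
    counts = [diseases_ranked_ahead(ranked, truth) for ranked, truth in cases]
    return sum(counts) / len(counts)


# Hypothetical cases, not data from the study.
cases = [
    (["D1", "D2", "D3"], "D1"),  # true diagnosis ranked first: 0 ahead
    (["D4", "D1", "D2"], "D2"),  # 2 ahead
    (["D3", "D2", "D1"], "D2"),  # 1 ahead
]

print(mean_ranked_ahead(cases))
```

A lower mean indicates that a clinician working down the list reaches the correct diagnosis sooner.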

Conclusions

Using Phrank to rank all patient candidate genes or diseases at the start of a new case will save busy clinicians much time in deriving a genetic diagnosis.

Acknowledgements

We thank Yosuke Tanigawa, Ethan Dyer, Golan Yona, and all other members of the Bejerano Lab for valuable discussions and project feedback. We would also like to thank the European Genome-Phenome Archive (EGA) and the Deciphering Developmental Disorders (DDD) project. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund (grant number HICF-1009-003), a parallel funding partnership between the Wellcome Trust and the Department of Health, and the Wellcome Trust Sanger Institute (grant number WT098051). The views expressed in this publication are those of the author(s) and not necessarily those of the Wellcome Trust or the Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12, granted by the Republic of Ireland REC). The research team acknowledges the support of the National Institute for Health Research, through the Comprehensive Clinical Research Network, as well as the patients and professionals involved in the DDD study, which is deposited in the European Genome-Phenome Archive (EGA). This work was funded in part by the Stanford Graduate Fellowship and CEHG Fellowship to K.A.J., a Bio-X Stanford Interdisciplinary Graduate Fellowship to J.B., the Stanford Pediatrics Department, DARPA, a Packard Foundation Fellowship, and a Microsoft Faculty Fellowship to G.B.

Author information

Author notes

  1. These authors contributed equally: Karthik A. Jagadeesh, Johannes Birgmeier

Affiliations

  1. Department of Computer Science, Stanford University, Stanford, California, 94305, USA

    • Karthik A. Jagadeesh MSc
    • Johannes Birgmeier MSc
    • Cole A. Deisseroth
    • Gill Bejerano PhD
  2. Department of Pediatrics, Stanford University, Stanford, California, 94305, USA

    • Harendra Guturu PhD
    • Aaron M. Wenger PhD
    • Jonathan A. Bernstein MD, PhD
    • Gill Bejerano PhD
  3. Department of Developmental Biology, Stanford University, Stanford, California, 94305, USA

    • Gill Bejerano PhD

Disclosure

The authors declare no conflicts of interest.

Corresponding author

Correspondence to Gill Bejerano PhD.

About this article

DOI

https://doi.org/10.1038/s41436-018-0072-y