| Dr Tudor Groza | Vinay Kulkarni | Edwin Zhang | Dr Chih-Hao Yu |
| Craig McNamara | Joshua Agudo | Dr Simon Kocbek | Dr Frank Lin |
Dr Tudor Groza
Tel: +61 (0)2 9355 5717
Phenotype is fundamentally important to identifying the cause or origin of both rare and complex disorders, and to substantially reducing the search space for genomic variation. The major limitations on achieving an accurate and timely diagnosis stem from:
- the need to interpret and prioritise millions of variants per patient, and
- the current incomplete linkage of detailed phenotypic terms to genomic variants.
The combination of a detailed phenotype profile, acquired seamlessly and unobtrusively, with the corresponding clinical genomic data, can accelerate the identification of disease aetiology, facilitate disorder stratification and inform prognosis.
The Phenomics Team focuses on two core streams of research and development: phenotype analytics and community-driven knowledge curation.
Phenotype analytics encompasses the entire clinical phenotyping spectrum, from representation of clinical phenotypes using structured vocabularies to acquisition from unstructured data sources (e.g. electronic health records (EHRs), scientific literature or clinical reports), and from cross-species integration to decision making (e.g. disorder prediction or patient matchmaking). Two highlights of this stream are high-precision phenotype concept recognition and deep phenotyping to support clinical interpretation of genomic data.
Phenotype concept recognition is particularly complex due to the variability in representation and lexical expression (i.e. use of a controlled or defined vocabulary) of phenotypes. Furthermore, the source of this description (e.g., EHRs, scientific literature, and clinical reports) provides additional challenges, as clinical symptoms may differ in their medical expression.
The team is actively developing a staged pipeline for automatic recognition of phenotype descriptions, including orthogonal dimensions such as degree of severity or negation, and context-dependent phenotype-disorder associations. At the same time, the quality and range of phenotype applications are limited by the quality and availability of the underlying data. Current limitations arise from the challenges of representing and acquiring more complex phenotypic dimensions. Furthermore, the interpretation of genomic data relies heavily on phenotypes that are both rich and temporal. The team aims to support the process of clinical diagnosis by developing tools and techniques that enable the acquisition and manipulation of high-quality, longitudinal phenotypic data.
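As a rough illustration of what recognising a phenotype together with its orthogonal dimensions can involve, the sketch below matches a tiny hand-written lexicon of HPO labels and looks at a few preceding words for negation and severity cues. It is not the team's pipeline: the lexicon, the cue lists, the window size and the names `recognise` and `Mention` are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Toy lexicon (assumption): a few HPO labels mapped to their identifiers.
LEXICON = {
    "seizure": "HP:0001250",
    "seizures": "HP:0001250",
    "microcephaly": "HP:0000252",
    "short stature": "HP:0004322",
}
# Hypothetical cue lists; a real system would need far richer ones.
NEGATION_CUES = {"no", "not", "without", "denies", "absent"}
SEVERITY_CUES = {"mild", "moderate", "severe", "profound"}
WINDOW = 3  # how many preceding tokens are inspected for cues


@dataclass
class Mention:
    hpo_id: str
    text: str
    negated: bool
    severity: Optional[str]


def recognise(sentence: str) -> List[Mention]:
    """Find lexicon labels and attach negation and severity from nearby cues."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    mentions: List[Mention] = []
    last_end = 0  # cues before the previous mention must not leak forward
    i = 0
    while i < len(tokens):
        match = None
        for n in (2, 1):  # prefer the longer label starting at position i
            span = tokens[i:i + n]
            if len(span) == n and " ".join(span) in LEXICON:
                match = (" ".join(span), n)
                break
        if match is None:
            i += 1
            continue
        label, n = match
        context = tokens[max(last_end, i - WINDOW):i]
        negated = any(t in NEGATION_CUES for t in context)
        severity = next((t for t in reversed(context) if t in SEVERITY_CUES), None)
        mentions.append(Mention(LEXICON[label], label, negated, severity))
        i += n
        last_end = i
    return mentions


if __name__ == "__main__":
    for m in recognise("Severe microcephaly, no seizures, mild short stature."):
        print(m)
```

Exact string matching of this kind does not cope with the lexical variability described above; it only shows how severity and negation can be carried alongside each recognised concept.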
Community-driven knowledge curation
The second stream revolves around the development of platforms to enable expert crowdsourcing of rare disorder knowledge. Meeting the needs of the translational community requires new approaches for managing input from a broad range of stakeholders. The curation platforms developed by the Phenomics Team combine well-established editorial processes (including fine-grained micro-attributions) with visually appealing user interfaces to enable collaborative editing of disorder-phenotype-genotype knowledge.
The software developed by the Phenomics Team accelerates translational and clinical applications of genomic technologies through harmonising phenomic information and the intelligent distillation of its informative content. It also enables phenotypic analyses to provide a translational bridge from genome-scale biology to a patient-centered view on human disease pathogenesis.
An end-to-end solution for acquiring, managing and using phenotype data for clinical genomics.
The ecosystem includes a Web test request form, a patient data management platform (Patient Archive), a variant store built for large-scale data storage and retrieval (including low quality reference data from gVCFs), a pedigree editing tool and a clinical report generation facility. Demos and customisations are available on request.
A clinical grade phenotype-oriented patient data management platform combining the richness of the Human Phenotype Ontology with highly intuitive user interfaces to aid the discovery and decision-making process in the context of clinical genomics.
Patient Archive is the only platform that enables clinicians to use free-text clinical notes for structured patient phenotyping, store the data securely (sensitive patient data is encrypted) and share the data via a fine-grained access control model. Furthermore, the platform provides support for intelligent analytics, focused on disease exploration, patient matchmaking and prescriptive phenotyping. The demo version of the latest release is available at http://patientarchive.org. Get in touch with us for a demo of the latest development version.
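One common way to think about phenotype-based patient matchmaking is to compare patients by the overlap of their ontology terms once each term has been expanded with its ancestors. The sketch below does this with a Jaccard score over a small, made-up hierarchy. It is not how Patient Archive works internally: the `PARENTS` map is a toy structure rather than the real Human Phenotype Ontology, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Toy is-a hierarchy (assumption): child term -> parent terms.
PARENTS: Dict[str, Set[str]] = {
    "focal seizure": {"seizure"},
    "seizure": {"nervous system abnormality"},
    "microcephaly": {"head abnormality"},
    "short stature": {"growth abnormality"},
    "nervous system abnormality": {"phenotypic abnormality"},
    "head abnormality": {"phenotypic abnormality"},
    "growth abnormality": {"phenotypic abnormality"},
    "phenotypic abnormality": set(),
}


def with_ancestors(terms: Iterable[str]) -> Set[str]:
    """Return the given terms together with all of their ancestors."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS.get(term, ()))
    return closed


def similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard overlap of the two ancestor-expanded term sets."""
    ea, eb = with_ancestors(a), with_ancestors(b)
    union = ea | eb
    return len(ea & eb) / len(union) if union else 0.0


def rank_matches(query: Set[str], cohort: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Score every patient in the cohort against the query, best match first."""
    scored = [(pid, similarity(query, terms)) for pid, terms in cohort.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    cohort = {"A": {"seizure", "microcephaly"}, "B": {"short stature"}}
    print(rank_matches({"focal seizure", "microcephaly"}, cohort))
```

Expanding with ancestors lets a patient annotated with a specific term still match one annotated with a more general term, which exact term comparison would miss.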
A unique solution for high-precision phenotype concept recognition using the Human Phenotype Ontology (HPO).
The underlying technique has been built to cater for the high lexical variability associated with clinical phenotypes, as well as for decomposing coordinated terms and detecting non-canonical phenotypes. The concept recogniser (HPO CR) is part of Patient Archive and serves other platforms, such as the University of Washington's MyGene2 (https://www.mygene2.org/MyGene2/#/about) and Baylor's OMIM Explorer (http://omimexplorer.research.bcm.edu:3838/omim_explorer/). It has also been used to generate the first HPO annotation dataset for common diseases, available via our PubMed Browser (http://pubmed-browser.human-phenotype-ontology.org/). The HPO CR package is freely available for academic use on request.
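To make "decomposing coordinated terms" concrete: a phrase such as "short and broad fingers" describes two phenotypes that share one head noun. The sketch below is a deliberately naive heuristic written for illustration, not the technique used by the HPO concept recogniser. It assumes the shared head is the final word of the phrase, and the function name `decompose` is hypothetical.

```python
import re
from typing import List


def decompose(phrase: str) -> List[str]:
    """Split a coordinated phrase and copy the shared head noun onto each part."""
    pieces = re.split(r",|\band\b|\bor\b", phrase.lower())
    parts = [p.strip() for p in pieces if p.strip()]
    if len(parts) < 2:
        return parts  # nothing to decompose
    last_words = parts[-1].split()
    if len(last_words) < 2:
        return parts  # e.g. "seizures and microcephaly": no shared head
    head = last_words[-1]  # assumption: the final word is the shared head
    expanded = []
    for part in parts[:-1]:
        # Leave a part untouched if it already ends with the head noun.
        expanded.append(part if part.split()[-1] == head else f"{part} {head}")
    expanded.append(parts[-1])
    return expanded


if __name__ == "__main__":
    print(decompose("short and broad fingers"))
    print(decompose("short, broad and curved fingers"))
```

The heuristic breaks on coordinations of independent phenotypes in which the last item has several words (for example "seizures and short stature"), which is one reason real concept recognition needs more than string splitting.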
Coming soon: Journal manuscript annotator, for creating structured pheno-packets
An innovative platform for curating domain knowledge, focused on intuitive and friendly user interfaces and workflow-based knowledge curation and acquisition.
The system will soon be available for public use in the context of the Orphanet data curation initiative.
- Groza T, Köhler S, Moldenhauer D, Vasilevsky N, Baynam G, Zemojtel T, Schriml LM, Kibbe WA, Schofield PN, Beck T, Vasant D, Brookes AJ, Zankl A, Washington NL, Mungall CJ, Lewis SE, Haendel MA, Parkinson H, Robinson PN. The Human Phenotype Ontology: Semantic Unification of Common and Rare Disease. Am J Hum Genet. 2015;97(1):111-24.
- Oellrich A, Collier N, Groza T, Rebholz-Schuhmann D, Shah N, Bodenreider O, Boland MR, Georgiev I, Liu H, Livingston K, Luna A, Mallon AM, Manda P, Robinson PN, Rustici G, Simon M, Wang L, Winnenburg R, Dumontier M. The digital revolution in phenotyping. Brief Bioinform. 2015 Sep 29. pii: bbv083.
- Groza T, Köhler S, Doelken S, Collier N, Oellrich A, Smedley D, Couto FM, Baynam G, Zankl A, Robinson PN. Automatic concept recognition using the human phenotype ontology reference and test suite corpora. Database (Oxford). 2015 Feb 27;2015. pii: bav005.
- Mungall CJ, Washington NL, Nguyen-Xuan J, Condit C, Smedley D, Köhler S, Groza T, Shefchek K, Hochheiser H, Robinson PN, Lewis SE, Haendel MA. Use of model organism and disease databases to support matchmaking for human disease gene discovery. Hum Mutat. 2015 Oct;36(10):979-84.
- Baynam G, Walters M, Claes P, Kung S, LeSouef P, Dawkins H, Bellgard M, Girdea M, Brudno M, Robinson P, Zankl A, Groza T, Gillett D, Goldblatt J. Phenotyping: targeting genotype's rich cousin for diagnosis. J Paediatr Child Health. 2015 Apr;51(4):381-6.
- Groza T, Verspoor K. Assessing the impact of case sensitivity and term information gain on biomedical concept recognition. PLoS One. 2015 Mar 19;10(3):e0119091.
- Paul R, Groza T, Hunter J, Zankl A. Inferring characteristic phenotypes via class association rule mining in the bone dysplasia domain. J Biomed Inform. 2014 Apr;48:73-83.
- Collier N, Oellrich A, Groza T. Toward knowledge support for analysis and interpretation of complex traits. Genome Biol. 2013;14(9):214.
- Groza T, Hunter J, Zankl A. Mining skeletal phenotype descriptions from scientific literature. PLoS One. 2013;8(2):e55656.