Phenomics Team

Team Members

Dr Tudor Groza Vinay Kulkarni Edwin Zhang Dr Chih-Hao Yu
Craig McNamara Joshua Agudo Dr Simon Kocbek Dr Frank Lin


Sindhuja Selvam

Contact

Dr Tudor Groza

Email: t.groza@garvan.org.au

Tel: + 61 (0)2 9355 5717

 

Background

Phenotype is fundamentally important both to identifying the cause of rare and complex disorders and to substantially reducing the search space for genomic variation. The major limitations on achieving an accurate and timely diagnosis stem from:

  • the need to interpret and prioritise millions of variants per patient, and
  • the current incomplete linkage of detailed phenotypic terms to genomic variants.

The combination of a detailed phenotype profile, acquired seamlessly and unobtrusively, with the corresponding clinical genomic data, can accelerate the identification of disease aetiology, facilitate disorder stratification and inform prognosis.

Core directions

The Phenomics Team focuses on two core streams of research and development: phenotype analytics and community-driven knowledge curation.

Phenotype analytics

Phenotype analytics encompasses the entire clinical phenotyping spectrum: from the representation of clinical phenotypes using structured vocabularies to their acquisition from unstructured data sources (e.g. electronic health records (EHRs), scientific literature or clinical reports), and from cross-species integration to decision making (e.g. disorder prediction or patient matchmaking). Two highlights of this stream are high-precision phenotype concept recognition and deep phenotyping to empower the clinical interpretation of genomic data.

Phenotype concept recognition is particularly complex because of the variability in how phenotypes are represented and lexically expressed (for example, whether or not a controlled or defined vocabulary is used). The source of the description (e.g. EHRs, scientific literature or clinical reports) poses additional challenges, as the same clinical symptoms may be expressed differently in each.

The team is actively developing a staged pipeline for the automatic recognition of phenotype descriptions, including orthogonal dimensions such as degree of severity or negation, and of context-dependent phenotype-disorder associations. The quality and range of phenotype applications are, however, limited by the quality and availability of the underlying data. Current limitations arise from the challenges of representing and acquiring more complex phenotypic dimensions. Furthermore, the interpretation of genomic data relies heavily on phenotypes that are both rich and temporal. The team aims to support the process of clinical diagnosis by developing tools and techniques that enable the acquisition and manipulation of high-quality, longitudinal phenotypic data.
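As a purely illustrative sketch of the staged idea described above (it is not the team's pipeline), the Python below runs a dictionary lookup first and then attaches negation and severity from the few words preceding each mention. The lexicon, the cue lists, the window size and all function names are assumptions made for this example; the HPO identifiers should be checked against the current HPO release.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mini-lexicon: surface form -> HPO identifier (illustrative only).
LEXICON = {
    "seizures": "HP:0001250",
    "seizure": "HP:0001250",
    "microcephaly": "HP:0000252",
    "global developmental delay": "HP:0001263",
}
# Assumed cue words; a real system would need far richer context handling.
NEGATION_CUES = ("no", "not", "without", "denies", "absent")
SEVERITY_CUES = ("mild", "moderate", "severe", "profound")


@dataclass
class Mention:
    hpo_id: str
    text: str
    negated: bool
    severity: Optional[str]


def find_candidates(sentence: str):
    """Stage 1: longest-match dictionary lookup, skipping overlapping hits."""
    lowered = sentence.lower()
    hits = []
    for surface in sorted(LEXICON, key=len, reverse=True):
        pattern = r"\b" + re.escape(surface) + r"\b"
        for m in re.finditer(pattern, lowered):
            overlaps = any(m.start() < e and s < m.end() for s, e, _ in hits)
            if not overlaps:
                hits.append((m.start(), m.end(), surface))
    return sorted(hits)


def annotate(sentence: str, window: int = 3) -> List[Mention]:
    """Stage 2: read negation and severity from the preceding `window` words."""
    lowered = sentence.lower()
    mentions = []
    for start, _end, surface in find_candidates(sentence):
        preceding = re.findall(r"[a-z]+", lowered[:start])[-window:]
        negated = any(tok in NEGATION_CUES for tok in preceding)
        severity = next(
            (tok for tok in reversed(preceding) if tok in SEVERITY_CUES), None
        )
        mentions.append(Mention(LEXICON[surface], surface, negated, severity))
    return mentions


for mention in annotate("Severe microcephaly, but no seizures."):
    print(mention)
```

Keeping the lookup and the context stage separate means either can be replaced (for instance, by a more tolerant matcher or a scope-aware negation detector) without touching the other.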

Community-driven knowledge curation

The second stream revolves around the development of platforms to enable expert crowdsourcing of rare disorder knowledge. Meeting the needs of the translational community requires new approaches for managing input from a broad range of stakeholders. The curation platforms developed by the Phenomics Team combine well-established editorial processes (including fine-grained micro-attributions) with visually appealing user interfaces to enable collaborative editing of disorder-phenotype-genotype knowledge.

Software

The software developed by the Phenomics Team accelerates translational and clinical applications of genomic technologies by harmonising phenomic information and intelligently distilling its informative content. It also enables phenotypic analyses that provide a translational bridge from genome-scale biology to a patient-centred view of human disease pathogenesis.

Phenotype-focused clinical genomics ecosystem

An end-to-end solution for acquiring, managing and using phenotype data for clinical genomics.

The ecosystem includes a web-based test request form, a patient data management platform (Patient Archive), a variant store built for large-scale data storage and retrieval (including low-quality reference data from gVCFs), a pedigree editing tool and a clinical report generation facility. Demos and customisations are available on request.

Patient Archive

A clinical grade phenotype-oriented patient data management platform combining the richness of the Human Phenotype Ontology with highly intuitive user interfaces to aid the discovery and decision-making process in the context of clinical genomics. 

Patient Archive is the only platform that enables clinicians to use free-text clinical notes for structured patient phenotyping, store the data securely (sensitive patient data is encrypted) and share the data via a fine-grained access control model. Furthermore, the platform supports intelligent analytics focused on disease exploration, patient matchmaking and prescriptive phenotyping. A demo version of the latest release is available at http://patientarchive.org. Get in touch with us for a demo of the latest development version.
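To make the idea of patient matchmaking concrete, here is a minimal sketch that ranks patients by the overlap of their HPO term sets using Jaccard similarity. This is a deliberately simple stand-in, not a description of how Patient Archive works: production systems typically use ontology-aware semantic similarity. The patient identifiers, term sets and function names are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity of two term collections (0.0 when both are empty)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def rank_matches(
    query: Set[str], cohort: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Return (patient_id, score) pairs, most similar first."""
    scored = [(pid, jaccard(query, terms)) for pid, terms in cohort.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical cohort: patient id -> set of HPO identifiers.
cohort = {
    "P1": {"HP:0001250", "HP:0000252"},
    "P2": {"HP:0001263"},
}
query = {"HP:0001250", "HP:0000252", "HP:0001263"}

for patient_id, score in rank_matches(query, cohort):
    print(patient_id, round(score, 3))
```

Set overlap treats every term as unrelated to every other; it ignores the ontology hierarchy, so two patients annotated with closely related but non-identical terms would score zero.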

High-precision phenotype concept recognition

A unique solution for high-precision phenotype concept recognition using the Human Phenotype Ontology.

The underlying technique has been built to cater for the high lexical variability associated with clinical phenotypes, as well as to decompose coordinated terms and detect non-canonical phenotypes. The concept recogniser is part of Patient Archive and serves other platforms, such as the University of Washington’s MyGene2 (https://www.mygene2.org/MyGene2/#/about) and Baylor’s OMIM Explorer (http://omimexplorer.research.bcm.edu:3838/omim_explorer/). It has also been used to generate the first HPO annotation dataset for common diseases, available via our PubMed Browser (http://pubmed-browser.human-phenotype-ontology.org/). The HPO CR package is freely available for academic use on request.
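Decomposing coordinated terms means turning a phrase such as "short fingers and toes" into one candidate per head noun before looking each up in the ontology. The sketch below is a naive illustration of that step, not the HPO CR implementation: it assumes the first word is a modifier shared by every conjunct, which is an assumption that fails for many real phrases. The function name is hypothetical.

```python
import re
from typing import List

# Split on commas (optionally followed by and/or) or on a bare and/or.
_CONJUNCTION = re.compile(r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")


def decompose_coordination(phrase: str) -> List[str]:
    """Expand 'MODIFIER A and B' into ['MODIFIER A', 'MODIFIER B'].

    Phrases without a coordination are returned unchanged.
    """
    cleaned = phrase.strip()
    parts = cleaned.split(None, 1)
    if len(parts) < 2:
        return [cleaned]
    modifier, rest = parts
    heads = [head for head in _CONJUNCTION.split(rest) if head]
    if len(heads) < 2:
        return [cleaned]
    return [f"{modifier} {head}" for head in heads]


print(decompose_coordination("short fingers and toes"))
print(decompose_coordination("global developmental delay"))
```

Each expanded phrase can then be matched independently, so a coordinated description yields two separate concept candidates rather than one unmatched string.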

Coming soon: Journal manuscript annotator – for creating structured pheno-packets

Community-driven knowledge curation

An innovative platform for curating domain knowledge, focused on intuitive and friendly user interfaces and workflow-based knowledge curation and acquisition.

The system will soon be available for public use in the context of the Orphanet data curation initiative.

Selected Publications