Filtering and analysing genomic variants

Genomic sequencing data needs to be processed through a series of manual and automated steps.


Sequencing a person's genome generates a huge amount of data.

Genome sequencing can be used to find out if people might have, or be at risk of, specific inherited conditions. Sequencing data needs to be processed and analysed to find differences (known as variants) which are relevant to the person's condition (their phenotype).

A human genome contains approximately 5 million variants to process and analyse. Most of these variants do not cause disease.

The first part of sequence data processing involves a series of automated steps to filter out low-quality data and variants that are not relevant, such as those that are unlikely to cause disease based on scientific evidence.
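The automated filtering step can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Variant` fields, the quality threshold and the gene names are hypothetical and do not come from any particular pipeline.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                  # placeholder gene name (hypothetical)
    quality: float             # hypothetical data-quality score for this variant
    unlikely_to_cause_disease: bool  # hypothetical flag based on existing evidence


# Assumed cut-off, chosen only for this example.
MIN_QUALITY = 30.0


def passes_filters(variant: Variant) -> bool:
    """Keep a variant only if it is good quality and not already ruled out."""
    if variant.quality < MIN_QUALITY:
        return False  # low-quality data is filtered out
    if variant.unlikely_to_cause_disease:
        return False  # not relevant according to existing evidence
    return True


def filter_variants(variants: list[Variant]) -> list[Variant]:
    """Return the smaller set of variants that go on to manual analysis."""
    return [v for v in variants if passes_filters(v)]


variants = [
    Variant("GENE_A", quality=55.0, unlikely_to_cause_disease=False),
    Variant("GENE_B", quality=12.0, unlikely_to_cause_disease=False),  # low quality
    Variant("GENE_C", quality=60.0, unlikely_to_cause_disease=True),   # not relevant
]

kept = filter_variants(variants)
print([v.gene for v in kept])
```

In this toy example only `GENE_A` survives both checks. A real pipeline applies many more filters to millions of variants, but the principle of discarding variants step by step is the same.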





This leaves a much smaller set of up to a few hundred filtered variants, which are then manually analysed to find the variants that are most likely to cause or contribute to the person's condition. The manual step of analysing variants is usually performed by a genomicist or genome analyst – a specialised scientist who studies the genome and its effect on health. The variants judged most likely to be relevant are then included in a report, usually for a clinician.





During this process, the genome analyst will weigh up the scientific evidence available for each particular variant to decide whether it might be affecting the person's health. Specific questions that can help with this decision include: 

Is the variant likely to cause disease? 

A variant that has been shown to change the shape or function of the corresponding gene product is much more likely to be disease-causing than one that results in no change. 

Is the variant rare in the population? 

Most genetic conditions are relatively rare. A variant that is common among healthy individuals is therefore unlikely to be the cause of a rare disease in the affected individual.

Does the person's phenotype/symptoms match? 

Phenotype refers to a person's observable characteristics – in this case, the patient's clinical signs and symptoms. If there is a record of other people who have the same variant and a similar phenotype, this is evidence that the variant could be the cause of the person's disease.
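The three questions above can be sketched as a simple evidence tally in Python. This is only an illustration: the `Candidate` fields, the rarity threshold and the one-point-per-question scoring are assumptions made for this example, and a tally like this does not replace the analyst's judgement of the evidence.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                        # placeholder variant label (hypothetical)
    alters_gene_product: bool        # shown to change shape or function?
    population_freq: float           # fraction of the population with the variant
    matching_phenotype_reports: int  # records of others with variant and similar phenotype


# Assumed cut-off for "rare", chosen only for this example.
RARE_THRESHOLD = 0.01


def evidence_score(c: Candidate) -> int:
    """Count how many of the three questions are answered 'yes'."""
    score = 0
    if c.alters_gene_product:
        score += 1  # likely to cause disease
    if c.population_freq < RARE_THRESHOLD:
        score += 1  # rare in the population
    if c.matching_phenotype_reports > 0:
        score += 1  # phenotype matches other reported cases
    return score


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from most to least supporting evidence."""
    return sorted(candidates, key=evidence_score, reverse=True)


candidates = [
    Candidate("variant_1", True, 0.0001, 3),
    Candidate("variant_2", True, 0.2, 0),
    Candidate("variant_3", False, 0.0005, 1),
]

for c in rank(candidates):
    print(c.name, evidence_score(c))
```

Here `variant_1` satisfies all three questions and is listed first, while `variant_2` is common in the population and has no matching reports, so it falls to the bottom of the list.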






Kinghorn Centre for Clinical Genomics, October 2018. 
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.