How we can better ‘mine’ our genome for information

Media Release: 04 November 2010

New sequencing technologies are showing that structural change in the genome has a much greater impact on how we interpret the epigenome – the extra layer of information above the genome – than previously thought, say Australian researchers.

When we’re born, our DNA structure is in order, normally with two copies of each gene, one from mum and one from dad. As we grow older, and particularly when we get cancer, the actual backbone of our genetic material changes. We get genetic mutations, deletions, insertions and rearrangements – sometimes of large sections of the genome.

In the burgeoning field of bioinformatics, ever-growing numbers of scientists are faced with the task of finding truth or meaning in the numbers that represent these changes.

Bioinformaticians ‘mine’ data, forming mathematical questions or algorithms that will help them interpret millions of numbers in a way that properly reflects the biology.

Dr Mark Robinson and Professor Susan Clark, from Sydney’s Garvan Institute of Medical Research, have found that the exquisite level of detail being produced by the newest sequencers demands a re-examination of existing data, as well as the development of radically new approaches. Their findings are published in Genome Research, now online.

“Our cancer genome is actually in great disarray,” said project leader Professor Clark.

“In the past, we have been able to examine chromosomes under electron microscopes and see this disorganisation. With new genome sequencing technologies, you can see what’s taking place at a much higher resolution.”

“This is telling us that mistakes are much more common than had been thought up until now, and our new data shows that these mistakes impact on our interpretation of the epigenetic layers.”

“We’re generating a huge amount of data about the genome sequence, as well as the next epigenetic layers that occur above the genetic sequence – the biochemical changes that take place affecting expression of genes.”

Those biochemical changes include the amount of ‘methylation’ (methyl groups attaching to DNA and changing how much of a gene is expressed) and numbers of ‘histone marks’ (chemical modifications to histones, the proteins that form the apparent ‘beads on a string’ in DNA).

Clark and colleagues asked the question: “Do genetic alterations actually make a difference to the way we should interpret DNA methylation or histone marks?”

The answer was a resounding “Yes”.

“Lo and behold we found that, yes indeed, many of the changes that appear to be observed as epigenetic changes are just a readout of the underlying genetic sequence alterations,” Clark observed.

“I suppose you could say sometimes the data is a mirror image of the underlying genetic sequence. If there’s more DNA, then it looks like there’s more methylation, and vice versa.”

“If a DNA sequence has been deleted, we would have interpreted that in the past as unmethylated.”

“Integrating the data provided by new technologies is proving to be an enormous challenge and the scientists need to understand what the pitfalls are. Our paper is breaking new ground in providing both genetic and epigenetic researchers with a more accurate way of interpreting and reinterpreting the minefield of complex next generation sequencing data.”
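
The copy-number effect Clark describes can be illustrated with a toy calculation. The sketch below is not the method published in Genome Research; it only assumes that a methylation signal is measured as a read count and that the number of DNA copies in each region is known. The function name, region labels and read counts are all hypothetical.

```python
def copy_number_adjusted(enriched_reads, copy_number, normal_copies=2):
    """Rescale a methylation read count to what it would be at normal copy number.

    Hypothetical helper for illustration only, not the published method.
    Returns None when the region is deleted, because with no DNA present
    the methylation state cannot be measured at all.
    """
    if copy_number == 0:
        return None
    return enriched_reads * normal_copies / copy_number


# Made-up regions: raw methylation read counts and DNA copy number.
regions = {
    "normal": {"reads": 100, "copies": 2},
    "amplified": {"reads": 200, "copies": 4},  # more DNA, so more reads
    "deleted": {"reads": 0, "copies": 0},      # no DNA, so no reads
}

for name, region in regions.items():
    adjusted = copy_number_adjusted(region["reads"], region["copies"])
    print(name, "raw:", region["reads"], "adjusted:", adjusted)
```

Read naively, the amplified region looks twice as methylated as the normal one and the deleted region looks unmethylated. Once copy number is taken into account, the amplified region matches the normal one, and the deleted region is reported as unmeasurable rather than unmethylated.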
