Drawing the correct conclusions from clusters of data

Media Release: 18 August 2010

There’s a science to analysing science, say Australian researchers, and some common methods of analysis can lead to completely incorrect study conclusions.

In the life sciences, many experiments rely on ‘clustered data’, information collected in groups. Sometimes it’s a group of patients, sometimes tissue samples, sometimes experiments undertaken at different times.

Often researchers do not recognise that they have clustered data, or do not understand the subtleties and complexities involved in analysing it correctly. As a consequence, the wrong analysis is sometimes applied and the conclusions of important studies can be wrong.

Drs Bryce Vissel and James Daniel, neuroscientists from Sydney’s Garvan Institute of Medical Research, and Dr Sally Galbraith, a statistician from the University of NSW, have published a new study of clustered data, with approaches for its correct analysis, in the Journal of Neuroscience, now online.

“We believe this paper should encourage some studies to be revisited,” said project leader Dr Vissel. “In some cases, analysis of the data by the methods we recommend could lead to different conclusions.”

“The science of statistics is interested in how close to truth you can get when you are actually dealing with random variability. Statistics tries to make sense of randomness and extract trends. If the analysis method is wrong, the conclusions may be biased.” 

“Sometimes data is inherently grouped by virtue of the way it is collected. For example, we might want to study the effect of a drug by comparing 100 patients who received the drug with 100 patients who received a sugar pill. However, if 140 of those patients were treated at Hospital X and 60 at Hospital Y, our analysis shows that you can’t simply compare drug versus sugar pill. Your analysis must reflect the fact that people were treated at different locations. Patients might also be grouped by treating physician, other complications, other medications, and so on.”
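The hospital example can be made concrete with a small simulation. The sketch below is illustrative only and is not the method from the Journal of Neuroscience paper: it swaps in a simple stratified comparison to show why the grouping matters. The split of drug and placebo patients between the two hospitals, the baseline scores and the outcome scale are all hypothetical assumptions chosen for the illustration. The drug is given no true effect at all.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical assumption: the drug has NO true effect, but patients at
# Hospital X score higher on average than patients at Hospital Y.
HOSPITAL_BASELINE = {"X": 10.0, "Y": 5.0}

# Hypothetical allocation: (hospital, arm) -> number of patients.
# Totals match the example: 100 drug, 100 placebo, 140 at X, 60 at Y.
ALLOCATION = {
    ("X", "drug"): 90, ("X", "placebo"): 50,
    ("Y", "drug"): 10, ("Y", "placebo"): 50,
}

# Each patient is (hospital, arm, outcome); outcome = baseline + noise.
patients = [
    (hospital, arm, HOSPITAL_BASELINE[hospital] + rng.gauss(0, 1))
    for (hospital, arm), n in ALLOCATION.items()
    for _ in range(n)
]


def arm_difference(rows):
    """Mean outcome on drug minus mean outcome on placebo."""
    drug = [y for _, arm, y in rows if arm == "drug"]
    placebo = [y for _, arm, y in rows if arm == "placebo"]
    return mean(drug) - mean(placebo)


# Naive analysis: pool all 200 patients and ignore the hospital.
naive = arm_difference(patients)

# Cluster-aware analysis: compare the arms within each hospital, then
# combine the per-hospital differences weighted by hospital size.
stratified = 0.0
for hospital in HOSPITAL_BASELINE:
    rows = [p for p in patients if p[0] == hospital]
    stratified += arm_difference(rows) * len(rows) / len(patients)

print(f"naive difference:      {naive:.2f}")
print(f"stratified difference: {stratified:.2f}")
```

Under these assumptions the pooled comparison reports a sizeable apparent drug effect that comes entirely from the difference between hospitals, while the within-hospital comparison stays close to zero. Stratifying is only one simple way of reflecting the grouping in the analysis.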

“We’re saying that it’s important to be absolutely clear, right from the initial design of an experiment, exactly what conclusions you are able to draw from the data you aim to collect, and what indications of bias you may need to consider.”

“Our interest in this topic came about when we were investigating the synaptic mechanisms that regulate the release of the neurotransmitter dopamine from nerve cells in the brain.”

“Dopamine helps regulate our movement, making it streamlined and steady. People with Parkinson’s Disease gradually lose dopamine producing nerve cells, leading to muscle rigidity and tremor.”

“Our goal was to describe release mechanisms very precisely, to aid the development of drugs for mimicking those mechanisms.”

“We were looking at different ‘clusters’ of information over time – different nerve cells taken from different animals, subjected to different kinds of tests and measurements.”

“It took us a long time to figure out how to reach the most reliable – and repeatable – conclusions. After struggling for some months, we came up with an appropriate and new technique for the neurosciences.”

Dr Sally Galbraith, a statistician at UNSW, was part of the team that developed the model of clustered data and investigated a range of methods for its analysis. “We believe that this study will make researchers think more about their data and the way it should be analysed,” said Dr Galbraith.

“Collection and analysis of data is typically very costly and time consuming. Scientists have an obligation, therefore, to make sure their experiments are well designed, that they collect the right data, and that they apply appropriate statistical techniques.”

ACKNOWLEDGMENTS
This research was made possible by a Project Grant under the NSW Spinal Cord Injury & Related Neurological Conditions Research Grants Program, administered by the Office for Science and Medical Research of the State Government of NSW; by a NSW State Government BioFirst Award; and by the support of Bill and Laura Gruy, Mr and Mrs Dixon, and Amadeus Energy Ltd, an oil and gas producer and explorer based in Perth, Western Australia.
