Prof John Shine's Research Discoveries

Professor Shine became world-renowned for a series of discoveries he made between 1975 and 1985 that furthered our understanding of how genes are turned into the proteins that do the work in cells. He cloned the first human hormone genes, and in the process developed sophisticated gene cloning techniques that helped transform the world of biotechnology.

Between 1985 and 1990, he worked within the biotech industry, first with California Biotechnology in San Francisco, as its Chief Scientific Officer and ultimately its President. In 1987, he set up an Australian arm in Sydney, Pacific Biotechnology. At the same time, he started to work with Garvan, becoming Executive Director in 1990.

Passion for science continued to drive John, and he set up what was to become the Neuroscience Program at Garvan. He still runs his own lab, investigating olfactory stem cells – cells, taken from the lining of the nose, that have the potential to differentiate into any kind of neuron.

To put the above in context, what follows is an explanation of some of the science and the thinking that prompted it. There are also sound files, where you can listen to John Shine tell parts of the story in his own words.

The Ribosome

To understand this story you must first understand the ribosome – a tiny structure inside cells that translates the information encoded in our genes into the proteins that carry out functions in our bodies.

When a gene is 'expressed', a copy is made of its DNA, and then released from the nucleus into the body of a cell. The copied gene is called 'messenger RNA'. Cells contain thousands of messenger RNAs at any one time, each with the potential to be converted into a specific protein by a ribosome.

The resulting protein might be a hormone, like insulin, or a blood-clotting enzyme, like thrombin.

The ribosome assembles the protein by dragging a messenger RNA through an interface at its centre, allowing each piece of genetic information to attract a specific building block, or amino acid, delivered to the interface by another molecule known as a 'transfer RNA'.

RNA consists of a string of four different kinds of molecules known as 'nucleotides': adenine (A), cytosine (C), guanine (G) and uracil (U). Genetic information is parcelled in 'codons' comprising 3 nucleotides – such as CUG or GAG. These bind strongly with their chemical complements. Adenine binds with uracil (A with U) and cytosine binds with guanine (C with G).

If the messenger RNA contains the codon GAG (specifying glutamic acid), it will bind at the ribosomal interface with the complementary 'anticodon' CUC, carried by the glutamic acid transfer RNA. That will result in a glutamic acid being added to the protein chain. If the next messenger RNA codon is CUG (leucine), then a transfer RNA carrying the anticodon CAG (with its attached leucine) will bind with it and transfer the amino acid leucine to the growing protein chain.
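The pairing rule above can be sketched in a few lines of Python. This is only an illustration: it assumes the usual convention that the two RNA strands pair in opposite directions, so the anticodon (written in the same direction as the codon) is the codon's complement read backwards. The function name is made up for this sketch.

```python
# Base-pairing rules from the text: A with U, C with G.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon.

    The strands pair antiparallel, so we complement each base and
    read the result in reverse order.
    """
    return "".join(PAIR[base] for base in reversed(codon))

# The two codons used as examples in the text.
for codon, amino_acid in [("GAG", "glutamic acid"), ("CUG", "leucine")]:
    print(codon, "pairs with", anticodon(codon), "->", amino_acid)
```

Run as written, this reproduces the two pairings described above: GAG with CUC, and CUG with CAG.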

The desired protein continues to be formed in this way, and when complete, the ribosome clips its end, like a pair of scissors snipping a ribbon, and sends it out into the cell to do its work.

John Shine's Honours Project - Studies on Insect Ribosomal RNA

John Shine studied molecular biology at the Australian National University (ANU) in Canberra under the guidance of an inspirational teacher, Lynn Dalgarno.

When the time came for John to do his honours project in 1971, Dalgarno suggested a topic that drew his student into the world of ribosomes.

It was already known that a ribosome is made up of two sub-units – one large, one small, with the translational interface located in-between. Each sub-unit contains its own strand of ribosomal RNA (26S in the large sub-unit, 18S in the small). This RNA was thought of as 'scaffolding' to hold together the dozens of proteins that formed the ribosome.

When extracting insect ribosomal RNA and mammalian ribosomal RNA from cells, Dalgarno had noticed that certain insects' 26S ribosomal RNA seemed to separate into two halves when heated. Thinking the break might be significant, he asked John to investigate the phenomenon further.

John undertook a comprehensive study of ribosomal RNA in a range of different insects and showed that all insect 26S ribosomal RNAs examined (with the curious exception of aphids) had a hidden break towards the centre of the molecule. These experiments were helped by the fact that the CSIRO Division of Entomology was close by and the laboratory had easy access to a variety of insects.

This work on the physico-chemical properties of insect ribosomal RNA led to a publication in the Journal of Molecular Biology (Shine and Dalgarno, 75, 57-72, 1973).

John Shine's PhD Project - Sequence Studies on the 3'-ends of Ribosomal RNA

During John Shine's Honours year, John Hunt, a past colleague of Lynn Dalgarno's from London, arrived for a short sabbatical from the University of Hawaii. He showed John Shine his technique for labelling the 3'-terminus of RNA molecules with radioisotopes and how to use step-wise degradation of nucleotides to determine the end sequence of the RNA. As a PhD student, John used the Hunt procedure to determine the sequence at the ends of the 18S small ribosomal RNAs from Drosophila and yeast.

In those days, sequencing RNA was a laborious procedure, and it took a week or so to determine the sequence of only a few nucleotides. John found that the sequence for both Drosophila and yeast was GAUCAUUA-3'OH. This sequence was identical to the sequence for rabbit ribosomal RNA previously shown by Hunt. The fact that such widely different organisms had identical sequences was surprising and suggested that this part of the ribosomal RNA may have an important function. But it was not at all obvious what this function might be.

However, it came to Lynn Dalgarno that this conserved sequence contained the complement of each of the three termination triplets UAA, UAG and UGA, which meant that base pairing may occur between ribosomal RNA and messenger RNA to facilitate the termination of protein synthesis. This was a eureka moment for Lynn and John, and led to the publication of a paper in Nature which first proposed a direct base-pairing role for rRNA in protein synthesis.
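Lynn's observation can be checked with a short script. The sketch below slides each stop codon along the conserved sequence and reports where it could pair. Two things here are assumptions rather than part of the original account: the strands are taken to pair in opposite directions, and a weaker G-U 'wobble' pair is optionally allowed alongside the standard A-U and G-C pairs. All names are illustrative.

```python
# Standard base pairs, plus an optional weaker G-U "wobble" pair (an assumption).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = WATSON_CRICK | {("G", "U"), ("U", "G")}

RRNA_3_END = "GAUCAUUA"          # conserved 3' end quoted in the text
STOP_CODONS = ["UAA", "UAG", "UGA"]

def pairing_sites(codon: str, rrna: str, allowed: set) -> list:
    """Return start positions in `rrna` where `codon` can pair antiparallel."""
    sites = []
    for start in range(len(rrna) - len(codon) + 1):
        window = rrna[start:start + len(codon)]
        # Antiparallel pairing: first codon base faces the last window base.
        if all((c, w) in allowed for c, w in zip(codon, reversed(window))):
            sites.append(start)
    return sites

for codon in STOP_CODONS:
    strict = pairing_sites(codon, RRNA_3_END, WATSON_CRICK)
    relaxed = pairing_sites(codon, RRNA_3_END, WOBBLE)
    print(codon, "strict:", strict, "with wobble:", relaxed)
```

Under these assumptions UAA and UGA each find a perfect partner (UUA and UCA respectively), while UAG pairs with UUA only if the G-U wobble is allowed.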

For John and Lynn, there was now a massive change in their thinking on how protein synthesis might occur. Up to this point ribosomal RNA had been thought of solely as a structural support for a number of important proteins.

Bacteria Do It Differently

John then turned to sequencing bacterial ribosomal RNAs. In the realm of termination, would they behave in the same way as the ribosomal RNAs of higher life forms?

He found that, unlike human and insect 18S ribosomal RNA, the bacterial equivalent did not end with the genetic sequence GAUCAUUA. Instead, it inserted a CCUCC before the final UUA.

Scientists knew at the time that bacteria exhibited a phenomenon known as 'suppression', an efficient way of using limited DNA sequences to generate additional proteins. Although bacteria recognized the same termination codons as all other life forms - UAA, UAG and UGA - they could 'read through' these stop codons in some instances to produce extended proteins with different functions.

The efficiency of termination depended on which termination codon the bacteria employed for a particular protein. If UAA, there was 100% termination - because UAA is the perfect complement to UUA at the 3' end of ribosomal RNA. UAG will also bind with UUA, but with weaker affinity, resulting in termination of around 90% efficiency. Weaker still is UGA, which pairs with the UCA upstream of CCUCC around 70% of the time.

So termination of protein synthesis in bacteria, while not identical, had similarities to termination in mammals and insects, its differences consistent with the demands of suppression. John turned his attention to the extra CCUCC sequence, which was perplexing. In his words, "it seemed as if, in bacteria, this extra bit had been stuck in the middle of what was a wholly conserved thing".

While he could see that CCUCC was clearly important in some way, it took a lateral leap for him to understand its true significance.

Then it dawned on him – initiation and termination of protein synthesis both occur in the same region of the ribosome! CCUCC in bacterial ribosomal RNA probably had little to do with termination. Instead, it probably determined the efficiency of initiation – how bacteria decided when to start making a protein, and how much of that protein to make.

Shine-Dalgarno Sequence

"In the fields of observation, chance favours only the prepared mind."
Louis Pasteur

In humans and insects, one gene produces one messenger RNA at a time, which in turn produces one protein at a time. In bacteria, messenger RNA encodes many genes, with multiple AUGs (start codons) appearing along its length.

So how on earth does a bacterial ribosome recognise where to bind and start making proteins when faced with a messenger RNA containing many genes – that is, many AUGs, or start codons?

Not only that, AUG codes for methionine (a common amino acid) and so regularly appears within a gene. This creates the potential for many false start points along a string of messenger RNA holding both AUGs as start codons and AUGs as internal methionines.

What other scientists had done in an attempt to answer this question was start protein synthesis in a test tube by isolating ribosomes and messenger RNA from bacteria, along with transfer RNAs for methionine alone.

By undertaking this experiment, several research groups had been able to sequence tiny sections of messenger RNA (around 20 nucleotides long) near various AUGs – where ribosomes had taken only the very first step in building a protein.

Around a dozen sequences had been published, but so far no-one had been able to make much sense of them. When John realised that CCUCC might have something to do with the initiation process, he raced to look at the published sequences. This was truly a case of chance favouring a prepared mind.

John knew exactly what he was looking for. He realised that if he could find GGAGG – the complement of CCUCC – in the initiation sequences, he might have cracked a code. And sure enough, he found GGAGG, or else part of the sequence, appearing repeatedly slightly upstream from a number of AUGs.

"This was the Shine-Dalgarno sequence. Nothing to do with termination... initiation was the big deal! Because how do you regulate gene expression? How do you make ten of these versus one of those, versus six of those? But I could immediately see that there was an amazing correlation between the amount of protein that was made - you know, how well the ribosome initiated synthesis - and the complementarity between that [CCUCC] and that [the initiation sequence]," said John.

"So if you had GGAGG - the full complement – that was one of your most highly expressed genes. If you just had AGG or GGA, three of the five, it [the gene] was expressed, but it was one of the minor ones. If you had something in between, you had something in between. I could make a perfect correlation, and that's what got everyone excited."

GGAGG became known as the Shine-Dalgarno sequence after the finding was published in the journal Proceedings of the National Academy of Sciences (PNAS) in 1974. It predicted whether or not a protein would be expressed in bacteria, and with what efficiency it would be expressed. A much more comprehensive discussion of the Shine-Dalgarno sequence was later published in Nature.

The finding was a landmark: today's sophisticated genome annotation software still uses the sequence to help identify start points of protein synthesis.
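The kind of search John carried out by eye can be sketched as a simple scan. The example below finds each AUG in a messenger RNA, looks a short distance upstream, and reports the longest stretch of GGAGG it can find there. The toy messenger RNA, the 15-nucleotide window and the function names are all assumptions made for illustration; this is not how real annotation software is written.

```python
SD_CORE = "GGAGG"  # complement of CCUCC, as described in the text

def longest_sd_match(upstream: str, core: str = SD_CORE) -> str:
    """Return the longest piece of `core` that occurs in `upstream`."""
    for length in range(len(core), 0, -1):
        for start in range(len(core) - length + 1):
            piece = core[start:start + length]
            if piece in upstream:
                return piece
    return ""

def score_start_codons(mrna: str, window: int = 15) -> list:
    """For every AUG, report (position, best GGAGG match just upstream)."""
    results = []
    position = mrna.find("AUG")
    while position != -1:
        upstream = mrna[max(0, position - window):position]
        results.append((position, longest_sd_match(upstream)))
        position = mrna.find("AUG", position + 1)
    return results

# Hypothetical messenger RNA: the first AUG has GGAGG upstream, the second does not.
toy_mrna = "CCAGGAGGUUUAAUAUGCCCUUUCCCUUUCCCUUUAUGCCC"
for position, match in score_start_codons(toy_mrna):
    print("AUG at", position, "best upstream match:", repr(match))
```

In John's account, a full GGAGG match went with the most highly expressed genes and a three-letter match such as AGG or GGA with the minor ones, so the length of the match returned here is a rough stand-in for that ranking.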

Post-Doctoral Work – Cloning of Human Genes and Recombinant DNA

In 1975, John Shine went to the University of California, San Francisco (UCSF) to do post-doctoral work in the newly emerging fields of restriction enzymes and recombinant DNA. Over the coming two and a half years, he would develop many gene cloning techniques and would clone the first human hormone genes.

Restriction Enzymes

Produced by bacteria to break down DNA, restriction enzymes are proteins that act as biochemical scissors. They recognise very specific DNA sequences, severing DNA molecules at those precise locations, or ‘restriction sites’.

Restriction enzymes are thought to have evolved to protect bacteria against invading viruses, and bacteria use them in much the same way as humans use antibodies. When the enzyme spots its target restriction site in the DNA of a virus – say GAATTC – it cuts both strands of the DNA at that site, disabling the virus.

To protect their own DNA from being chopped up, bacteria attach methyl groups to their own restriction sites. This 'methylation' also protects the bacteria while they are replicating.

Restriction enzymes allow researchers to isolate gene-containing fragments from DNA - human, butterfly, bacterial or whatever – in a very targeted way. They can then recombine those isolated fragments with other DNA molecules, including DNA from other species, producing 'recombinant DNA'.
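The 'biochemical scissors' idea is easy to mimic in code. The sketch below cuts one strand of a DNA sequence at every GAATTC site. It assumes the cut falls just after the G, which is how the well-known enzyme EcoRI treats this site; the text does not name the enzyme, and the function name and example sequence are invented for illustration.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Split `dna` at every occurrence of `site`.

    `cut_offset` is how many bases into the site the cut falls
    (1 means between the G and the first A, an assumption based on EcoRI).
    """
    fragments = []
    previous_cut = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[previous_cut:])  # whatever is left after the last cut
    return fragments

# Hypothetical sequence with two restriction sites.
print(digest("AAGAATTCCCGAATTCTT"))
```

Each internal fragment now begins with AATT, the stretch that becomes the single-stranded overhang once the opposite strand is cut in the mirror-image position.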

Plasmids and 'Sticky Ends'

Plasmids, small self-replicating fragments of non-chromosomal DNA found in bacteria, have proven to be very useful tools for genetic researchers. Human genes can be isolated by restriction enzymes, then merged with plasmids and cloned in bacteria.

Some restriction enzymes cut their target code in a particular way, leaving it with a piece of single-stranded DNA that juts out at each side, little jagged overhangs that are known as 'sticky ends'.

Sticky ends are particularly attracted to their complements, other sticky ends cut by the same restriction enzyme.

So when a restriction enzyme is used to chop human DNA containing a useful gene, as well as plasmid DNA, the sticky ends of the resulting genetic material mesh together firmly, forming many DNA strands. These will self-replicate many times inside bacteria.

Cloning Human Genes

When John Shine arrived in San Francisco, gene cloning was still a dream, and three groups were in a furious race to realise it first: the lab John joined at UCSF, a lab in Boston and the biotech company Genentech.

Scientists were facing a problem akin to finding, extracting and purifying an ounce of pure gold from a mountain the size of Everest. Where to begin? How to search? Which tools to use?

"The cloning of the genes...compared to the Shine-Dalgarno sequence understanding of protein synthesis, was a whole different thing and was a much bigger thing in many ways," said John.

"The cloning of the gene was a whole new paradigm... up until then, everyone knew you had 3.2 billion GATCs in a human or animal cell, but how you would find the insulin gene, or the growth hormone gene... about a thousand [nucleotides] long – out of 3.2 billion? It was a dream, it was impossible, [it was only the] discovery of restriction enzymes that let that field happen."

At first glance, it seemed as if he might have to chop up the whole genome into gene-sized chunks with restriction enzymes, join those chunks with plasmids, then sift through hundreds of thousands of bacterial colonies, trying to find the gene he wanted. But that would be like dismantling a mountain with a bucket and spade – not an option.

Instead, John and two other postdocs from Germany – Peter Seeburg and Axel Ullrich – chose to work on cells that naturally expressed specific genes at high levels – such as pituitary cells, which produce growth hormone, and beta cells in the pancreas, which produce insulin. High levels of messenger RNAs in these cells (around 5% of the total number of RNAs produced by the cell) would code for growth hormone or insulin respectively.

In that scenario, the first task for insulin was to extract messenger RNA from beta cells, in itself a big experimental step in those days as beta cells are small, their supply was limited and RNA is extremely fragile.

The next task was to use another newly discovered class of enzyme known as a 'reverse transcriptase' to copy all the extracted messenger RNA back into DNA. The resulting mixture included DNA copies of all the RNA found in beta cells – and of that, 5% was likely to be insulin.
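What reverse transcriptase does to a messenger RNA can be written as a one-line transformation. The sketch below produces the first DNA strand copied from an RNA template, assuming the usual convention that the new strand is complementary to the template and runs in the opposite direction. The function name and the short example sequence are illustrative only.

```python
# RNA base -> complementary DNA base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(mrna: str) -> str:
    """Return the DNA strand (written 5'->3') copied from an mRNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

# Hypothetical six-letter messenger RNA.
print(reverse_transcribe("AUGGCC"))
```

In the laboratory a second DNA strand is then made from the first, giving a double-stranded copy of the original message that can be joined to a plasmid.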

As human genes do not come supplied with sticky ends, these had to be generated artificially by attaching engineered ‘DNA linkers’ to each end of the genes. These linkers were sequences – GAATTC – that could be recognised and cut by a specific class of restriction enzymes.

The genes engineered with sticky ends were then combined with plasmids cut with the same restriction enzyme. In order to ensure that every scrap of the precious and laboriously come by DNA was taken up, vast quantities of plasmid were used.

This resulted in a colossal mixture of colonies that contained thousands of cloned genes (an unidentified percentage of which were insulin). It also included hundreds of thousands of clones of plasmids that had formed circles by joining up with their own sticky ends. This latter issue constituted a very big problem – how to get rid of these empty plasmids to narrow the screening base?

Reducing the Mountain to More of a Molehill

Of all the discoveries John made, the one he says gave him the greatest personal satisfaction was the solution he came up with to reduce the sheer volume of material to screen.

As it happened, because of the way the plasmids were cut, one sticky end contained a phosphate group and one sticky end contained a hydroxyl group. These molecules are highly reactive with each other, and in the normal course of recombination, they stick together.

John realised that if he treated the plasmids with an enzyme called phosphatase, it would remove the phosphate molecule from one sticky end, making it now impossible for the plasmid to link up with itself and form an empty circle.
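The logic of the phosphatase trick can be captured in a toy model. The sketch below is a deliberate simplification and rests on two assumptions not spelled out in the text: every sticky end is represented by its overhang plus a flag saying whether its phosphate is still attached, and two ends can be joined only if their overhangs are complementary and at least one of them still carries a phosphate. The class and function names are invented for illustration.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

@dataclass(frozen=True)
class StickyEnd:
    overhang: str        # single-stranded overhang, written 5'->3'
    has_phosphate: bool  # False after phosphatase treatment

def reverse_complement(dna: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(dna))

def can_join(a: StickyEnd, b: StickyEnd) -> bool:
    """Toy rule: overhangs must pair, and one end must supply a phosphate."""
    overhangs_match = a.overhang == reverse_complement(b.overhang)
    return overhangs_match and (a.has_phosphate or b.has_phosphate)

untreated_plasmid = StickyEnd("AATT", has_phosphate=True)
treated_plasmid = StickyEnd("AATT", has_phosphate=False)
gene_insert = StickyEnd("AATT", has_phosphate=True)

print(can_join(untreated_plasmid, untreated_plasmid))  # plasmid closing on itself
print(can_join(treated_plasmid, treated_plasmid))      # same, after phosphatase
print(can_join(treated_plasmid, gene_insert))          # treated plasmid plus gene
```

In this model the treated plasmid can no longer close up on itself, but it can still accept a gene fragment, because the fragment brings its own phosphate. That is the property that removes the empty circles from the screen.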

This one piece of lateral thinking made the numbers workable. Owing to the new Maxam-Gilbert sequencing method available at the time, John was able to isolate recombinant plasmids from bacterial colonies – pure bits of DNA – and treat them in various ways to determine their sequence.

In comparison with similar experiments over the previous year, this method using phosphatase was very fast. It took him about a week to find the cloned insulin gene.

The team was the first to clone the insulin gene, growth hormone gene and placental lactogen gene. John cloned human genes well before other research groups doing similar work.

Modifying a Cloned Gene to Make it Biologically Active in Bacteria

While John had been cloning genes using the technique described above, scientists working for Genentech had approached the problem from a completely different perspective.

Genentech believed it would be too difficult to clone the natural insulin gene. As the insulin protein sequence was known, they thought that the quickest way to make human insulin would be to synthesise bits of DNA chemically, then join them together to make an artificial human insulin gene, which would code for insulin.
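Working backwards from a known protein sequence is possible because most amino acids can be spelled by more than one codon, so the designer has a choice. The sketch below builds DNA for a tiny, made-up protein and shows that a different codon choice gives the same protein while avoiding a stretch that resembles the Shine-Dalgarno sequence. The three-entry codon table is a small excerpt of the standard genetic code; the protein, the function names and the 'avoid' option are assumptions for illustration.

```python
# Excerpt of the standard genetic code (DNA codons), amino acid -> codons.
SYNONYMS = {
    "M": ["ATG"],          # methionine (start)
    "G": ["GGA", "GGC"],   # two of the four glycine codons
    "*": ["TAA"],          # stop
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}

def back_translate(protein: str, avoid: frozenset = frozenset()) -> str:
    """Write DNA for `protein`, taking the first codon not listed in `avoid`."""
    dna = []
    for amino_acid in protein:
        choices = [c for c in SYNONYMS[amino_acid] if c not in avoid]
        dna.append(choices[0])
    return "".join(dna)

def translate(dna: str) -> str:
    """Read DNA three letters at a time and return the protein it encodes."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))

protein = "MGG*"  # hypothetical toy protein
first_try = back_translate(protein)
recoded = back_translate(protein, avoid=frozenset({"GGA"}))

for gene in (first_try, recoded):
    print(gene, "->", translate(gene), "| contains GGAGG:", "GGAGG" in gene)
```

Both versions encode the same protein, but only the first contains GGAGG inside the gene, where a bacterial ribosome might mistake it for a signal to start.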

"Now they were wrong, but they were very fortunate," said John.

"They were wrong because as it turned out, we developed techniques to clone genes that were pretty good and we cloned the insulin gene about a year before them."

"They were fortunate because the approach they had taken gave them the technical ability to achieve their aims in two ways. First, they were able to synthesise an insulin gene from scratch with a bacterial codon preference, minimising any Shine-Dalgarno problems. Second, they had the ability to take the normal insulin gene and make mutations in any place they could foresee potential translational problems."

"For example, if they had a glycine encoded by GGA and they still wanted glycine but they didn't want GGA, they'd just make it GGC."

"So at the end of the day, they won the race to make commercially large amounts of human insulin because they had the technology that allowed them to manipulate the DNA sequence, but still retain the protein sequence. Although the codon sequence is universal – GTG means leucine no matter how you look at it – there are regulatory bits that bacteria use which aren’t in the human sequence. So they had to make it a bacterial gene coding for human insulin. And, of course, that's where all the insulin comes from."

Setting up The Centre for Recombinant DNA Research in Canberra

John came back to Australia in 1978 and set up The Centre for Recombinant DNA Research at the ANU's Research School of Biological Sciences. Over the next few years, he collaborated with scientists around Australia in using newly developed gene-cloning techniques.

He worked with the Howard Florey Institute in Melbourne to clone relaxin (a hormone produced during pregnancy); with the CSIRO and other plant researchers to clone genes involved in nitrogen fixation in plants; and in his own lab at ANU cloned and expressed the endorphin gene, which was the first human hormone from a cloned gene expressed in bacteria proven to be biologically active.

John then returned to San Francisco in 1983, again to UCSF, and to work as Director of Research at start-up biotech company California Biotechnology, or CalBio.

"We grew the company from a half a dozen scientists to a couple of hundred over two years. I was eventually President and Chief Scientific Officer of the company – until I decided in 1987 to come back to Australia to an opportunity at Garvan," he said.

Listen to John describing the medical research landscape in Australia in 1990 when he became Director of Garvan, and what he believes his major achievements are since then.

John Shine became Director of The Garvan Institute of Medical Research in 1990, when gene cloning and recombinant DNA were starting to have a big impact in medical research. Since then he has held numerous influential scientific advisory roles, including Chairman of the National Health and Medical Research Council (NHMRC) from 2003-2006 and Vice President (Biological Sciences) of the Australian Academy of Science from 2002-2007.