Thursday, November 5, 2009

Fragmentary vs non-fragmentary LGT

From a good number of published analyses, we have learned that LGT (or for the Americans, HGT) is extensive and common among the prokaryotes, in both Archaea and Bacteria. The most common approach for studying LGT is the phylogenetic approach, in which LGT is inferred if a gene tree shows a topology that is incongruent with a reference species tree. The assumption? Genes are transferred as a whole.
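To make the idea of topological incongruence concrete, here is a minimal sketch in Python. It assumes rooted toy trees written as nested tuples of taxon names, and it simply compares the sets of clades in the two trees. The function names and the tree representation are my own illustration, not the pipeline used in any of the studies mentioned here; real analyses work with unrooted bipartitions and take statistical support into account.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a rooted tree.

    The tree is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D")).
    This representation is an assumption made for illustration only.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        # A clade is the union of the leaves below an internal node.
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def is_incongruent(gene_tree, species_tree):
    """True if the two trees do not imply exactly the same clades."""
    return clades(gene_tree) != clades(species_tree)


species_tree = (("A", "B"), ("C", "D"))
gene_tree = (("A", "C"), ("B", "D"))  # A groups with C: a candidate LGT signal
print(is_incongruent(gene_tree, species_tree))
```

Note that the comparison ignores branch order, so (("C", "D"), ("B", "A")) counts as congruent with the species tree above.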

Imagine an LGT event involving a small fragment of a gene, say 30% of its full length. In this instance, a typical phylogenetic approach would probably not detect the recombined region, because the reticulated phylogenetic signal is not as strong as in the case of, say, LGT involving 85% or the full length of the gene. Therefore, the extent of LGT estimated under the "whole-gene transfer" assumption in the common practice of phylogenetics is likely to be an underestimate.

So how much are we missing out here? That's pretty much what my PhD thesis is about. In my most recent paper, which appeared online yesterday in Genome Biology and Evolution, "Lateral transfer of genes and gene fragments in prokaryotes" (advance access Nov 4, doi:10.1093/gbe/evp044), Rob, Aaron, Mark and I describe the extent of LGT in prokaryotes, relaxing the "whole-gene transfer" assumption.

In this analysis, closely related to our previous paper in PLoS ONE in which we introduced the terms domon and nomon, I divided the gene sets into the so-called ORB+ and ORB- sets. I use the acronym ORB to represent "observable recombination breakpoint", so the ORB+ sets are simply gene sets within which a recombination breakpoint is detected, and the ORB- sets are gene sets showing a history of LGT within which no recombination breakpoint can be found. The former represent lateral transfer of gene fragments, and the latter represent lateral transfer of whole genes and beyond.
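The ORB+/ORB- split boils down to a simple decision rule, sketched below in Python. The GeneSet record and its fields are hypothetical names made up for this illustration; in the actual analysis the breakpoints and the LGT signal come from dedicated inference methods, not from pre-filled flags.

```python
from dataclasses import dataclass


@dataclass
class GeneSet:
    name: str
    shows_lgt: bool   # hypothetical flag: the set shows evidence of an LGT history
    breakpoints: int  # hypothetical count of recombination breakpoints detected


def classify(gene_set):
    """Label a gene set following the ORB+/ORB- definitions given above."""
    if gene_set.breakpoints > 0:
        return "ORB+"  # breakpoint detected: transfer of a gene fragment
    if gene_set.shows_lgt:
        return "ORB-"  # LGT history, no breakpoint: whole gene or beyond
    return "no LGT detected"


for gs in [GeneSet("set1", True, 2), GeneSet("set2", True, 0), GeneSet("set3", False, 0)]:
    print(gs.name, classify(gs))
```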

It seems there are more ORB+ sets than ORB- sets, suggesting that the "whole-gene transfer" assumption seriously underestimates the extent of LGT in phylogenetic studies. Complementing the analysis of LGT and domon boundaries, this work represents the "best-practice examination of within-gene LGT, given current datasets and inference methods".

However, the results from this work should be interpreted with extreme caution, as outlined in the paper. "Our findings describe the propensity of a gene set to have suffered LGT, not the number of LGT events, therefore our result must be interpreted strictly against that definition. ..." Besides, we only focused on single-copy gene families in this work, avoiding the complication of paralogy in the inference of LGT (xenology). Therefore the number of gene sets showing LGT history in this work is still a conservative estimate.

I think this is my best work yet, and it has received rather favourable comments from the reviewers. Reviewer 1 thinks this work "is sound, informative and the authors are careful with their conclusions", whereas Reviewer 2 wrote, "the article by Chan et al analyzes and important and timely topic: to what extent is the gene the most useful unit to analyze gene transfer events. ... The findings are novel and interesting... The article makes a useful contribution..."

I hope you'll enjoy reading the paper as much as we enjoyed preparing it. Cheers!


Friday, March 27, 2009

Microbiol, Mol Biol, and Evol

I came across a really nice story that I thought I would share with you. It is the story of the bittersweet triangular relationship among microbiology, molecular biology and evolution, and of how the nature of the evolutionary process was neglected and 'mishandled' by 20th-century biology.

The legendary Carl Woese and his co-author Nigel Goldenfeld, in a commentary that recently appeared in the March issue of Microbiology and Molecular Biology Reviews [MMBR 2009, 73(1):14-21], have a very interesting story to tell.

I would love to share with you some quotes from them, which I find rather refreshing and entertaining.

"The problem (the negligence of the nature of evolutionary process) has suffered the indignity of being dismissed as unimportant to a basic understanding of biology by molecular biology; it went effectively unrecognized by a microbiology still in the throes of trying to find itself; and it became the private domain of a quasi-scientific movement, who secreted it away in a morass of petty scholasticism, effectively disguising the fact that their primary concern with it was ideological, not scientific."

Drawing an analogy with physics (from Einstein's theory of relativity to quantum theory), they describe how different fields of science converged and spurred the development of modern science, and relate that 20th-century history to biology today. In telling the story of how the field of molecular evolution gained momentum in biological research, the authors include a couple of excerpts of correspondence between Woese and Francis Crick (yes, one of the DNA dudes) from the 1960s, which I find rather interesting.

Furthermore, they put it nicely: "biology is a study, not in being, but in becoming", and evolution has been "the sick man of science" throughout the last century. A very intriguing analogy. I find it hard to disagree with their view that molecular biology has treated the evolutionary process not as a scientific problem, but rather as a biological epiphenomenology, or "historical accident".

One of my favourite quotes that they cited from Alfred North Whitehead is "A science which hesitates to forget its founders is lost". Indeed.

I like how they summed up the story:

"Microbiology has reached a dead end in its uninspired search for a proper, 'natural' taxonomy (which it desperately needed). Molecular biology was at a dead end (but didn't know it) in its attempt to understand the gene (having failed completely with the problem of 'gene expression'). Few appreciated that both microbiology's foundational issue and molecular biology's conceptual failure resulted from the inability to see that an evolutionary conceptualization was required to resolve them."

I totally enjoyed reading this article, and I'd recommend it to anyone who might be interested. The title, "How the microbial world saved evolution from the Scylla of molecular biology and the Charybdis of the modern synthesis", is attractive enough for me :).

Here is the opening quote in the article, by J. Robert Oppenheimer (in his book The Open Mind):

"There must be no barriers for freedom of inquiry. There is no place for dogma in science. The scientist is free, and must be free to ask any question, to doubt any assertion, to seek for any evidence, to correct any errors."

Sunday, February 22, 2009

Frustrated post-doc crossing


Speaking of transfer or crossover, one of the PhD comics recently caught my attention. I wonder if these road signs really exist. If yes, where?

Absent-minded professors are everywhere, including some with super-sized egos, that's for sure.

Undergrads on bikes, it might as well be "drunk undergrads".

A "gaggle" of grad students, how ironic as procrastination plays such a huge part in a grad student's life, lol.

But my favourite is definitely the "frustrated post-doc x-ing" sign. I think "frustrated" is a very appropriate expression for post-docs. There's always so much work for so little pay, so much politics for so little gain, so much pressure for so little time.

It is rather demeaning when a person with a doctoral degree (after spending years of effort on it) earns less than some fresh graduates working in entry-level positions at some companies. This is especially true in America. Postdocs in the USA are certainly underpaid. I'm sure the NIH knows about it, and I hope they will revise the postdoc salary scale soon.

A postdoc's salary is not something you can measure by the hours you spend working each week. Sigh. If only my J1 visa allowed me to take on a part-time job and earn some extra income. Then again, I am supposed to spend 24/7 making my boss rich and famous, right?

Friday, February 20, 2009

Are protein domains modules of LGT?

Lateral genetic transfer (LGT, or horizontal genetic transfer, HGT) has been studied extensively among the prokaryotes (e.g., Beiko et al., 2005, PNAS), and there are more and more published studies of LGT among the eukaryotes, or between prokaryotes and eukaryotes. A recent work published in PNAS, for example, demonstrated a very intriguing LGT phenomenon between a sea slug and the alga that it feeds on (Rumpho et al., 2008, PNAS).

Most of the published studies are based on the implicit assumption that genes are transferred as a whole. But we know that genetic material can also be transferred in fragments, or in clusters of a few genes. Up till yesterday, there was no published study that examined the units of LGT in a rigorous manner. But today, there is one: Chan et al., 2009, "Are protein domains modules of lateral genetic transfer?" PLoS ONE, 4(2): e4524. (http://dx.plos.org/10.1371/journal.pone.0004524).

This paper represents the highlight of my PhD work at UQ with Mark Ragan, Rob Beiko and Aaron Darling. Using data from 144 prokaryote genomes, we tried to answer the big question of whether protein domains are modules of LGT. When genes are transferred in fragments, do these fragments correspond to protein domains (the structurally compact regions of proteins)? Surprisingly, or not surprisingly to some, we found no evidence that such a correlation exists. Protein domains are units of function, but not modules of LGT.

In this work, we also coined two new terms:

(a) domon, to describe a gene (exon) region that encodes a protein domain, and

(b) nomon, to describe a gene (exon) region that encodes part of a protein not recognized as a domain.
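To illustrate the two terms, here is a small Python sketch that splits a coding sequence into domon and nomon regions, given the positions of annotated domains on the protein. It rests on assumptions of my own: an uninterrupted coding sequence, domain positions that are 1-based, inclusive and non-overlapping, and the stop codon ignored. The function name is hypothetical and this is not the code used in the paper.

```python
def partition_gene(protein_length, domains):
    """Split a coding sequence into domon and nomon regions.

    protein_length: length of the protein in amino acids.
    domains: list of (start_aa, end_aa) pairs, 1-based inclusive,
             assumed non-overlapping.
    Returns (label, start_nt, end_nt) tuples in 1-based inclusive
    nucleotide coordinates; the stop codon is not counted.
    """
    regions = []
    next_aa = 1  # first amino acid not yet assigned to a region
    for start_aa, end_aa in sorted(domains):
        if start_aa > next_aa:
            # Stretch before this domain that is not recognized as a domain.
            regions.append(("nomon", 3 * (next_aa - 1) + 1, 3 * (start_aa - 1)))
        regions.append(("domon", 3 * (start_aa - 1) + 1, 3 * end_aa))
        next_aa = end_aa + 1
    if next_aa <= protein_length:
        regions.append(("nomon", 3 * (next_aa - 1) + 1, 3 * protein_length))
    return regions


# A 100-residue protein with one annotated domain spanning residues 11 to 50.
for region in partition_gene(100, [(11, 50)]):
    print(region)
```

With coordinates like these in hand, one can ask whether an inferred recombination breakpoint falls inside a domon, inside a nomon, or near the boundary between the two.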

These terms are handy when one tries to describe the exact physical units of genetic transfer. I have personally come across some papers in which the authors used the terms gene and protein loosely within the context of LGT, but people need to be aware that it is always the gene that is transferred in an LGT event, not the protein or other encoded gene product.

Although the results are rather negative, we hope this work provides novel insight into the physical unit of LGT, and into the process and mechanism by which LGT shapes and innovates genomes.

Your comments and criticism are very welcome!