Imagine an LGT event involving a small fragment of a gene, say 30% of its full length. In this instance, a typical phylogenetic approach would probably not detect the recombined region, because the reticulated phylogenetic signal is not as strong as in the case of, say, LGT involving 85% or the full length of the gene. Therefore, the extent of LGT estimated under the "whole-gene transfer" assumption, common in phylogenetic practice, is likely to be an underestimate.
So how much are we missing out on here? That's pretty much what my PhD thesis is about. In my most recent paper, which appeared online yesterday in Genome Biology and Evolution, "Lateral transfer of genes and gene fragments in prokaryotes" (advance access Nov 4, doi:10.1093/gbe/evp044), Rob, Aaron, Mark and I describe the extent of LGT in prokaryotes, relaxing the "whole-gene transfer" assumption.
In this analysis, which is closely related to our previous paper in PLoS ONE in which we introduced the terms domon and nomon, I divided the gene sets into the so-called ORB+ and ORB- sets. I use the acronym ORB for "observable recombination breakpoint", so the ORB+ sets are simply gene sets within which a recombination breakpoint is detected, and the ORB- sets are gene sets showing an LGT history within which no recombination breakpoint can be found. The former represent lateral transfer of gene fragments, and the latter represent lateral transfer of whole genes and beyond.
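For readers who think in code, here is a minimal sketch of that split. It is not the pipeline used in the paper: the `GeneSet` record, its `breakpoints` and `topology_conflict` fields, and the `classify` function are all hypothetical names invented for illustration, and the sketch assumes that breakpoint detection and the phylogenetic test for LGT have already been run elsewhere.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneSet:
    """Hypothetical record for one gene set (not the paper's data format)."""
    name: str
    # Alignment positions of detected recombination breakpoints (assumed input).
    breakpoints: List[int] = field(default_factory=list)
    # True if the gene-set tree conflicts with the reference tree (assumed input).
    topology_conflict: bool = False


def classify(gene_set: GeneSet) -> str:
    """Label a gene set following the ORB+/ORB- definitions in the text."""
    if gene_set.breakpoints:
        # A breakpoint inside the gene: transfer of a gene fragment.
        return "ORB+"
    if gene_set.topology_conflict:
        # LGT history but no breakpoint: transfer of a whole gene or more.
        return "ORB-"
    return "no LGT detected"


# Made-up example gene sets, for illustration only.
example_sets = [
    GeneSet("set_a", breakpoints=[120]),
    GeneSet("set_b", topology_conflict=True),
    GeneSet("set_c"),
]
labels = [classify(gs) for gs in example_sets]
tally = Counter(labels)
```

Here `tally` simply counts how many of the made-up gene sets fall into each category, and gene sets with neither a breakpoint nor a conflicting tree stay outside both ORB categories.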
It seems there are more ORB+ sets than ORB- sets, suggesting that the "whole-gene transfer" assumption leads to a serious underestimate of the extent of LGT in phylogenetic studies. Complementing the analysis of LGT and domon boundaries, this work represents the "best-practice examination of within-gene LGT, given current datasets and inference methods".
However, the results from this work should be interpreted with extreme caution, as outlined in the paper. "Our findings describe the propensity of a gene set to have suffered LGT, not the number of LGT events, therefore our result must be interpreted strictly against that definition. ..." Besides, we only focused on single-copy gene families in this work, avoiding the complication of paralogy in the inference of LGT (xenology). Therefore the number of gene sets showing LGT history in this work is still a conservative estimate.
I think this is my best work yet, and it has received rather favourable comments from the reviewers. Reviewer 1 thinks this work "is sound, informative and the authors are careful with their conclusions", whereas Reviewer 2 wrote, "the article by Chan et al analyzes and important and timely topic: to what extent is the gene the most useful unit to analyze gene transfer events. ... The findings are novel and interesting... The article makes a useful contribution..."
I hope you'll enjoy reading the paper as much as we enjoyed preparing it. Cheers!

