Seminar, 29. January 2015, Marc Hellmuth

29. January 2015, 16:15
Ernst-Abbe-Platz 2, seminar room 3423

Phylogenomics with Paralogs

Dr. Marc Hellmuth
(Center for Bioinformatics, Saarland University)

Phylogenomics relies heavily on well-curated sequence data sets that consist, for each gene, exclusively of 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can reduce the noise sufficiently to find correct event-annotated gene trees. The information in these gene trees can then be translated directly into constraints on the species trees. While the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer.
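
To make the role of cographs in the abstract more concrete: a graph is a cograph exactly when it contains no induced path on four vertices (a P4), so an estimated orthology graph that contains such a path has to be edited before it can correspond to an event-annotated gene tree. The following is only a minimal brute-force sketch of that check, not the method presented in the talk; the function names and the gene labels a, b, c, d are hypothetical and chosen for illustration.

```python
from itertools import permutations


def find_induced_p4(vertices, edges):
    """Return an induced path (a, b, c, d) if one exists, else None.

    Brute force over all ordered 4-tuples, so only suitable for tiny graphs.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    for a, b, c, d in permutations(vertices, 4):
        is_path = b in adj[a] and c in adj[b] and d in adj[c]
        no_chords = (c not in adj[a] and d not in adj[a]
                     and d not in adj[b])
        if is_path and no_chords:
            return (a, b, c, d)
    return None


def is_cograph(vertices, edges):
    """A graph is a cograph iff it has no induced P4."""
    return find_induced_p4(vertices, edges) is None


# Hypothetical genes; an edge means "estimated to be orthologs".
genes = ["a", "b", "c", "d"]
noisy_estimate = [("a", "b"), ("b", "c"), ("c", "d")]  # the path a-b-c-d
edited_estimate = [("a", "b"), ("b", "c")]             # edge c-d removed

print(is_cograph(genes, noisy_estimate))
print(is_cograph(genes, edited_estimate))
```

Here a single edge deletion turns the noisy estimate into a cograph. Cograph editing in general asks for a smallest set of edge insertions and deletions with this effect, which the sketch above does not attempt to compute.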