Seminar, 28. June 2016, Ekaterina Shelest

28. June 2016, 16:15

Ernst-Abbe-Platz 2, seminar room 3423

Promoter-based prediction of gene clusters in eukaryotic genomes
(with application to secondary metabolite clusters)

Dr. Ekaterina Shelest
(Leibniz-Institut für Naturstoff-Forschung und Infektionsbiologie e. V., Hans-Knöll-Institut (HKI), Jena)

Genomic clustering of functionally interrelated genes is not unusual in eukaryotes. In such clusters, co-localized genes are co-regulated and often belong to the same pathway. However, biochemical details are still unknown in many cases, hence computational prediction of clusters’ structures is beneficial for understanding their functions. Yet, in silico detection of eukaryotic gene clusters (eGCs) remains a challenging task, mainly due to the high variability of the clusters’ content and lack of other distinguishing sequence features.

We suggest a novel method for eGC detection based on cluster-specific regulatory patterns. The basic idea is to differentiate cluster genes from non-cluster genes by the regulatory elements within their promoter sequences, using the density of cluster-specific motif occurrences (which is higher within the cluster region) as an additional distinguishing feature. The method searches for “islands” of enriched cluster-specific motifs in the vicinity of anchor genes. We called it CASSIS (Cluster Assignment by Islands of Sites).
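
The island idea can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not the CASSIS implementation: the function name `find_island`, the `max_gap` parameter, and the per-promoter motif counts are all hypothetical. Starting from an anchor gene, the sketch extends the island in both directions for as long as promoters carry the motif, tolerating a limited number of consecutive motif-free promoters.

```python
def find_island(motif_counts, anchor, max_gap=1):
    """Return (start, end) gene indices of the motif island around `anchor`.

    motif_counts: motif occurrences per promoter, in genomic order (hypothetical input).
    anchor:       index of the anchor gene.
    max_gap:      assumed tolerance for consecutive promoters without the motif.
    Returns None if the anchor promoter itself carries no motif.
    """
    if motif_counts[anchor] == 0:
        return None

    def extend(step):
        # Walk away from the anchor; remember the last promoter that had a motif.
        pos, edge, gap = anchor, anchor, 0
        while 0 <= pos + step < len(motif_counts):
            pos += step
            if motif_counts[pos] > 0:
                edge, gap = pos, 0
            else:
                gap += 1
                if gap > max_gap:
                    break
        return edge

    return extend(-1), extend(+1)


# Toy example: ten genes, anchor gene at index 5.
counts = [0, 0, 1, 2, 0, 3, 1, 0, 0, 1]
island = find_island(counts, anchor=5)
print(island)
```

In this toy example the single motif-free promoter at index 4 is bridged, while the two consecutive motif-free promoters at indices 7 and 8 end the island on the right.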

This approach was specifically adapted to the search for secondary metabolite gene clusters. Secondary metabolites (SMs) are structurally diverse natural products of high pharmaceutical importance. Genes involved in their biosynthesis are often organized in clusters. CASSIS was validated in a series of cross-validation experiments and showed high sensitivity and specificity. The effectiveness of the method was demonstrated by successful re-identification of functionally characterized clusters. CASSIS has been applied to the detection of as yet unknown SM gene clusters in fungal genomes.