Seminar, 12 January 2017, Sebastian Böcker

12 January 2017, 4:15 p.m.

Ernst-Abbe-Platz 2, seminar room 3423

Searching molecular structure databases with tandem MS data: SIRIUS and CSI:FingerID

Prof. Dr. Sebastian Böcker
(Chair for Bioinformatics, Institute of Computer Science, FSU Jena)

Metabolomics, the identification and quantification of small compounds, is receiving increasing interest in fields such as natural products and pharmaceutics, biomarker discovery, functional genomics, and environmental research. Structural elucidation of small compounds remains a challenging problem, in particular for compounds that cannot be found in (notoriously incomplete) spectral libraries. Searching molecular structure databases with tandem mass spectra is considered a promising strategy for elucidating such compounds. Development of computational approaches for this problem has been vibrant in recent years, with the potential to finally identify the so-called “dark matter of metabolomics”.

In my talk, I will explain current approaches to this problem and, in particular, describe recent developments of our own methods: SIRIUS (for finding molecular formulas and computing fragmentation trees) and CSI:FingerID (for searching in a molecular structure database). CSI:FingerID correctly identified 70 out of 127 compounds in the CASMI 2016 challenge (positive ion mode) when searching ChemSpider with about 35 million structures; this is more than twice as many correct identifications as the runner-up method. In November 2016, the CSI:FingerID web service analyzed 1500 compounds per day. Our tools are currently being integrated into GNPS (Wang et al., Nature Biotech 2016) and OpenMS (Röst et al., Nature Methods 2016) by their developers. Finally, I will briefly describe novel work on structural elucidation beyond CSI:FingerID, such as Deep Learning for compound category prediction.