Almost all diseases manifest themselves as changes in the expression, abundance, or signaling status of proteins. Therefore, the precise analysis of the proteome (the entirety of a biological system’s proteins) is a crucial step in the understanding, diagnosis, and treatment of diseases. Mass spectrometry-based proteomics is a powerful technique for the simultaneous analysis of thousands of proteins, fueling biomarker research and drug discovery.
Proteomic data analysis
The analysis of proteomic data relies heavily on the automated matching of acquired tandem mass spectra of peptides (fragments of proteins) to protein sequence databases. This process rests on simple assumptions, and its key concepts have remained largely unchanged since their introduction in 1993. We believe that we currently see only the tip of the iceberg. To date, only half of the data acquired from a sample can be identified using classical data analysis workflows, which costs productivity, precious samples, and opportunities.
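To make the matching idea concrete, here is a minimal Python sketch of a classical database search step, under simplifying assumptions: only singly charged b- and y-ions are considered, the score is a plain count of matched fragment masses, and the function names (`fragment_mzs`, `count_matches`, `best_match`) and the 0.02 m/z tolerance are illustrative choices, not part of any specific search engine.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O in Da
PROTON = 1.007276   # mass of a proton in Da


def fragment_mzs(peptide):
    """Return theoretical m/z values of singly charged b- and y-ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal fragment
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions + y_ions


def count_matches(observed_mzs, peptide, tol=0.02):
    """Count theoretical fragments with an observed peak within tol (hypothetical score)."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mzs)
        for theo in fragment_mzs(peptide)
    )


def best_match(observed_mzs, candidates, tol=0.02):
    """Pick the candidate peptide whose theoretical fragments match the most peaks."""
    return max(candidates, key=lambda pep: count_matches(observed_mzs, pep, tol))
```

For example, `best_match(fragment_mzs("PEPTIDE"), ["PEPTIDE", "EDITPEP", "SAMPLER"])` selects the peptide the spectrum was generated from. Note that this kind of scoring ignores peak intensities entirely, which is one of the simple assumptions classical workflows rely on.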
The power of deep learning
Recent developments in machine learning are revolutionizing all branches of research. Artificial neural networks learn to perform tasks without previously defined rule sets, based solely on annotated training data. We have learned to harness this power to predict peptide properties such as liquid chromatography retention time or fragmentation behavior inside the mass spectrometer.
Predicting peptide properties
The MSAID founders developed a generic deep learning framework called INFERYS, which learns to predict any peptide property from training data. INFERYS achieves prediction accuracy well above that of all other current approaches. The algorithm was trained on millions of mass spectra and can be adapted to all common mass spectrometers with minimal additional training. The model is universally applicable to proteins from any organism, creating huge opportunities in areas such as immunology, proteogenomics, and metaproteomics.
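One way predicted fragmentation behavior can be put to use is by comparing predicted fragment intensities with the intensities actually observed in a spectrum. The Python sketch below uses the normalized spectral angle, a similarity measure commonly used in this field; whether INFERYS uses this exact measure is not stated here, so treat it as an assumption. The function name `spectral_angle` and the example intensity vectors are illustrative.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Returns 1.0 for identical intensity patterns and 0.0 for patterns
    with no overlap. Both vectors must list the same fragment ions in
    the same order.
    """
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0 or norm_o == 0:
        return 0.0  # an empty spectrum cannot be compared meaningfully
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1 - 2 * math.acos(cosine) / math.pi


# Hypothetical intensities for four fragment ions of one peptide.
predicted = [0.10, 0.80, 0.30, 0.00]
observed = [0.12, 0.75, 0.35, 0.02]
similarity = spectral_angle(predicted, observed)
```

A high similarity supports a candidate peptide identification, while a low one suggests that the matched fragment masses may be coincidental.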
Relevant publications by members of the MSAID team
We are proud of our strong track record in the field of proteomics and will continue to research and innovate together with partners from industry and academia.
- INFERYS Rescoring: boosting peptide identifications and scoring confidence of database search results
Zolg DP, Gessulat S, Paschke P, Garber M, Rathke-Kuhnert M, Seefried F, Fitzemeier K, Berg F, Lopez-Ferrer D, Horn D, Henrich C, Huhmer A, Delanghe B, Frejno M. Rapid Commun Mass Spectrom. 2021;e9128 DOI: 10.1002/rcm.9128
- Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning
Gessulat S, Schmidt T, Zolg DP, Samaras P, Schnatbaum K, Zerweck J, Knaute T, Rechenberger J, Delanghe B, Huhmer A, Reimer U, Ehrlich HC, Aiche S, Kuster B, Wilhelm M. Nature Methods 2019 DOI: 10.1038/s41592-019-0426-7
- ProteomeTools: Systematic Characterization of 21 Post-translational Protein Modifications by Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS) Using Synthetic Peptides
Zolg DP, Wilhelm M, Schmidt T, Médard G, Zerweck J, Knaute T, Wenschuh H, Reimer U, Schnatbaum K, Kuster B. Mol Cell Proteomics 2018 DOI: 10.1074/mcp.TIR118.000783
Schmidt T, Samaras P, Frejno M, Gessulat S, Barnert M, Kienegger H, Krcmar H, Schlegl J, Ehrlich HC, Aiche S, Kuster B, Wilhelm M. Nucleic Acids Research 2018 DOI: 10.1093/nar/gkx1029
- Building ProteomeTools based on a complete synthetic human proteome
Zolg DP, Wilhelm M, Schnatbaum K, Zerweck J, Knaute T, Delanghe B, Bailey DJ, Gessulat S, Ehrlich HC, Weininger M, Yu P, Schlegl J, Kramer K, Schmidt T, Kusebauch U, Deutsch EW, Aebersold R, Moritz RL, Wenschuh H, Moehring T, Aiche S, Huhmer A, Reimer U, Kuster B. Nature Methods 2017 DOI: 10.1038/nmeth.4153
- Pharmacoproteomic characterisation of human colon and rectal cancer
Frejno M, Zenezini Chiozzi R, Wilhelm M, Koch H, Zheng R, Klaeger S, Ruprecht B, Meng C, Kramer K, Jarzab A, Heinzlmeir S, Johnstone E, Domingo E, Kerr D, Jesinghaus M, Slotta-Huspenina J, Weichert W, Knapp S, Feller SM, Kuster B. Mol Syst Biol. 2017 DOI: 10.15252/msb.20177701
- Mass-spectrometry-based draft of the human proteome
Wilhelm M, Schlegl J, Hahne H, Gholami AM, Lieberenz M, Savitski MM, Ziegler E, Butzmann L, Gessulat S, Marx H, Mathieson T, Lemeer S, Schnatbaum K, Reimer U, Wenschuh H, Mollenhauer M, Slotta-Huspenina J, Boese JH, Bantscheff M, Gerstmair A, Faerber F, Kuster B. Nature 2014 DOI: 10.1038/nature13319