Protein identification is an essential component of systems biology research. Mass spectrometry has become a standard platform for proteomics because of its accurate mass and intensity measurements. Proteins of interest are first extracted from cell lines or tissues, digested into peptides by selected enzymes, and introduced to capillary electrophoresis (CE) or liquid chromatography (LC) coupled with a mass spectrometer to detect peptide masses and their intensities in biological samples. Protein database search is a typical approach to protein identification, in which the tandem mass spectra (MS/MS) acquired from the mass spectrometer are searched against a target protein database. Proteome Discoverer (PD), developed by Thermo Fisher Scientific Inc., is a popular commercial software tool for proteomic research and is widely used by biological laboratories. Because analyzing proteomic data requires several complex techniques, Thermo Fisher Scientific Inc. provides a set of application programming interfaces (APIs) so that bioinformatics developers can add their tools to PD, enabling more comprehensive workflows. In this three-year project, we plan to add two public software tools, MSFragger and PeptideProphet (Philosopher), to PD as two processing nodes. Our goals are to: (1) provide ultrafast protein identification, (2) support two database search approaches, closed and open search, and (3) improve the open search results. The major difference between the two searches is the peptide mass tolerance: the closed search uses a small mass tolerance (e.g., 20 ppm) for accurate peptide identification, whereas the open search uses a large mass tolerance (e.g., 500 Da) to uncover potential post-translational modifications. We have implemented a prototype package, called the MSFragger-PD node, and made it publicly available to users.
According to our evaluation, the MSFragger-PD node runs 20 times faster than PD’s conventional search engine and identifies more peptides and proteins. The prototype package has been downloaded over 700 times, and we have received much positive user feedback. We believe that the MSFragger-PD node will benefit not only proteomic research but also industry.
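The difference between the closed and open search tolerances can be illustrated with a minimal sketch. It is not MSFragger's or PD's implementation: the function names, the peptide labels, and all mass values are hypothetical and chosen only to show how a 20 ppm window and a 500 Da window select different candidate peptides for the same observed precursor mass.

```python
def within_tolerance(observed_mass, theoretical_mass, tolerance, unit):
    """Return True if the observed precursor mass lies inside the window
    around a candidate peptide's theoretical mass."""
    delta = abs(observed_mass - theoretical_mass)
    if unit == "ppm":
        # A ppm window scales with the theoretical mass.
        return delta <= theoretical_mass * tolerance / 1e6
    if unit == "Da":
        # A dalton window is an absolute mass difference.
        return delta <= tolerance
    raise ValueError(f"unknown tolerance unit: {unit}")


def candidate_peptides(observed_mass, peptide_masses, tolerance, unit):
    """List the database peptides whose mass matches within the tolerance."""
    return [
        name
        for name, mass in peptide_masses.items()
        if within_tolerance(observed_mass, mass, tolerance, unit)
    ]


# Hypothetical database of peptide monoisotopic masses (Da).
database = {"PEP_A": 1500.000, "PEP_B": 1800.000, "PEP_C": 2300.000}

# Hypothetical precursor: PEP_A carrying an unspecified +79.966 Da mass shift.
observed = 1579.966

closed_hits = candidate_peptides(observed, database, 20, "ppm")
open_hits = candidate_peptides(observed, database, 500, "Da")
print("closed search candidates:", closed_hits)
print("open search candidates:", open_hits)
```

In this toy case the closed search finds no candidate for the shifted precursor, while the open search keeps PEP_A (and PEP_B) as candidates. In an open search workflow, the remaining mass difference for the best-scoring candidate is what points to a possible modification.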
Effective start/end date: 1/01/22 → 31/07/22
Keywords
- liquid chromatography coupled with mass spectrometry
- protein identification