Skip to main content

Towards automated discrimination of lipids versus peptides from full scan mass spectra.


AUTHORS

Dittwald PPiotr , Nghia VT Vu Trung , Harris GA Glenn A , Caprioli RM Richard M , Van de Plas R Raf , Laukens K Kris , Gambin A Anna , Valkenborg D Dirk . EuPA open proteomics. 2014 9 1; 4(). 87-100

ABSTRACT

Although physicochemical fractionation techniques play a crucial role in the analysis of complex mixtures, they are not necessarily the best solution to separate specific molecular classes, such as lipids and peptides. Any physical fractionation step such as, for example, those based on liquid chromatography, will introduce its own variation and noise. In this paper we investigate to what extent the high sensitivity and resolution of contemporary mass spectrometers offers viable opportunities for computational separation of signals in full scan spectra. We introduce an automatic method that can discriminate peptide from lipid peaks in full scan mass spectra, based on their isotopic properties. We systematically evaluate which features maximally contribute to a peptide versus lipid classification. The selected features are subsequently used to build a random forest classifier that enables almost perfect separation between lipid and peptide signals without requiring ion fragmentation and classical tandem MS-based identification approaches. The classifier is trained on in silico data, but is also capable of discriminating signals in real world experiments. We evaluate the influence of typical data inaccuracies of common classes of mass spectrometry instruments on the optimal set of discriminant features. Finally, the method is successfully extended towards the classification of individual lipid classes from full scan mass spectral features, based on input data defined by the Lipid Maps Consortium.


Although physicochemical fractionation techniques play a crucial role in the analysis of complex mixtures, they are not necessarily the best solution to separate specific molecular classes, such as lipids and peptides. Any physical fractionation step such as, for example, those based on liquid chromatography, will introduce its own variation and noise. In this paper we investigate to what extent the high sensitivity and resolution of contemporary mass spectrometers offers viable opportunities for computational separation of signals in full scan spectra. We introduce an automatic method that can discriminate peptide from lipid peaks in full scan mass spectra, based on their isotopic properties. We systematically evaluate which features maximally contribute to a peptide versus lipid classification. The selected features are subsequently used to build a random forest classifier that enables almost perfect separation between lipid and peptide signals without requiring ion fragmentation and classical tandem MS-based identification approaches. The classifier is trained on in silico data, but is also capable of discriminating signals in real world experiments. We evaluate the influence of typical data inaccuracies of common classes of mass spectrometry instruments on the optimal set of discriminant features. Finally, the method is successfully extended towards the classification of individual lipid classes from full scan mass spectral features, based on input data defined by the Lipid Maps Consortium.