In silico proteome analysis to facilitate proteomics experiments using mass spectrometry.
Proteomics experiments typically involve protein or peptide separation steps coupled to the identification of many hundreds to thousands of peptides by mass spectrometry. Development of methodology and instrumentation in this field is proceeding rapidly, and effective software is needed to link the different stages of proteomic analysis. We have developed an application, proteogest, written in Perl that generates descriptive and statistical analyses of the biophysical properties of multiple (e.g. thousands) protein sequences submitted by the user, for instance protein sequences inferred from the complete genome sequence of a model organism. The application also carries out in silico proteolytic digestion of the submitted proteomes, or subsets thereof, and the distribution of biophysical properties of the resulting peptides is presented. proteogest is customizable, the user being able to select many options, for instance the cleavage pattern of the digestion treatment or the presence of modifications to specific amino acid residues. We show how proteogest can be used to compare the proteomes and digested proteome products of model organisms, to examine the added complexity generated by modification of residues, and to facilitate the design of proteomics experiments for optimal representation of component proteins.