Predicting Proteomics

A new computer model could improve quantitative proteomics and speed up lead optimisation by predicting how a protein will break down during analysis.

A study published in the January edition of Nature Biotechnology by researchers at the Institute for Systems Biology, USA; the University of California, USA; and Cellzome, Germany, highlights the benefits of a new computational technique that predicts which peptide fragments should be observed for a given protein during liquid chromatography-tandem mass spectrometry (LC/MS/MS).

Proteomics, the study of which proteins are present in cells, how they interact and what they do, has been part of a radical transformation of biological and medical research. Differences in the abundance of proteins between samples have led to the identification of cellular functions and pathways affected by disease, and have enabled the detection of disease biomarkers.

Mass spectrometric (MS) peptide sequencing has become a key step in nearly all proteomic experiments, but the results are often biased towards repeatedly detecting the protein fragment with the most intense MS signal, which limits the efficiency and depth of the analysis.

In fact, many peptides are never detected, either because they stick to the liquid chromatography (LC) column or because they are too large or too small to be detected by the mass spectrometer. Typically, peptide chains between seven and 30 amino acids long fall within the range detected by standard mass spectrometers.
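The length window described above can be illustrated with a short in-silico digest. This is a minimal sketch rather than the study's method: it assumes a trypsin-like cleavage rule (cut after K or R unless the next residue is P), which the article does not specify, and the function names and the example sequence are invented for illustration. Only the seven-to-30-residue window comes from the text.

```python
MIN_LEN, MAX_LEN = 7, 30  # detectable length window quoted in the article


def digest(sequence):
    """Split a protein sequence using an ASSUMED trypsin-like rule:
    cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # trailing peptide that ends without a cut site
        peptides.append(sequence[start:])
    return peptides


def in_detectable_range(peptide):
    """True if the peptide length falls inside the quoted window."""
    return MIN_LEN <= len(peptide) <= MAX_LEN


# Hypothetical toy sequence, used only to exercise the functions.
toy_protein = "MKWVTFISLLLLFSSAYSRGVFRR"
for peptide in digest(toy_protein):
    print(peptide, len(peptide), in_detectable_range(peptide))
```

In this toy case most of the fragments are too short to fall inside the window, which mirrors the point that many peptides from a given protein are never seen.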

The study found that, on average, only three distinct peptides are needed to identify the majority (95 per cent) of proteins, and that for more than 25 per cent of proteins a single peptide is enough. These identifying protein fragments have been dubbed proteotypic peptides.

By defining approximately 500 physical and chemical properties of various peptides, including charge, hydrophobicity and secondary structure, the researchers were able to find which properties best distinguished the proteotypic peptides from those that go unobserved. Interestingly, the number of proteotypic peptides observed for a protein was not merely a function of protein length: factors such as amino acid composition and transmembrane domains also had a significant influence.
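To make the idea of property-based discrimination concrete, the sketch below computes three simple peptide properties and scores how well each one separates an observed set from an unobserved set. It is an illustration under stated assumptions, not the published predictor: the three properties stand in for the roughly 500 used in the study, the net charge is a crude count of basic minus acidic residues, the separation score is a simple stand-in of our own choosing, and all function names are hypothetical. The hydropathy values are the standard Kyte-Doolittle scale.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(peptide):
    """Return a small, illustrative set of physicochemical properties."""
    basic = sum(aa in "KR" for aa in peptide)
    acidic = sum(aa in "DE" for aa in peptide)
    return {
        "length": len(peptide),
        "mean_hydropathy": mean([KYTE_DOOLITTLE[aa] for aa in peptide]),
        "net_charge": basic - acidic,  # ASSUMPTION: ignores His and termini
    }


def separation(observed_values, unobserved_values):
    """Absolute difference of group means, scaled by the overall spread.
    A stand-in score chosen for this sketch, not the study's statistic."""
    spread = pstdev(list(observed_values) + list(unobserved_values))
    if spread == 0:
        return 0.0
    return abs(mean(observed_values) - mean(unobserved_values)) / spread


def rank_properties(observed, unobserved):
    """Rank properties by how well they separate the two peptide sets."""
    obs = [peptide_features(p) for p in observed]
    unobs = [peptide_features(p) for p in unobserved]
    scores = {
        name: separation([f[name] for f in obs], [f[name] for f in unobs])
        for name in obs[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_properties` with two lists of peptide sequences returns the property names ordered by score, highest first. A real predictor would be trained and validated on experimentally observed peptides rather than ranked one property at a time.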

The predictors showed good agreement between observed and predicted peptides, even for human gamma-secretase, a notoriously difficult-to-analyse protein associated with the development of amyloid plaques in the brains of Alzheimer's patients.

According to one of the authors, Dr Bernhard Kuster, Cellzome's vice president of analytical sciences and informatics: "One of the motivations behind this research is that proteomics can be a bit hit and miss. We need to measure the proteins in a more quantitative way."

"We have found a number of proteins that we don't predict proteotypic peptides for, highlighting a blind spot of current LC/MS/MS techniques: some proteins have no peptides suitable for detection."

Even with current state-of-the-art methods of protein identification, some parts of the proteome are not detected. The use of these predictors could allow the development of new experimental methodologies that ensure that important proteins are not missed.

Kuster believes that the current techniques are "a good compromise between seeing everything and seeing enough," especially when it comes to high-throughput screening, where speed is of the essence.

By using these new techniques in conjunction with its Kinobead technology, Cellzome can study the interaction of small molecules with over 200 kinase targets in a high-throughput parallel assay. The active molecules, whether they are hits, leads, development candidates or marketed drugs, are tethered to a solid support, where they facilitate the isolation of those proteins that bind to the candidate.

The isolated proteins can then be identified by LC/MS/MS. According to Kuster: "The technique is much faster than in vitro kinase assays and you can see many more kinases that you can't buy an assay for."

The new methodology will find application as a workflow improvement tool and will make screening methods much faster through the use of an MS technique called multiple reaction monitoring (MRM). MRM allows the mass spectrometer to be focused on detecting the specific signals from proteotypic peptides one after the other, rather than scanning the entire mass range.
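As a rough illustration of what "specific signals" means in practice, the sketch below computes the mass-to-charge (m/z) values an instrument could be pointed at for a given peptide: the precursor ion and a few of its fragment ions. It rests on assumptions the article does not state: a doubly charged precursor, singly charged y-type fragment ions, and unmodified residues. The function names are hypothetical. The residue masses are standard monoisotopic values rounded to five decimal places.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def precursor_mz(peptide, charge=2):
    """m/z of the intact peptide carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


def y_ion_mz(peptide, n):
    """m/z of the singly charged y ion holding the last n residues (n >= 1)."""
    return sum(MONO[aa] for aa in peptide[-n:]) + WATER + PROTON


def transitions(peptide, charge=2, n_fragments=3):
    """List (precursor m/z, fragment label, fragment m/z) tuples for the
    largest y ions. ASSUMPTION: a simplistic choice for this sketch only."""
    sizes = list(range(len(peptide) - 1, 0, -1))[:n_fragments]
    parent = round(precursor_mz(peptide, charge), 4)
    return [(parent, f"y{n}", round(y_ion_mz(peptide, n), 4)) for n in sizes]
```

In a real assay the fragment ions would be chosen from experimental spectra rather than by size, so the output of `transitions` should be read as a starting list and nothing more.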

Combining the predictors with LC/MS/MS/MRM will also help to identify serum biomarkers and to monitor how they change during pathological conditions. According to Kuster, these changes are often incredibly hard to detect, as the proteomic data is so noisy.