Skip to content
2000
Volume 19, Issue 1
  • ISSN: 0929-8665
  • E-ISSN: 1875-5305

Abstract

Phosphorylation is one of the most important post-translational modifications, and the identification of protein phosphorylation sites is particularly important for studying disease diagnosis. However, experimental detection of phosphorylation sites is labor intensive. It would be beneficial if computational methods are available to provide an extra reference for the phosphorylation sites. Here we developed a novel sequence-based method for serine, threonine, and tyrosine phosphorylation site prediction. Nearest Neighbor algorithm was employed as the prediction engine. The peptides around the phosphorylation sites with a fixed length of thirteen amino acid residues were extracted via a sliding window along the protein chains concerned. Each of such peptides was coded into a vector with 6,072 features, derived from Amino Acid Index (AAIndex) database, for the classification/detection. Incremental Feature Selection, a feature selection algorithm based on the Maximum Relevancy Minimum Redundancy (mRMR) method was used to select a compact feature set for a further improvement of the classification performance. Three predictors were established for identifying the three types of phosphorylation sites, achieving the overall accuracies of 66.64%, 66.11%% and 66.69%, respectively. These rates were obtained by rigorous jackknife cross-validation tests.

Loading

Article metrics loading...

/content/journals/ppl/10.2174/092986612798472893
2012-01-01
2025-09-03
Loading full text...

Full text loading...

/content/journals/ppl/10.2174/092986612798472893
Loading
This is a required field
Please enter a valid email address
Approval was a Success
Invalid data
An Error Occurred
Approval was partially successful, following selected items could not be processed due to error
Please enter a valid_number test