INTRODUCTION:
In eukaryotic cells,
protein phosphorylation is one of the most ubiquitous post-translational
modifications, orchestrating most cellular processes, including
the cell cycle (Lou Y, et al., 2004), transcriptional (Uddin S, et al., 2003)
and translational regulation (Yoshizawa F, et al., 2002), metabolic
pathways (Meijer AJ, et al., 2004), signal transduction
(Choudhary S, et al., 2004), and memory (Dash PK, et al., 2004). About 2% of the
human and mouse proteomes consists of protein kinases (PKs), with
518 and 540 distinct PKs identified in human (Manning G, et al., 2002)
and mouse (Caenepeel S, et al., 2004), respectively, among which
510 form reciprocal orthologous pairs. It has been estimated that
one-third of all proteins can be phosphorylated, and chromosomal
mapping links about half of the kinome to disease or cancer
(Manning G, et al., 2002). There is therefore an urgent need
to identify kinase substrates together with their phosphorylation
sites at the phosphoproteome scale, which would greatly aid drug
design. To date, several large-scale phosphoproteomic
studies have been published for yeast (Ficarro SB, et al., 2002),
mouse (Ballif BA, et al., 2004), human (Beausoleil SA, et al., 2004;
Lim YP, et al., 2003), and plants (Nuhse TS, et al., 2004).
In silico prediction of phosphorylation sites together with their specific
kinases may greatly reduce the labor-intensive in vivo or in vitro
identification of phosphorylation sites. For two peptides that differ
at only one aligned position, we may reasonably assume
that they have similar 3D structures and biochemical characteristics,
especially when the two differing amino acids form a conservative
pair, e.g. isoleucine (I) and valine (V), or serine (S) and
threonine (T). Based on this observation, we designed a simple
scoring method, GPS (Group-based Phosphorylation Scoring method),
which offers more meaningful information to biologists while achieving
satisfactory performance compared with two phosphorylation site
prediction systems, ScanSite 2.0 (Obenauer JC, et al., 2003) and
PredPhospho (Kim JH, et al., 2004). With data from the public
database Phospho.ELM/PhosphoBase (Diella F, et al., 2004; Kreegipuu
A, et al., 1999) and extensive literature
curation, our prediction has been extended to 71
protein kinase (PK) families/PK groups in ver 1.10 (up from 52 PK groups in ver 1.0).
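
The similarity assumption stated above (two peptides that differ at a single position, especially by a conservative pair such as I/V or S/T, are likely to behave alike) can be illustrated with a toy position-by-position comparison. The sketch below is not the GPS scoring scheme itself: the score values, the function names and the example peptides are all hypothetical, and only the two conservative pairs named in the text are recognised.

```python
# Toy illustration only: scores, names and peptides are assumptions,
# not the scoring scheme used by GPS.
MATCH, CONSERVATIVE, MISMATCH = 2, 1, 0

# Conservative pairs named in the text: I/V and S/T.
CONSERVATIVE_PAIRS = {frozenset("IV"), frozenset("ST")}


def residue_score(x: str, y: str) -> int:
    """Score one aligned pair of amino acids (one-letter codes)."""
    if x == y:
        return MATCH
    if frozenset((x, y)) in CONSERVATIVE_PAIRS:
        return CONSERVATIVE
    return MISMATCH


def peptide_similarity(a: str, b: str) -> int:
    """Sum the per-position scores of two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(residue_score(x, y) for x, y in zip(a, b))


reference = "RRASVAG"     # made-up 7-residue peptide
conservative = "RRATVAG"  # S -> T, a conservative substitution
radical = "RRAWVAG"       # S -> W, a non-conservative substitution

print(peptide_similarity(reference, reference))     # 14
print(peptide_similarity(reference, conservative))  # 13
print(peptide_similarity(reference, radical))       # 12
```

Under these assumed scores, the peptide carrying the conservative S/T substitution stays closer to the reference than the one carrying the non-conservative substitution, which is the intuition the paragraph relies on.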