General Outputs
Energy Components
The output file reports the score of each energy term, e.g.,
The raw score of the alignment: -1240.4 (-1242) Mutation = -1355.0;
Singleton = -774.5; Pairwise = -290.8; InternalPair
= -31.1; GapPenalty = 2211.0; SSFit = 0.0;
AnchorMatch = -1000.0 (1-anchor); NMR-Backbone; = 0.0;
PairConstrn = 0.0 (0-pair); PairVialate = 0.0.
The raw score of the alignment: total energy.
The first number is calculated from the final alignment; the
integer in parentheses is the result of the combinatorial search. The
difference between the two numbers gives an estimate of the accuracy
of the threading.
Mutation: mutation energy component of cores.
Singleton: singleton energy component of cores.
Pairwise: the component of pairwise energy between
cores.
InternalPair: the component of pairwise energy
within cores.
GapPenalty: gap penalty component together with
mutation and singleton energies in the non-core regions.
SSFit: the fitness between the secondary structure
prediction and the secondary structures assigned from the PROSPECT alignment,
when using the secondary structure prediction as an input.
AnchorMatch: the contribution of matching the
predefined alignments between the anchor residues on the target sequence
and the positions on the template.
PairConstrn: the contribution of matching the
predefined pairs between the residues on the target sequence.
The notations in the sort file are: "C-ndx" for the normalized radius
of gyration, "raw" for the total energy, "mut" for the mutation energy component,
"sing" for the singleton energy component, and "pair" for the total pairwise
energy component.
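As an illustration of how the energy terms relate to the raw score, the sketch below parses the sample report shown above and compares the sum of the listed components with the total. The function name `parse_energy_report` and the regular expressions are assumptions made for this example; they are fitted to the sample text only, and a real output file may be laid out differently.

```python
import re

# Sample text copied from the report shown above (layout of real files may differ).
SAMPLE = """The raw score of the alignment: -1240.4 (-1242) Mutation = -1355.0;
Singleton = -774.5; Pairwise = -290.8; InternalPair
= -31.1; GapPenalty = 2211.0; SSFit = 0.0;
AnchorMatch = -1000.0 (1-anchor); NMR-Backbone; = 0.0;
PairConstrn = 0.0 (0-pair); PairVialate = 0.0."""

NUMBER = r"-?\d+(?:\.\d+)?"


def parse_energy_report(text):
    """Return (raw, search, terms) from a score report like SAMPLE.

    Hypothetical helper: raw is the score of the final alignment, search is
    the integer in parentheses, terms maps each term name to its value.
    """
    head = re.search(
        rf"raw score of the alignment:\s*({NUMBER})\s*\(({NUMBER})\)", text
    )
    if head is None:
        raise ValueError("no raw score line found")
    raw, search = float(head.group(1)), float(head.group(2))
    # Term names are letters and hyphens; tolerate a stray ';' before '='.
    pairs = re.findall(rf"([A-Za-z][A-Za-z-]*)\s*;?\s*=\s*({NUMBER})", text)
    terms = {name: float(value) for name, value in pairs}
    return raw, search, terms


raw, search, terms = parse_energy_report(SAMPLE)
component_sum = sum(terms.values())   # sum of all listed energy terms
search_gap = abs(raw - search)        # final alignment vs. combinatorial search
print(f"raw={raw} search={search} gap={search_gap:.1f} sum={component_sum:.1f}")
```

For the sample report, the ten listed terms add up to the raw score of -1240.4, and the final alignment differs from the combinatorial search result by 1.6.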
Confidence Assessment of Threading Results
The following considerations can be used to assess the confidence
level of the threading results:
- Z-score. The higher the z-score, the more reliable the prediction.
Instead of using the z-score directly, a more informative confidence index is
used. The confidence index is defined as the probability that a sequence-template
pair with a given z-score is a related protein pair. These probabilities are
estimated by running a large number of threadings and counting the number of true
positives as a function of z-score.
| z-score interval | Confidence Index | Category  | Similarity Level   |
|------------------|------------------|-----------|--------------------|
| <6               | ~0               | unlikely  | unrelated          |
| 6-8              | 0.35             | low       | superfamily/fold   |
| 8-10             | 0.63             | medium    | superfamily/fold   |
| 10-12            | 0.85             | high      | family/superfamily |
| 12-20            | 0.96             | very high | family/superfamily |
| >20              | 1.00             | certain   | family             |
- Normalized radius of gyration. It gives a good estimate
of the compactness of the aligned portion in threading. If the value
is above 3.0, the aligned portion is not compact enough, unless the template
is a multi-domain protein.
- Correlation between the secondary structure prediction
and the secondary structures assigned from the PROSPECT alignment.
Secondary structure predictions are known to have an accuracy of about 70%.
If the correlation is too low for a sizable target protein (more than
100 amino acids), the threading prediction may not be reliable.
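The first two criteria above can be combined into a small lookup, sketched below. The function `assess` and its argument names are hypothetical. The table does not say which interval a boundary value such as 8 belongs to, so the sketch assumes each interval includes its lower bound, and it stores the "~0" entry as 0.0.

```python
from bisect import bisect_right

# Upper edges of the z-score intervals in the confidence table.
Z_EDGES = [6, 8, 10, 12, 20]

# (confidence index, category, similarity level) for each interval.
# Assumption: "~0" is represented as 0.0.
LEVELS = [
    (0.00, "unlikely", "unrelated"),
    (0.35, "low", "superfamily/fold"),
    (0.63, "medium", "superfamily/fold"),
    (0.85, "high", "family/superfamily"),
    (0.96, "very high", "family/superfamily"),
    (1.00, "certain", "family"),
]

# Threshold on the normalized radius of gyration ("C-ndx" in the sort file).
MAX_C_NDX = 3.0


def assess(z_score, c_ndx, multi_domain_template=False):
    """Map a z-score and C-ndx value to the confidence levels listed above."""
    # Assumption: boundary values fall into the higher interval (6 -> "low").
    index, category, similarity = LEVELS[bisect_right(Z_EDGES, z_score)]
    # A value above 3.0 is tolerated only for multi-domain templates.
    compact_enough = c_ndx <= MAX_C_NDX or multi_domain_template
    return {
        "confidence_index": index,
        "category": category,
        "similarity": similarity,
        "compact_enough": compact_enough,
    }


print(assess(11.2, 2.1))
```

The secondary structure correlation is left out because the text gives no numerical cutoff for it.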
Generating Structures from Alignments
The output file provides detailed information for users to analyse
the alignments. The core alignment shows the aligned positions of cores,
their secondary structure types ("h" for alpha-helix and "e" for beta-sheet),
and their starting residue number in PDB. Aligned residues that are
the same are marked by "|"; similar residues are marked by ":";
related residues are marked by ".". The residue range in the template
gives the residue number, and the extra character associated with the
residue number (if any), for the starting and ending positions of the aligned
portion in the template. One can view the template portion in that range
using a molecular graphics program.
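To view only the aligned template portion, the residue range can be cut out of the template's PDB file first. The sketch below is one way to do this and is not part of PROSPECT; the function `extract_range` and its arguments are hypothetical. It relies on the standard PDB fixed-column layout for ATOM records (chain identifier in column 22, residue number in columns 23-26, and the extra character, the insertion code, in column 27). Residues are matched in file order rather than by numeric comparison, since residues with an insertion code do not sort numerically.

```python
def extract_range(pdb_lines, start, end, chain=None):
    """Return the ATOM lines from residue `start` through residue `end`.

    start and end are (residue_number, insertion_code) tuples; use " " as
    the insertion code when the residue number has no extra character.
    chain is an optional one-letter chain identifier.
    """
    selected = []
    inside = False     # True once the start residue has been reached
    seen_end = False   # True once the end residue has been reached
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if chain is not None and line[21] != chain:
            continue
        res_id = (int(line[22:26]), line[26])
        if not inside:
            if res_id != start:
                continue
            inside = True
        if res_id == end:
            seen_end = True
        elif seen_end:
            break      # first residue after the end of the range
        selected.append(line)
    return selected
```

A typical use would read the template file with `open(path).readlines()`, call `extract_range(lines, (27, " "), (85, "A"), chain="A")` with the range reported in the output file (the values here are made up), and write the returned lines to a new file for the graphics program.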