RDC-PROSPECT
Feb 18, 2004
version 1.65

INTRODUCTION:

RDC-PROSPECT is a computer program for protein structure prediction. The
prediction is based on the structural homolog or analog of the target protein
in the Protein Data Bank (PDB) that best aligns with the 15N-1H RDC data of
the protein recorded in a single ordering medium. The program uses only RDC
data and predicted secondary structure information.

Threading is performed against the SCOP 40 structure database (currently
v1.65, release of Dec. 2003). The alignment and Z-score for the top 20 ranked
templates are reported in the output file.

Please note that only templates whose sequence length is between 50% and 200%
of the test protein sequence length are considered. For protein domains,
templates that fall outside this length range are unlikely to belong to the
same family or superfamily as the test protein and are therefore not
considered.

The template rotations are now made around the Z, Y', Z'' axes. This differs
from the X, Y, Z rotations used in our NAR paper. We made the change because
the Z, Y', Z'' rotations appear to be more widely used in other programs.

RDC-PROSPECT is written in C and has only been tested under LINUX.

The current version 1.65 can only use N-H RDC data in one aligning medium.
Support for multiple types of RDC and for RDC in multiple media will be added
in the next version.

AUTHORS:

Youxing Qu, Jun-tao Guo, Victor Olman, Ying Xu

Computational Systems Biology Lab
Life Sciences Building A110
Dept of Biochemistry and Molecular Biology
University of Georgia
Athens, GA 30602
U.S.A.

Phone: 706-542-9762
Fax:   706-542-9751
Web:   http://csbl.bmb.uga.edu

REFERENCES:

Qu Y, Guo JT, Olman V, Xu Y. (2004) Protein structure prediction using sparse
dipolar coupling data. Nucleic Acids Res., 32, 551-561.

Qu Y, Guo JT, Olman V, Xu Y. (2004) Protein fold recognition through
application of residual dipolar coupling data. Pacific Symposium on
Biocomputing, 459-470.

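NOTE ON THE Z, Y', Z'' ROTATIONS:

The Z, Y', Z'' rotation convention mentioned in the introduction can be
illustrated with a short sketch. This is not code from RDC-PROSPECT. The
function names (rot_z, rot_y, euler_zyz, apply) are hypothetical, and the
sketch assumes intrinsic rotations, angles in degrees, and the usual
right-handed (counterclockwise-positive) sign convention. The output file
describes its angles as clockwise; whether that means the angles must be
negated before use here depends on a viewing convention this README does not
specify.

```python
import math


def rot_z(deg):
    """Rotation about the Z axis by deg degrees (counterclockwise-positive)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(deg):
    """Rotation about the Y axis by deg degrees (counterclockwise-positive)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_zyz(alpha, beta, gamma):
    """Intrinsic Z, Y', Z'' rotation as a single matrix.

    Rotating by alpha about Z, then beta about the new Y', then gamma about
    the new Z'' equals the fixed-axis product Rz(alpha) * Ry(beta) * Rz(gamma).
    """
    return matmul(matmul(rot_z(alpha), rot_y(beta)), rot_z(gamma))


def apply(r, v):
    """Apply rotation matrix r to a 3-vector v."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]
```

For example, apply(euler_zyz(a, b, c), xyz) rotates one coordinate triple;
applying it to every atom of a template rotates the whole template.
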
PROBLEMS AND SUGGESTIONS:

Please contact

Dr. Youxing Qu
E-mail: youxing@csbl.bmb.uga.edu

or

Dr. Ying Xu
E-mail: xyn@bmb.uga.edu

Thank you for using RDC-PROSPECT. Any comments are appreciated!

UPDATE

Please send us an email after you download the RDC-PROSPECT program. We will
send you information on updates or new versions when they become available.

****************************************************************************************************
****************************************************************************************************

PROGRAM PACKAGE:

The package includes 2 directories:

program   - all executables, compiled under Red Hat 9.0 LINUX. The program
            must be run in this directory! It will try to find template files
            in the "../template" directory.

templates - all the template files (5,674 domains from SCOP 40 v1.65) for
            threading. Please note these files are NOT PDB files!

INPUT:

The input file contains the protein sequence information (sequence, secondary
structure) and the RDC data information (Da, Dr, N-H RDC). Please see the
provided example.inp file for detailed instructions.

OUTPUT:

The output file contains the alignment information and Z-scores for the top
20 ranked templates. Please see the provided example.out file for details.

RANK_#    - ranking of the templates, based on the alignment raw score (SCORE)
SEQIDEN_% - sequence identity between the test protein and the template protein
S         - RDC order parameter
Q         - RDC Q value
R         - RDC R factor
TEMP_#    - template index number
TEMPLATE  - template name (SCOP domain name)
FAMILY    - SCOP family
LENGTH    - the sequence length of the template protein
QUERY     - the sequence length of the test (query) protein
SCORE     - alignment raw score
Z         - Z-score of the alignment
ANG_Z     - Euler angle for clockwise rotation around the Z axis
ANG_Y'    - Euler angle for clockwise rotation around the Y' axis
ANG_Z''   - Euler angle for clockwise rotation around the Z'' axis
query     - test protein
t         - template

The detailed alignment is given under "Structure_Alignment:" for each
template.

RUN RDC-PROSPECT:

    rdc.exe test.inp > test.out &

If this does not work, please try

    ./rdc.exe test.inp > test.out &

STATUS CHECK:

A temporary tmp.out file is created to report the threading scores (no
Z-score) of the templates that have been checked. The tmp.out file remains in
the directory after the program finishes and is overwritten the next time
RDC-PROSPECT is run. Other temporary files (file names ending with _tmp) are
removed when the program finishes. An example_tmp.out file is provided for
reference.

START     - start residue number of the template
END       - end residue number of the template
NOUSE     - this column is not used in this version of PROSPECT. It is
            reserved for future use.

**note: -888 and -999 in the output files indicate that the template is not
used.

To check the status of a running job, use

    more tmp.out

or

    tail -f tmp.out

IMPORTANT !

Run only one job at a time in a directory! Otherwise you will get bizarre
results. However, the program can be run simultaneously in different
directories, provided both the "program" and the "templates" folders are
present under each directory.

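One way to guard against accidentally starting two jobs in the same directory
is a small wrapper that takes a lock file before launching the program. The
sketch below is not part of the RDC-PROSPECT package. The function name
run_exclusive and the lock file name "rdc.lock" are hypothetical choices, and
the sketch assumes the program is started with the command line shown under
RUN RDC-PROSPECT.

```python
import os
import subprocess


def run_exclusive(cmd, out_path, workdir=".", lock_name="rdc.lock"):
    """Run cmd inside workdir unless another run already holds the lock.

    cmd       - command as a list, e.g. ["./rdc.exe", "test.inp"]
    out_path  - file (relative to workdir) that receives standard output
    lock_name - hypothetical lock file name; not used by RDC-PROSPECT itself
    """
    lock_path = os.path.join(workdir, lock_name)
    try:
        # O_EXCL makes the creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("another job appears to be running in " + workdir)
    try:
        os.close(fd)
        with open(os.path.join(workdir, out_path), "w") as out:
            # Blocks until the job finishes, then returns its exit status.
            return subprocess.run(cmd, cwd=workdir, stdout=out).returncode
    finally:
        # Release the lock even if the job fails to start or exits with an error.
        os.remove(lock_path)
```

A call such as run_exclusive(["./rdc.exe", "test.inp"], "test.out",
workdir="program") would then refuse to start while an earlier call in the
same directory is still running. If the wrapper itself is killed abruptly, the
lock file is left behind and has to be deleted by hand.
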
COMPUTER TIME:

RDC-PROSPECT takes less than 3 hours (wall-clock time, not CPU time) to run
for a 100-residue test protein on a P4/2.4GHz/1GB PC under LINUX.