Orb - Chemical Shift Prediction (Version 1.0)

Purpose: Orb is a Tcl program which predicts chemical shifts for a given sequence based on statistical analysis and/or previously assigned shifts of homologous sequences.

  1. Introduction

  2. Preparations

  3. Running the program


Overview

A good prediction of the chemical shifts for a sequence can be an invaluable aid in the NMR assignment process. Researchers often already have shifts from homologous sequences and want to use this information effectively. Even without shifts from homologous sequences, a prediction based on statistical analysis can be a rough but useful starting point.

The user feeds all the homologous sequences into the xalign program to generate a sequence alignment file. The user then starts the orb program and enters the sequence alignment file and the name of the directory containing the pertinent chemical shift files. The user selects the sequence to predict and chooses among options that control how the prediction is made. When the user clicks the "Execute" button, a predicted shift file (among others) is produced, and the user views the output by clicking the "Display Results" button.


Main Screen Snapshot

Authors

Wolfram Gronwald, Robert Boyko, Tim Jellard, David Wishart, Frank Soennichsen, Brian Sykes

Funding for this project has been provided by the Medical Research Council of Canada and the Protein Engineering Network of Centres of Excellence (Canada).


Installation

Executable versions of this program for Sun and SGI workstations are freely available at our ftp site. First, download the software from our website:

Even though orb is a standalone program, we strongly recommend getting the optional software outlined on the download page. Once you have downloaded the software, uncompress and untar the files:

	uncompress myfile.tar.Z
	tar xvf myfile.tar

Then read the README file to see which files are installed and which installation options are available. After this, type "Install" to put the files in the appropriate places.

The current version of this software comes with an expiry date. If your software has expired, check out the website above for further instructions or new versions.


Preparing Data Files

Although orb is fairly easy to use, preparing the data takes a fair amount of work. Please follow the instructions below carefully.

Do the following if you do NOT have any shift data from homologous sequences.

Create an input sequence file for your protein similar to the example below:

	# This is an example sequence
	>CaM Calmodulin - Drosophila melanogaster (1-148)
	ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQD
	ccccchhhhhhhhhhhhhhccccccbbbhhhhhhhhhhcccccchhhhhh
	MINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGFI
	hhhhhccccccbbbhhhhhhhhhhhhhcccchhhhhhhhhhhhcccccbb
	SAAELRHVMTNLGEKLTDEEVDEMIREANIDGDGQVNYEEFVTMMTSK
	bhhhhhhhhhhcccccchhhhhhhhhhcccccccbbbhhhhhhhhhcc

Notes:

With only this input file, orb uses statistical database values alone to make its predictions.
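
Before running orb it can help to check that every sequence line is paired with a line of structure codes of the same length. The Python sketch below does this. It assumes the layout shown in the example above (comment lines starting with "#", one header line starting with ">", then alternating sequence and structure lines) and assumes that the structure lines contain only the letters h, c and b. The function name check_sequence_file is hypothetical and is not part of orb.

```python
def check_sequence_file(text):
    """Return (header, sequence, structure) after checking the line pairing.

    Assumed layout (taken from the example, not from an orb specification):
    '#' comments, one '>' header, then alternating sequence/structure lines.
    """
    header = None
    body = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith(">"):
            header = line[1:].strip()
        else:
            body.append(line)
    if header is None:
        raise ValueError("no '>' header line found")
    if len(body) % 2 != 0:
        raise ValueError("sequence and structure lines are not paired")

    sequence, structure = [], []
    for seq_line, struct_line in zip(body[0::2], body[1::2]):
        if len(seq_line) != len(struct_line):
            raise ValueError("length mismatch: %r vs %r" % (seq_line, struct_line))
        if set(struct_line) - set("hcb"):
            raise ValueError("unexpected structure code in %r" % struct_line)
        sequence.append(seq_line)
        structure.append(struct_line)
    return header, "".join(sequence), "".join(structure)
```

A call such as check_sequence_file(open("myprotein.seq").read()), where the file name is only an illustration, either returns the joined sequence and structure strings or raises a ValueError naming the offending line.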

Do the following if you have shift data from one or more homologous sequences.

  1. From your set of homologous sequence shift data files, select those which seem most applicable to the protein you wish to assign. If you are new to this program, we suggest using only one or two datasets to keep things simple.

  2. Convert your homologous sequence shift data files to PPM format.

  3. Create an xalign sequence input file which contains your sequence to predict and the homologous sequences. At the end of each sequence name of your xalign input file, indicate the amino acid numbering within curly braces. For example:
    	>IL8.1a Interleukin 8  {1-72} 
    	>IL8.H33A H33A Interleukin 8 Analog {6-72}
    	>TT Troponin-C III-III Homodimer {93-126,129-162}
    
    Make sure each name contains only one set of braces and that the residue numbering in the corresponding shift file follows the numbering scheme you have indicated.

  4. Run the xalign program on the input file above to generate an alignment file. Make sure the multiple alignment results look reasonable.

  5. Finally, ensure all required PPM shift files are located in one directory. Use the file naming convention xxx.PPM where "xxx" is the sequence ID code from the xalign sequence input file. This is how orb maps a particular sequence in the alignment file to the correct PPM shift file.

    If you have installed orb, you can view an example homologous shift directory in:

    	$INSTALL/lib/orb/examples
    
Now you are set to run the program.
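
Two of the steps above are easy to get wrong by hand: the brace numbering at the end of each sequence name, and the ID.PPM file naming. The following Python sketch checks both. It assumes that the sequence ID is the first word after ">" and that the braces hold comma-separated "start-end" ranges, as in the example names in step 3. The function names parse_header and missing_ppm_files are hypothetical and are not part of orb or xalign.

```python
import os
import re

# '>ID free text {ranges}' with the braces at the end of the line (assumed).
HEADER_RE = re.compile(r"^>(\S+).*\{([^{}]*)\}\s*$")

def parse_header(line):
    """Split a '>ID description {a-b,c-d}' line into (ID, [(a, b), (c, d)])."""
    if line.count("{") != 1 or line.count("}") != 1:
        raise ValueError("expected exactly one set of braces: %r" % line)
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not a valid header line: %r" % line)
    seq_id, ranges_text = match.groups()
    ranges = []
    for part in ranges_text.split(","):
        start, end = part.strip().split("-")
        ranges.append((int(start), int(end)))
    return seq_id, ranges

def missing_ppm_files(header_lines, shift_dir):
    """Return the IDs that have no matching ID.PPM file in shift_dir."""
    present = set(os.listdir(shift_dir))
    missing = []
    for line in header_lines:
        seq_id, _ = parse_header(line)
        if seq_id + ".PPM" not in present:
            missing.append(seq_id)
    return missing
```

The sequence you want to predict normally has no shift file of its own, so expect its ID in the list returned by missing_ppm_files; any other ID in that list points to a misnamed or misplaced file.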


Using the Orb GUI

  1. Start the program by typing 'orb'.

    If you do not get a graphical window, check with your system administrator to make sure the program has been installed and is accessible to you. A common problem is that your PATH environment variable needs to be changed to include the location of the installed orb program.

    If you are logged in remotely, then enter the first command in the console window and the second in your remote login window:

    	xhost +remoteMachine
    	setenv DISPLAY hostMachine:0
    
    This allows orb to run on the remote machine but the display will go to the host computer.

  2. Select the desired orb function.

    You can predict shifts given:

    • sequence/structure only
    • sequence/structure plus homologous sequences

  3. Enter the alignment file (output from xalign).

    The program then displays the shifts directory field, the output fields, and finally a menu for choosing the sequence to predict. If you do not enter a valid xalign output file at this point, the resulting errors can be quite cryptic.

  4. Enter the data directory containing homologous shift files.

    All your homologous shift files must be in this one directory. The naming convention for each shift file is "ID.PPM", where ID is the sequence ID read from the alignment file. The program then displays an "Options" button.

  5. (Optional) Enter the "Predict Output File" and the "Verbose Output File".

    You only need to do this if you do not like the default names or want a descriptive name that corresponds to the sequence predicted. See the "Orb Output" section to learn about the output files generated.

  6. Click on the sequence you want to predict.

    Note that orb selects the first sequence in the alignment file as the default.

  7. (Optional) Click on the "Options" button.

    The options are useful when you want to experiment with different combinations of shift files, when there are referencing issues, or when you prefer one shift file over another for reasons that cannot be detected from homology.

  8. Click the "Execute" button.

    The program can take several seconds to run and produces several output files in the current directory. The "Display Results" button appears when the calculations are done.

  9. To see the results, click "Display Results".


Orb Output

The output of orb consists of these files (assuming the default filenames):

  1. orb.log - This is the first output file you want to look at to see if any errors were flagged.

  2. orb.verb - The power of the verbose output file is that it shows how each individual prediction is made. You can easily see the relationships and homologies between shift files and statistical tables. Although this file can be displayed in the GUI, it is easier to find what you are looking for with a program that has search capabilities (e.g., vi).

  3. orb.PPM - The predicted shifts are printed in PPM format with a few additional columns. First, the predicted standard deviation is given (see our paper for the calculation). Next come the Wishart shift table values, so that the shift values can be compared easily. The final "Confidence" field marks the shifts we are most confident in predicting: confidence is denoted by the number of asterisks (up to 4), and a dash indicates a prediction that relies on table values only.

If orb did not run cleanly, or the user aborts the GUI prematurely, the temporary files (tmp.*) are left in the current directory. They are unlikely to be useful to the average user and should be removed.
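
Removing the leftover temporary files can be scripted. The Python sketch below lists the files matching tmp.* in a directory and deletes them only when asked to. The function names are hypothetical and are not part of orb. Note the assumption that every tmp.* file in the directory belongs to orb; other programs may use the same prefix, which is why the default is a dry run.

```python
import glob
import os

def find_temp_files(directory="."):
    """List regular files named tmp.* in the given directory."""
    pattern = os.path.join(glob.escape(directory), "tmp.*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))

def remove_temp_files(directory=".", dry_run=True):
    """Delete tmp.* files; with dry_run=True only report what would go."""
    paths = find_temp_files(directory)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Call remove_temp_files() first to review the list, then remove_temp_files(dry_run=False) once you are sure nothing in it is needed.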


Algorithm

The predictions made by orb are based on homologous shift data files and statistical chemical shift information from David Wishart. The chemical shift database files are located in $INSTALL/lib/orb/dsw.* if you wish to view them.

The following criteria are considered when evaluating the applicability of previously assigned shifts to the new sequence:

How we weight the various factors is determined in $INSTALL/lib/orb/orb.parms. Other factors which may influence chemical shift prediction are currently beyond the scope of this program.

A complete explanation of the algorithm can be found in our paper (which is in the process of publication).

Non-stereo specific assignments

Recently we modified orb to handle non-stereo specific assignments. First we make predictions based on all applicable stereo specific data and tables only, then we modify our predictions based on the best way to fit the non-stereo shifts to our predictions.

Orb is now smart enough to convert atom names of stereo specific shifts to non-stereo specific as demonstrated in this example:

	1:ASP_32:HB1          3.10
	1:ASP_32:HB2          3.10
		is converted to
	1:ASP_32:HB#          3.10
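
The name conversion shown above can be sketched in a few lines of Python. This is only an illustration of the idea, not orb's own code. It assumes that each line holds a label and a shift separated by whitespace, that the two members of a pair differ only in a trailing 1 or 2, and that a pair is merged only when both members carry the same shift, as in the example. Orb's actual rule may differ, and the function name collapse_stereo_pairs is hypothetical.

```python
def collapse_stereo_pairs(lines):
    """Merge 'X1'/'X2' atom pairs with identical shifts into 'X#'.

    Simplification (an assumption): any two labels that differ only in a
    final 1 or 2 are treated as a candidate pair.
    """
    entries = []
    for line in lines:
        if line.strip():
            label, shift = line.split()
            entries.append((label, shift))
    shifts = dict(entries)

    result = []
    for label, shift in entries:
        stem, last = label[:-1], label[-1]
        if last in ("1", "2"):
            partner = stem + ("2" if last == "1" else "1")
            partner_shift = shifts.get(partner)
            if partner_shift is not None and float(partner_shift) == float(shift):
                # Emit the merged entry once, when the '1' member is seen.
                if last == "1":
                    result.append((stem + "#", shift))
                continue
        result.append((label, shift))
    return result
```

Pairs whose two shifts differ are passed through unchanged, so stereo specific data is never lost by this step.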

Non-homology Prediction Factors

So far orb can only make predictions based on homology. Sometimes a user knows that a particular set of shifts may be more or less applicable given the conditions of the experiment from which the shifts were derived. By clicking the "Options" button you can increase or decrease the shift bias multiplier for a given set of shifts. Then, once you have clicked "Execute", check the verbose output file to see how your bias affects individual predictions. Some trial and error is involved here.
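
To get a feel for what a bias multiplier does, consider a plain weighted mean in which each weight is scaled by its multiplier before averaging. The Python sketch below shows only this generic technique. It is not orb's formula, which is defined by orb.parms and described in our paper, and the function name and arguments are hypothetical.

```python
def biased_weighted_mean(shifts, weights, biases):
    """Weighted mean where each weight is first scaled by a bias multiplier.

    Generic illustration only; orb's real weighting scheme is not shown here.
    """
    if not (len(shifts) == len(weights) == len(biases)):
        raise ValueError("shifts, weights and biases must have equal length")
    scaled = [w * b for w, b in zip(weights, biases)]
    total = sum(scaled)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(s * w for s, w in zip(shifts, scaled)) / total
```

With two equally weighted shifts of 4.0 and 5.0 ppm the result lies midway between them; raising the multiplier of the first set to 3 pulls the result toward 4.0. The same kind of movement is what to look for in the verbose output file after changing a bias.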


Advanced Usage

  1. The fonts and colors for the orb gui are set in your "$HOME/.CamraDefaults" file. The orb program automatically copies the default one from $INSTALL/lib/orb/app-defaults if you do not have one. Although you could edit this "$HOME/.CamraDefaults" in a text editor, people seem to have more fun by running the colortool or fonttool which are available in the "Options" menu of the orb gui.

  2. The "orb.parms" file contains all the parameters which determine how orb makes its predictions. Items such as amino acid scoring matrices, weightings for factors such as global and local homology, and the location of the chemical shift database are all set in this file. Although the default settings should be reasonable, you can try your own by copying the default parameters to the current directory and editing the copy. You can find the default parameters in:
    	$INSTALL/lib/orb/orb.parms
    
    The orb.parms file is fairly well documented; read it to understand all the variables used in a prediction.

  3. If you are not satisfied with how the homologous sequences and/or database information are weighted, you can change the parameter file, but explaining how to do so is currently a fairly complex issue.
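
Copying the default parameters, as described in item 2 above, can be done by hand or with a short script. The Python sketch below copies orb.parms into a working directory and refuses to overwrite a copy that is already there. The install directory is passed in explicitly, because whether $INSTALL exists as an environment variable on your system is an assumption this document does not settle. The function name copy_default_parms is hypothetical.

```python
import os
import shutil

def copy_default_parms(install_dir, dest_dir="."):
    """Copy lib/orb/orb.parms from install_dir unless a local copy exists.

    install_dir stands for the directory written as $INSTALL in this
    document. Returns True if a copy was made, False otherwise.
    """
    source = os.path.join(install_dir, "lib", "orb", "orb.parms")
    target = os.path.join(dest_dir, "orb.parms")
    if os.path.exists(target):
        return False  # keep the user's existing local parameters
    shutil.copy(source, target)
    return True
```

Keeping the existing local file is a deliberate choice: an edited orb.parms may hold hours of trial and error, so the sketch never replaces it silently.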


Last modified: June 10, 1997

Robert Boyko - robert.boyko@ualberta.ca