The commands are roughly divided into five groups:
1) Database inspection commands.
2) Scanning (database search) commands.
3) Logical operations (making relations).
4) Evaluation of results (listings, graphics).
5) Other commands.
Since there are many options, only a limited set is initially active in this menu. Use the command MORE to activate all options.
The main principle of this database is that you search for fixed-length stretches of amino acids whose stored parameters satisfy certain relations. The stretches found are stored in groups. These groups can be combined using logical operations like AND, OR, XOR, etc. They can also be visualized on the graphics screen. The length of the stretches searched for can be set from 5 to 35 (with the SETLEN command).
The experienced user will see that there is some overlap between the groups described in this chapter and the DG*** groups described in the structure fragment chapter.
You will then be asked to give the `mismatch` parameter. This mismatch parameter tells WHAT IF how many positions in each hit are maximally allowed to be different from what was requested.
After the very fast search WHAT IF will tell you how many hits it found and ask you in which group you want to store these hits. The suggested default is just the first free group available. Be aware that there are at present only 10 groups allowed to be active at one time. If you do not want to store the hits, just give group number 0.
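The mismatch rule and the group bookkeeping described above can be sketched as follows. This is only an illustration and not WHAT IF's own code: the function names, the one-predicate-per-position representation and the dictionary of groups are assumptions made for the example. Only the meaning of the mismatch parameter, the limit of 10 groups and the special group number 0 are taken from the text.

```python
MAX_GROUPS = 10  # the text states that at most 10 groups can be active


def is_hit(window, requirements, mismatch):
    """Return True if at most `mismatch` positions violate their requirement.

    `requirements` holds one predicate per position (hypothetical representation).
    """
    failures = sum(1 for value, ok in zip(window, requirements) if not ok(value))
    return failures <= mismatch


def scan(sequence, requirements, mismatch):
    """Slide a fixed-length window over `sequence`; return start indices of hits."""
    length = len(requirements)
    return [
        start
        for start in range(len(sequence) - length + 1)
        if is_hit(sequence[start:start + length], requirements, mismatch)
    ]


def first_free_group(groups):
    """Suggest the lowest unused group number in 1..MAX_GROUPS, or None if full."""
    for number in range(1, MAX_GROUPS + 1):
        if number not in groups:
            return number
    return None


def store_hits(groups, number, hits):
    """Store hits under `number`; group number 0 means 'do not store'."""
    if number == 0:
        return
    if not 1 <= number <= MAX_GROUPS:
        raise ValueError("group number must be between 0 and 10")
    groups[number] = list(hits)
```

With a three-residue request C, A, C and the toy sequence CACGCTC, a mismatch of 0 accepts only the window starting at index 0, while a mismatch of 1 also accepts the windows CGC and CTC.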
Examples: To find all C-terminal helical stretches one would combine a HELSHT run with SCNPOS with absolute range -1 till -2 (leaves one residue free at the end), and fractional range 0.0 till 1.0.
To find all Cysteines in C-terminal domains, one would combine a SEQUEN search with a SCNPOS search with absolute range 80 till 1000, and fractional range 0.5 till 1.0.
After the very fast search WHAT IF will tell you how many hits it found and ask you in which group you want to store these hits. The suggested default is just the first free group available. Be aware that there are at present only 10 groups allowed to be active at one time. If you do not want to store the hits, just give group number 0.
   V  L  I  M  F  W  Y  G  A  P  S  T  C  H  R  K  Q  E  N  D
V  5  2  2  1  0 -1  0 -1  0 -1 -1  0 -2 -1 -1 -1 -1 -1 -2 -2
L  2  5  2  3  2  0  0 -2 -1 -2 -1  0 -3 -1 -1 -1  0 -2 -2 -2
I  2  2  5  2  0  0  0 -2 -1 -2 -1  0 -2 -1 -2 -2 -3 -2 -2 -3
M  1  3  2  5  2 -2 -1 -2  0 -2 -1  0  0  0 -2 -2  0 -2 -1 -1
F  0  2  0  2  6  3  3 -3 -2 -2 -1 -2 -3  1 -2 -3 -3 -3 -3 -2
W -1  0  0 -2  3  6  3 -2 -2 -3  0 -1 -1  0  0 -2 -1 -2 -3 -3
Y  0  0  0 -1  3  3  6 -3 -2 -3  0 -2 -2  1 -1 -2 -2 -1 -1 -2
G -1 -2 -2 -2 -3 -2 -3  5  0  0  0 -1 -2 -1  0  0 -1  0  0  0
A  0 -1 -1  0 -2 -2 -2  0  5  1  1  0 -2  0 -1  0  0  1  0  0
P -1 -2 -2 -2 -2 -3 -3  0  1  5  0  0 -3  0  0  0  0  1 -2  0
S -1 -1 -1 -1 -1  0  0  0  1  0  5  2 -1  0  1  0  1  1  2  0
T  0  0  0  0 -2 -1 -2 -1  0  0  2  5 -1  1  0  0  0  1  0  0
C -2 -3 -2  0 -3 -1 -2 -2 -2 -3 -1 -1  6  0 -2 -3 -3 -3 -2 -2
H -1 -1 -1  0  1  0  1 -1  0  0  0  1  0  5  2  1  1 -1  1  1
R -1 -1 -2 -2 -2  0 -1  0 -1  0  1  0 -2  2  5  2  2  0  0 -2
K -1 -1 -2 -2 -3 -2 -2  0  0  0  0  0 -3  1  2  5  1  1  1  0
Q -1  0 -3  0 -3 -1 -2 -1  0  0  1  0 -3  1  2  1  5  2  1  1
E -1 -2 -2 -2 -3 -2 -1  0  1  1  1  1 -3 -1  0  1  2  5  1  2
N -2 -2 -2 -1 -3 -3 -1  0  0 -2  2  0 -2  1  0  1  1  1  5  2
D -2 -2 -3 -1 -2 -3 -2  0  0  0  0  0 -2  1 -2  0  1  2  2  5

This means that if you request an aspartic acid at a certain position in the search string, and say that the score should be at least 2 points, then glutamic acid, asparagine and aspartic acid are acceptable at this position.
You will be prompted for the average Dayhoff scoring value first. This is simply the average of the scores for all positions in the search string. Thereafter you will one by one be prompted for the residue at each position in the search string, and its minimal Dayhoff score. If a certain residue is allowed to be anything, just give any residue and -100 or something very negative for the requested minimal score.
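The per-position test can be illustrated with a small sketch. It uses only row D of the scoring matrix, transcribed from the matrix in this section; the names `RESIDUES`, `SCORES` and `acceptable` are hypothetical and not part of WHAT IF.

```python
# Column order of the scoring matrix as given in this section.
RESIDUES = "VLIMFWYGAPSTCHRKQEND"

# Row D of the matrix, in the column order of RESIDUES.
ROW_D = [-2, -2, -3, -1, -2, -3, -2, 0, 0, 0, 0, 0, -2, 1, -2, 0, 1, 2, 2, 5]

# Only one row is filled in here; the other rows can be added the same way.
SCORES = {"D": dict(zip(RESIDUES, ROW_D))}


def acceptable(requested, min_score):
    """Return the residues whose score against `requested` is at least `min_score`."""
    row = SCORES[requested]
    return {residue for residue, score in row.items() if score >= min_score}
```

Calling `acceptable("D", 2)` gives the set D, E and N, in line with the example in the text, and a very negative minimal score such as -100 accepts every residue.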
You will then be asked to give the `mismatch` parameter. This mismatch parameter tells WHAT IF how many positions in each hit are maximally allowed to be different from what was requested.
After the very fast search WHAT IF will tell you how many hits it found and ask you in which group you want to store these hits. The suggested default is just the first free group available. Be aware that there are at present only 10 groups allowed to be active at one time. If you do not want to store the hits, just give group number 0.
I IP II IIP VIA VIB VIII IV
according to the nomenclature of Wilmot and Thornton in J. Mol. Biol. (1988) 203, 221-232 (where P stands for ' or prime).
The following limitations will now be placed on the phi and psi angles of the residues I+1 and I+2:
                        PHI1 PSI1 PHI2 PSI2
I    : TYPE I    TURN (  -60  -30  -90    0)
IP   : TYPE I'   TURN (   60   30   90    0)
II   : TYPE II   TURN (  -60  120   80    0)
IIP  : TYPE II'  TURN (   60 -120  -80    0)
VIA  : TYPE VIA  TURN (  -60  120  -90    0)
VIB  : TYPE VIB  TURN ( -120  120  -60    0)
VIII : TYPE VIII TURN (  -60  -30 -120  120)
IV   : TYPE IV   TURN (ALL OTHERS)
Type IV is only there for completeness. Using it means that you
get nearly the whole database as result.
A turn consists of four residues. If your search fragment is longer than four residues, you will be asked to indicate which residue in the search fragment should be the first of the four turn positions. Obviously, this position in the fragment where the turn starts cannot be one of the last three positions in the fragment.
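A rough sketch of how such a turn assignment could work is given below. The ideal angles are those of the table above; the tolerance of 30 degrees around each ideal angle is an assumption for the example, since the text does not state which limits WHAT IF applies, and all function names are hypothetical.

```python
# Ideal (PHI1, PSI1, PHI2, PSI2) per turn type, taken from the table above.
IDEAL_TURNS = {
    "I": (-60, -30, -90, 0),
    "IP": (60, 30, 90, 0),
    "II": (-60, 120, 80, 0),
    "IIP": (60, -120, -80, 0),
    "VIA": (-60, 120, -90, 0),
    "VIB": (-120, 120, -60, 0),
    "VIII": (-60, -30, -120, 120),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def classify_turn(phi1, psi1, phi2, psi2, tol=30.0):
    """Return the first turn type whose ideal angles all lie within `tol` (assumed)."""
    observed = (phi1, psi1, phi2, psi2)
    for name, ideal in IDEAL_TURNS.items():
        if all(angle_diff(o, i) <= tol for o, i in zip(observed, ideal)):
            return name
    return "IV"  # all others


def valid_turn_starts(fragment_length):
    """Positions (1-based) where a four-residue turn can start in the fragment."""
    return list(range(1, fragment_length - 2))
```

For a fragment of 7 residues, `valid_turn_starts(7)` returns positions 1 to 4, which excludes the last three positions as required. Note that with this tolerance the regions of types VIA and VIB touch, so the order of the dictionary decides borderline cases.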
After the very fast search WHAT IF will tell you how many hits it found and ask you in which group you want to store these hits. The suggested default is just the first free group available. Be aware that there are at present only 10 groups allowed to be active at one time. If you do not want to store the hits, just give group number 0.
WHAT IF keeps the hydrophobic moments of all proteins in the database available on-line, calculated for a repeat angle of 100 degrees and a window width of 7.
The command SCNHYD will make WHAT IF loop over the length of the search string, and for every position prompt you for the limits on the hydrophobic moment at this position. Here you have to give two values, the lower and the upper limit. All values are allowed. Normal values fall in the range 0.0 to 0.5.
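For readers who want to see what such a value looks like, here is a minimal sketch of a hydrophobic moment calculation. The text does not say which formula or hydrophobicity scale WHAT IF uses, so the Eisenberg-style formula, the normalisation by the window width and the function names are all assumptions; the hydrophobicity values must be supplied by the caller.

```python
import math


def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Mean hydrophobic moment of one window (window width = number of values).

    Assumed formula: length of the vector sum of the hydrophobicities, each
    rotated by `angle_deg` relative to the previous residue, divided by the
    window width.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum) / len(hydrophobicities)


def within_limits(moment, lower, upper):
    """Check a moment against the two limits given for one position."""
    return lower <= moment <= upper
```

A window of seven values with the default angle corresponds to the repeat angle and window width mentioned above.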
You will then be asked to give the `mismatch` parameter. This mismatch parameter tells WHAT IF how many positions in each hit are maximally allowed to be different from what was requested.
After the very fast search WHAT IF will tell you how many hits it found and ask you in which group you want to store these hits. The suggested default is just the first free group available. Be aware that there are at present only 10 groups allowed to be active at one time. If you do not want to store the hits, just give group number 0.
In order to use this option, you have to type a lot. For every position in the search string, you will be prompted for the amino acid(s). If you give one amino acid, you can use the subsequent question about which atoms to use to specify individual atoms. If you give multiple amino acids, you can only give `SIDE-CHAINS` or `BACK-BONE` when asked for the atoms. After the amino acid is known, the same questions will be repeated for the residues with which there should be a contact. The same kind of answers as for the residues searched for should be given. The last question per position in the search string is the contact distance. This is the distance between the atom centers minus the two Van der Waals radii. So for atoms that just touch, give zero. Since the database does not contain hits where this distance is larger than 1.5 Angstrom, it is useless (but not fatal) to give very large numbers. You can also give negative distances to detect `bumps`.
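The contact distance defined above can be written down in a few lines. This is a sketch only: the function names are hypothetical, the Van der Waals radii are passed in by the caller because the text does not list the radii WHAT IF uses, and treating the value typed by the user as an upper limit is an assumption.

```python
import math

# The text states that hits with a larger distance are not in the database.
MAX_STORED_DISTANCE = 1.5  # Angstrom


def contact_distance(center_a, center_b, radius_a, radius_b):
    """Distance between the atom centers minus the two Van der Waals radii."""
    return math.dist(center_a, center_b) - radius_a - radius_b


def in_contact(center_a, center_b, radius_a, radius_b, cutoff):
    """True if the contact distance does not exceed `cutoff` (assumed comparison)."""
    return contact_distance(center_a, center_b, radius_a, radius_b) <= cutoff
```

Two atoms with radius 1.7 whose centers are 3.4 Angstrom apart have a contact distance of zero, i.e. they just touch; at 3.0 Angstrom the distance is negative, which is what the text calls a bump.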
The last thing you will be prompted for is the database range. Just hit return to use the whole range. If you take the full database (approximately 100 proteins), the average search will take roughly 20 seconds CPU on a VAX workstation.
After the search WHAT IF will tell you how many hits it found and ask you in which group you want to store these hits. The suggested default is just the first free group available. Be aware that there are at present only 10 groups allowed to be active at one time. If you do not want to store the hits, just give group number 0.
The option NEACON makes WHAT IF prompt you for the position in the fragment of the first residue. It will then ask for the sequence distance of the second residue. If, for example, you answer these questions with 2 and 8 respectively, fragments will be searched for that have a contact between the residues 2 and 10.
You will be asked to give a contact type and a cutoff distance. The allowed contact types are BB, BS, SB or SS, in which the first B or S stands for the first residue in the fragment, and the second B or S for the second residue in the fragment. The cutoff distance is, as usual, the largest distance between the Van der Waals surfaces at which two atoms are still said to be in contact.
E.g. if you use 2 and 8 respectively for the residues, and SB for the contact type, fragments will be searched for that have a side chain atom in residue 2 in the fragment in contact with a backbone atom in residue 10.
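The way the three answers combine can be made explicit with a tiny sketch. The function name and the returned tuple layout are hypothetical; the arithmetic and the meaning of B and S follow the text.

```python
CONTACT_TYPES = {"BB", "BS", "SB", "SS"}
PART_NAMES = {"B": "backbone", "S": "side chain"}


def describe_neacon(first_position, sequence_distance, contact_type):
    """Return ((position, part), (position, part)) for the two residues in contact."""
    if contact_type not in CONTACT_TYPES:
        raise ValueError("contact type must be BB, BS, SB or SS")
    second_position = first_position + sequence_distance
    return (
        (first_position, PART_NAMES[contact_type[0]]),
        (second_position, PART_NAMES[contact_type[1]]),
    )
```

For the answers 2, 8 and SB this returns the side chain of residue 2 and the backbone of residue 10, matching the example above.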
* (asterisk) if this residue is completely free.
-2 if this residue should not be a cysteine.
-1 if this residue should be an unpaired cysteine.
N if this residue should be a paired cysteine.
N can be zero if you don't care how far down in the sequence the partner cysteine should be. If you give, for example, 4, that means that you are going to search for cysteines that are paired with a cysteine that is four residues further in the same sequence. So, this is one of the exceptions where zero is valid input...
These groups can then be combined by means of logical operations.
These operations are AND, OR, NOT and XOR. The user has to type SANDOR to be able to use one of these options.
The user has several options available to look in or at groups. These options are SHOGRP (shows all groups made) and SHOHIT (shows hits in a group). The option INIGRP is also available to clean groups. SETLEN can be used to vary the length of the stretches searched for.
AND OR NOT XOR
These operations do the following:
AND creates a new group consisting of all hits that both groups on which it operates have in common.
OR creates a new group which consists of all hits that are present in at least one of the two groups on which it operates.
NOT creates a new group consisting of all hits that are present in the first of the two groups on which it operates, but not in the second.
XOR creates a new group which consists of all hits that are present in one of the two groups on which it operates, but not in the other. (I don't think that this operator will be used very often).
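The four behaviours described above map directly onto set operations. The sketch below is an illustration with hypothetical names, not the program's own data structure. It identifies a hit by its protein and the position of its first residue, which is how this chapter says hits are compared. The functions are named after what they compute rather than after a WHAT IF command.

```python
def in_both(group_a, group_b):
    """Hits that the two groups have in common."""
    return group_a & group_b


def in_either(group_a, group_b):
    """Hits present in at least one of the two groups."""
    return group_a | group_b


def in_first_only(group_a, group_b):
    """Hits present in the first group but not in the second."""
    return group_a - group_b


def in_exactly_one(group_a, group_b):
    """Hits present in one of the two groups but not in the other."""
    return group_a ^ group_b


# Hypothetical hits: (protein identifier, position of the first residue).
group_1 = {("1abc", 10), ("1abc", 25), ("2xyz", 7)}
group_2 = {("1abc", 25), ("2xyz", 7), ("3def", 40)}
```

With these two example groups, `in_both` keeps two hits, `in_first_only` keeps only the hit at position 10 of the first protein, and `in_exactly_one` keeps that hit plus the one that occurs only in the second group.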
WARNING! This option only works as expected when the middle residues of all hits in a group are the same amino acid type (all alanines, or all cysteines, etc.).
The command SCNGRN makes WHAT IF prompt you for the number of a group. Thereafter you will be asked how many hits you want to see. At present you can give as many hits as you wish, but strange things will happen if all these hits together have more than 2500 amino acids in them. These hits will be coloured from blue to red as a function of their position in the hit list, all superimposed on the first structure (which will sometimes look strange), centered at the present center of the screen, and placed in a MOL-item. In order to superimpose the structures you will be asked which atoms to use for superpositioning. The parameter setting menu can be used to determine which parts of the hit and its environment will be shown.
The option needs a lot of input. You will be prompted for the atoms in the center residue that should make the contact, for the atoms in the central residue that should be used for superpositioning, for the neighboring residues, and for the atoms in the neighbors that make the contact. This seems somewhat redundant because you typed all this already for the previously run SCNCON option, but I have great plans for these options in the future, and when those are ready, you will understand why.
You will (after roughly 10 seconds CPU on a VAX workstation) be prompted for the number of the MOL-object, and the name of the MOL-item.
After all the input, all hits will be shown at the screen, with their entire environments present.
This option is not yet tested.
It is possible to do logical operations on groups that have stretches of different length in them. The program only looks at the first amino acid. If these are the same, meaning that it is the same amino acid at the same location in the same protein, then WHAT IF considers those stretches to be the same.
BIG  TRP + TYR + PHE + HIS + ARG + LYS + MET
SML  GLY + ALA + SER
POS  ARG + LYS
NEG  GLU + ASP
POL  ARG + LYS + GLU + ASP + GLN + ASN + HIS
If you want more, you should ask Gert Vriend, but ask very friendly, because it means at least an hour of work.
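The residue classes listed above amount to a small lookup table. The sketch below copies the five classes from the text; the dictionary and the `expand` helper are hypothetical names used only to show how a class name stands for a set of residues.

```python
RESIDUE_CLASSES = {
    "BIG": frozenset({"TRP", "TYR", "PHE", "HIS", "ARG", "LYS", "MET"}),
    "SML": frozenset({"GLY", "ALA", "SER"}),
    "POS": frozenset({"ARG", "LYS"}),
    "NEG": frozenset({"GLU", "ASP"}),
    "POL": frozenset({"ARG", "LYS", "GLU", "ASP", "GLN", "ASN", "HIS"}),
}


def expand(name):
    """Return the residues behind a class name, or the name itself as a one-element set."""
    return RESIDUE_CLASSES.get(name, frozenset({name}))
```

A plain residue name such as CYS is returned unchanged as a single-element set, so class names and residue names can be mixed in one request.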
WHAT IF will now loop over all residues in the database that are of the same type as the central residue given. It will for each of these database hits superimpose (only using the atoms marked for superimposing) this residue on the central one, and apply the superposition transformation on the whole molecule in which the database hit resides. If there is now (in the rotated and translated database protein) a residue of the same type as the given neighbour residue approximately at the same place in space as the indicated neighbour, then this pair will be marked as a hit.
Don't worry about the stupidity of this algorithm. In reality it works a little bit differently, but that is way too difficult to explain.
All hits found are stored in a group, sent to the MOVIE area, and upon request sent to a MOL-item. This is because the neighboring information is not stored in the group, so if you later want to look at this contact group again, you will have to redo the whole option.
'Approximately being at the same place' is defined as the average distance between the equivalent atoms being less than a certain cutoff. The default value is 4 Angstrom. Use the PARAMS option to change this cutoff.
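The criterion can be sketched as follows, assuming the two residues have already been brought into the same frame by the superposition and that their equivalent atoms are listed in the same order. The function names are hypothetical; the 4 Angstrom default is the value given in the text.

```python
import math

DEFAULT_CUTOFF = 4.0  # Angstrom, the default mentioned in the text


def average_distance(atoms_a, atoms_b):
    """Average distance between equivalent atoms (same order in both lists)."""
    if not atoms_a or len(atoms_a) != len(atoms_b):
        raise ValueError("need two non-empty atom lists of equal length")
    total = sum(math.dist(a, b) for a, b in zip(atoms_a, atoms_b))
    return total / len(atoms_a)


def same_place(atoms_a, atoms_b, cutoff=DEFAULT_CUTOFF):
    """True if the residues are 'approximately at the same place'."""
    return average_distance(atoms_a, atoms_b) < cutoff
```

Two residues whose equivalent atoms are all 3 Angstrom apart pass the default cutoff, while a 5 Angstrom displacement does not.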
You will be prompted for a range and a minimal score. All stretches in that range will be compared with all stretches in the database. Every time a hit is found that, when compared with the stretch in your molecule, shows no mutations that score below the Dayhoff cutoff given, one is added to the count of the protein in which that hit was found. At the end a list with the number of hits per protein is shown.
HITBBF determines what to show of the hit itself.
0 = Show the three middle amino acids of the hit completely
1 = Show the whole hit completely: all amino acids, all atoms
2 = Show backbone of full hit plus side chain of center
3 = Draw only the tagged atoms for the center of hit
4 = Draw only side chain of middle amino acid of hit
HITBBF determines what to show of the contact partner.
0 = Show the residue completely
1 = Show only the side chain
2 = Show only the main chain
3 = Draw only the tagged atoms