CMSB tutorial 3: Protein Structures

David Gilbert

The aim of this lab is to give you practical experience with the concepts from the lecture on protein structures.

Resources

Some resources for protein structures are here.

Exercises

  1. Explore the following examples from the lecture, using RasMol. Specifically, learn how to display the protein using different Display and Colour options (try 'Colour by Structure').

    Initialisation for TOPS activities:

  2. Look up the same protein in both CATH and SCOP, for example 2bop:
    http://www.biochem.ucl.ac.uk/bsm/cath/
    http://scop.mrc-lmb.cam.ac.uk/scop/

  3. Look Up and View a Cartoon of 2bopA0

    From the buttons on the left-hand side of the TOPS project website, choose the Cartoon Atlas button and fill in the Protein code as 2bop, the Chain as A, and the Domain Number as 0. You are not actually required to fill in the last two, as the lookup service will list all the chains and domains of a given protein. Click on the view image to see the cartoon for this structure.

    If you are prepared to wait, you can also generate the same cartoon by submitting the PDB file of 2bop that you downloaded earlier. Go to the Cartoon Generation page of the TOPS site and fill in the form. You don't have to provide domain definitions for 2bop, since it is a single-chain domain; however, if you submit a larger structure, you may have to define the boundaries of its domains.
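Defining a domain boundary amounts to choosing a chain and a residue range. The sketch below is not part of the TOPS site; it shows one possible way to cut such a region out of a PDB file before submitting it. It relies only on the fixed-column layout of ATOM/HETATM records (chain identifier in column 22, residue number in columns 23-26). The function name `extract_region` and its parameters are made up for illustration.

```python
def extract_region(pdb_lines, chain, first_res=None, last_res=None):
    """Return the ATOM/HETATM lines of one chain, optionally limited to a residue range.

    Hypothetical helper: pdb_lines is any iterable of PDB-format text lines.
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip headers, remarks, TER records and so on
        if len(line) < 26 or line[21] != chain:
            continue  # column 22 (index 21) holds the chain identifier
        try:
            resnum = int(line[22:26])  # columns 23-26 hold the residue number
        except ValueError:
            continue  # malformed residue number, ignore the record
        if first_res is not None and resnum < first_res:
            continue
        if last_res is not None and resnum > last_res:
            continue
        kept.append(line)
    return kept


# Possible usage (file names are assumptions):
#   with open("2bop.pdb") as handle:
#       chain_a = extract_region(handle, "A")
#   with open("2bop_A.pdb", "w") as out:
#       out.writelines(chain_a)
```

Note that a chain is only a domain when the chain contains a single domain, as is the case for 2bop; for multi-domain chains the residue range would have to come from CATH or SCOP.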

  4. Search for Structure Patterns

    Skip over Queries, which does not work at present, and go directly to the Pattern Search, which is actually linked to a server here in Glasgow. You should see a number of images of structures and a list of names. Clicking a name in the list should highlight the structure. These are not necessarily 'real' structures - that is to say, instances of these exact patterns may not exist. However, they have been chosen as very common patterns which are found as substructures in real proteins.

    You should be able to see the very close resemblance between the cartoon of 2bopA0 and one of the 'classic' patterns. If you click on 'Plait' in the list, that pattern should be highlighted. You can search for other examples of the plait pattern by clicking the match button. The result should look like this:

    The yellow-highlighted triangles in the cartoons for each matching structure are the positions where the pattern matches. Helices are not highlighted as they are not part of the edges of the pattern and are not as important in the match. In fact, you may notice that there are a lot of extra SSEs in these matches that are not in the pattern. This is because the default set of structures to search through is the T-reps (the topology representatives), so what we have really asked is "which topologies contain the plait as a subtopology".

    If you go back to the search form, you can choose the CATH nreps (or, equivalently, the SCOP families) and find plaits within this subset. There will be more hits.
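The question "which topologies contain the pattern as a subtopology" can be pictured as an ordered subgraph search. The following is a simplified sketch, not the algorithm used by the TOPS server: it assumes a topology is written as a string of SSE symbols followed by edges of the form i:jT (as in the Top7a string in exercise 8), that N counts as position 0, and that a match must keep the SSEs in sequence order while allowing extra SSEs in between. The pattern `NEeC 1:2A` is a made-up toy, not the real plait.

```python
import re


def parse_tops(text):
    """Split a TOPS-style string into (SSE symbols, set of (i, j, type) edges)."""
    parts = text.split(None, 1)
    vertices = parts[0]
    edge_text = parts[1] if len(parts) > 1 else ""
    edges = {(int(i), int(j), t)
             for i, j, t in re.findall(r"(\d+):(\d+)([A-Z])", edge_text)}
    return vertices, edges


def find_matches(pattern, target):
    """Return every order-preserving placement of the pattern inside the target.

    Each match is a tuple giving the target position of each pattern SSE.
    """
    p_vertices, p_edges = parse_tops(pattern)
    t_vertices, t_edges = parse_tops(target)
    matches = []

    def extend(mapping, start):
        k = len(mapping)  # index of the next pattern SSE to place
        if k == len(p_vertices):
            matches.append(tuple(mapping))
            return
        for pos in range(start, len(t_vertices)):
            if t_vertices[pos] != p_vertices[k]:
                continue  # SSE type or direction differs
            candidate = mapping + [pos]
            # Check every pattern edge whose endpoints are now both placed.
            if all((candidate[i], candidate[j], t) in t_edges
                   for i, j, t in p_edges if max(i, j) == k):
                extend(candidate, pos + 1)

    extend([], 0)
    return matches


top7a = "NeEhEhEeC 1:2A1:4A2:4R4:6R4:7A6:7A"
print(find_matches("NEeC 1:2A", top7a))
```

The SSEs skipped between matched positions play the role of the extra, unhighlighted SSEs seen in the server's results.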

  5. Compare a Structure to the Database

    Unfortunately, as you can see, the results of a match are not sorted by how similar they are to the pattern. To get a ranked list, we compare the structure of interest with each member of a set of other structures: we find the common pattern between the two and compute the compression of the pair by that pattern. You may already be at the Structure Comparison page - if so, well done - if not, go there. Select the 2bop.pdb file to upload and select a subset of the database. To get the screenshot below, choose the SCOP superfamilies (it has 2bop as the top hit!).

    The top hits here are much more similar to the structure shown in the top right than the unsorted matches from the pattern search were. The compression of 2bop with itself is (thankfully) the maximum, 1. The further you scroll down the list, the smaller the compression and the more extra bits there are. Note that there are two different effects that make the compression smaller - one is the size of the common pattern and the other is the extra SSEs in the examples.

    Although this is clearer with the CATH hreps, it should be obvious that many of the cartoons for the top hits are quite similar to that for 2bop.
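To see how the two effects interact, here is a toy score. It is NOT the compression formula used by the TOPS server, which is not given here; it is just an assumed ratio that behaves in the way described above: it equals 1 when a structure is compared with itself, and it drops both when the common pattern is small and when either structure has extra SSEs. The name `toy_compression` and its arguments are hypothetical.

```python
def toy_compression(pattern_size, size_a, size_b):
    """Toy similarity score in [0, 1]; an assumption, not the TOPS formula.

    pattern_size: number of SSEs in the common pattern
    size_a, size_b: number of SSEs in each of the two structures
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("structures must contain at least one SSE")
    if not 0 <= pattern_size <= min(size_a, size_b):
        raise ValueError("the common pattern cannot be larger than either structure")
    # The pattern is shared by both structures, so it is counted twice.
    return 2 * pattern_size / (size_a + size_b)


print(toy_compression(7, 7, 7))   # a structure compared with itself
print(toy_compression(4, 7, 7))   # smaller common pattern
print(toy_compression(7, 7, 12))  # same pattern, but extra SSEs in one structure
```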

  6. If you wish, download another PDB file and then submit it to the TOPS protein structure comparison server.
    E.g. Notes:
    (1) TOPS does not work well on (mostly) all-alpha structures. You should check in CATH or SCOP that your structure of interest is all-beta or alpha-beta.
    (2) You should submit only one domain of a structure. This can be done by

  7. Now see how other protein structure comparison servers behave with the same input files. Note that some of them will have pre-computed results for structures which are in the PDB... Example web services are:

  8. Protein structure design -- invent your own structures! The web service for this is at http://balabio.dcs.gla.ac.uk/tops/advanced.html
    1. Submit the Top7a structure (a claimed synthetic novel fold) NeEhEhEeC 1:2A1:4A2:4R4:6R4:7A6:7A to the Structure design service -- is it a unique fold?
    2. Make up some more 'structures' based on the folds from slides of the lecture and submit them to the compare and match operations of the Structure design service in order to see what kind of results you get -- are they 'expected'?
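When making up your own structures it helps to check the string before submitting it. The sketch below parses a string such as the Top7a one and summarises it. It rests on assumptions that are not stated in this tutorial: that e/E are strands and h/H helices (case giving the direction), that N and C are the termini with N at position 0, and that each edge is written i:jT with a one-letter type T. The function names are made up, and the service itself may accept or reject strings differently.

```python
import re
from collections import Counter


def parse_tops_string(text):
    """Return (SSE symbols, list of (i, j, type) edges) for a TOPS-style string."""
    vertex_part, _, edge_part = text.strip().partition(" ")
    edges = [(int(i), int(j), kind)
             for i, j, kind in re.findall(r"(\d+):(\d+)([A-Za-z])", edge_part)]
    for i, j, _ in edges:
        # Every edge must point at an existing position in the SSE string.
        if not (0 <= i < len(vertex_part) and 0 <= j < len(vertex_part)):
            raise ValueError(f"edge {i}:{j} refers to a missing SSE")
    return vertex_part, edges


def summarise(text):
    """Count SSEs and edge types, and show which symbols each edge joins."""
    vertices, edges = parse_tops_string(text)
    return {
        "strands": sum(c in "eE" for c in vertices),
        "helices": sum(c in "hH" for c in vertices),
        "edge_types": dict(Counter(kind for _, _, kind in edges)),
        "edge_ends": [(vertices[i], vertices[j], kind) for i, j, kind in edges],
    }


top7a = "NeEhEhEeC 1:2A1:4A2:4R4:6R4:7A6:7A"
for key, value in summarise(top7a).items():
    print(key, value)
```

Under the position-0 assumption every edge in the Top7a string joins two strands, which is a useful sanity check to repeat on your own strings.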