ExUniProt-answers

Answers to "Exercise: Protein databases"

The numbers were obtained from UniProt on February 18, 2013.

Simple text mining

QUESTION 1:

  1. How many hits do you find?
    1776
  2. How many hits are from Swiss-Prot? (tip: Click on "Show only reviewed")
    985
  3. Can you identify the correct hit (i.e. see which one is actually human insulin and not something else)?
    It's P01308 / INS_HUMAN (the very first hit).

QUESTION 2: How many hits are now left (still only in Swiss-Prot)?
741

QUESTION 3: How many hits are now left (still only in Swiss-Prot)?
58

QUESTION 4: How many hits are now left?
29

QUESTION 5:

  1. How did you do this?
    By adding "NOT name:receptor" to the query box.
  2. How many hits are now left?
    19
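
As a rough illustration of the answer to part 1, the exclusion amounts to appending one clause to whatever is already in the query box. The helper name `exclude_term` and the starting query `insulin` are hypothetical; only the `NOT name:receptor` clause comes from the exercise.

```python
def exclude_term(query: str, field: str, term: str) -> str:
    """Append a 'NOT field:term' clause to an existing query string."""
    return f"{query.strip()} NOT {field}:{term}"


# Hypothetical starting query; the query actually built up in the
# exercise may contain more terms than this.
current_query = "insulin"
print(exclude_term(current_query, "name", "receptor"))
# prints: insulin NOT name:receptor
```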

The content of Swiss-Prot

QUESTION 6:

  1. How many references are there?
    35
  2. Why do you think insulin is such a highly investigated protein?
    Because it is linked to a common and serious disease (diabetes) and used as a drug.

QUESTION 7:

  1. Where do you find insulin?
    It is secreted from the cell. This is indicated under "Subcellular location" in the General annotation (Comments) section and under "Cellular component" in the Ontologies section.
  2. Why do you think it is found there?
    Because it is a hormone: it has to travel through the bloodstream to influence other cells.

QUESTION 8: How long are the signal peptide and the propeptide, respectively?
24 and 31 amino acids.
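
The two lengths follow from the inclusive start and end positions of each feature. The sketch below assumes the coordinates 1-24 for the signal peptide and 57-87 for the propeptide, as listed in the feature table of P01308; these coordinates are not stated in the exercise, so check them against the entry. The helper `feature_length` is a hypothetical name.

```python
def feature_length(start: int, end: int) -> int:
    """Length of a feature given inclusive, 1-based start and end positions."""
    return end - start + 1


# Assumed coordinates from the P01308 feature table (verify against the entry).
features = {
    "signal peptide": (1, 24),
    "propeptide": (57, 87),
}

for name, (start, end) in features.items():
    print(f"{name}: {feature_length(start, end)} amino acids")
```

Both ends are counted, which is why the propeptide is 87 - 57 + 1 = 31 residues long and not 30.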

QUESTION 9: Which positions are in β-sheet conformation in insulin?
Positions 39-41, 46-50 and 74-76.

Note: As some of you may have noticed, positions 74-76 are within the propeptide and cannot participate in a β-sheet in mature insulin. Insulin in its biologically active state is a homodimer (where each subunit consists of an A- and a B-chain), and the β-sheet is formed between the two subunits, with positions 46-50 from each B-chain forming the strands.
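
The observation in the note can be checked by looking up which region of the precursor each strand falls in. This is only a sketch: the region boundaries are assumed from the P01308 feature table and are not given in the exercise, and `region_of` is a hypothetical helper.

```python
# Assumed inclusive, 1-based regions of the insulin precursor (P01308).
REGIONS = [
    ("signal peptide", 1, 24),
    ("B chain", 25, 54),
    ("propeptide", 57, 87),
    ("A chain", 90, 110),
]


def region_of(start: int, end: int) -> str:
    """Return the region that fully contains the span, or 'unassigned'."""
    for name, region_start, region_end in REGIONS:
        if region_start <= start and end <= region_end:
            return name
    return "unassigned"


# Strand positions from the answer to Question 9.
strands = [(39, 41), (46, 50), (74, 76)]
for start, end in strands:
    print(f"{start}-{end}: {region_of(start, end)}")
```

Under these assumed boundaries, the first two strands lie in the B chain and the third lies in the propeptide, which is removed during maturation.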

Other databases linked from Swiss-Prot

No questions asked here.

Advanced search

QUESTION 10: How many proteins do you find?
51773

QUESTION 11: How many proteins do you find now?
15052

QUESTION 12: How many proteins do you find now?
1074

QUESTION 13: How many proteins are there in UniProt from Bacillus subtilis with the default TaxID [1423]? How many are there from Bacillus subtilis in total (all strains and subspecies)?
1623 and 39696, respectively

QUESTION 14: How many proteins of maximum length 10 do you find?
14096

QUESTION 15: How many proteins are now left?
1165

QUESTION 16: How many proteins are now left?
707

QUESTION 17: How many human non-fragment proteins of maximum length 10 do you find in UniProt?
5

QUESTION 18: Here they are in FASTA format:

>sp|P01358|GAJU_HUMAN Gastric juice peptide 1 OS=Homo sapiens PE=1 SV=1
LAAGKVEDSD
>sp|P02728|GLEM_HUMAN Erythrocyte membrane glycopeptide OS=Homo sapiens PE=1 SV=1
CEGHSHDHGA
>sp|P02729|GLUR_HUMAN Urine glycopeptide OS=Homo sapiens PE=1 SV=1
CEHSHDGA
>sp|P22103|PNEU_HUMAN Pneumadin OS=Homo sapiens PE=1 SV=1
AGEPKLDAGV
>sp|P01858|TUFT_HUMAN Phagocytosis-stimulating peptide OS=Homo sapiens PE=1 SV=1
TKPR
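
One way to double-check the answer to Question 17 against these records is to parse the FASTA text and measure each sequence. The following is a minimal sketch using only the standard library. `parse_fasta` and `_record` are hypothetical helpers, and the code relies on the `db|accession|entry name` header layout visible in the records above, where `sp` marks a Swiss-Prot entry.

```python
FASTA = """\
>sp|P01358|GAJU_HUMAN Gastric juice peptide 1 OS=Homo sapiens PE=1 SV=1
LAAGKVEDSD
>sp|P02728|GLEM_HUMAN Erythrocyte membrane glycopeptide OS=Homo sapiens PE=1 SV=1
CEGHSHDHGA
>sp|P02729|GLUR_HUMAN Urine glycopeptide OS=Homo sapiens PE=1 SV=1
CEHSHDGA
>sp|P22103|PNEU_HUMAN Pneumadin OS=Homo sapiens PE=1 SV=1
AGEPKLDAGV
>sp|P01858|TUFT_HUMAN Phagocytosis-stimulating peptide OS=Homo sapiens PE=1 SV=1
TKPR
"""


def _record(header, chunks):
    """Split 'db|accession|entry_name ...' and join the sequence lines."""
    identifier = header.split()[0]
    db, accession, entry_name = identifier.split("|")
    return db, accession, entry_name, "".join(chunks)


def parse_fasta(text):
    """Yield (db, accession, entry_name, sequence) for each FASTA record."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _record(header, chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield _record(header, chunks)


records = list(parse_fasta(FASTA))
for db, accession, entry_name, sequence in records:
    print(accession, entry_name, len(sequence))

print("records:", len(records))
print("all length <= 10:", all(len(r[3]) <= 10 for r in records))
```

The check confirms five records, none longer than 10 residues, all from Swiss-Prot.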