by Student Name Course Title & Code

PROTEIN SEQUENCE BY BIOINFORMATICS METHOD
Word count 3850
Abstract
Proteins mediate almost all biological processes. Understanding the
mechanisms through which proteins function requires knowledge of their
three-dimensional (3D) structures. As a result of genome and full-length
cDNA sequencing projects, there are orders of magnitude more protein
sequences than experimentally determined protein structures (Bishop &
Rawlings, 1997, p. 106). To bridge this gap, there is substantial
impetus to predict protein structures accurately from sequence
information. Predicting protein structure using bioinformatics can
entail sequence similarity searches, multiple sequence alignments,
domain identification and characterization, secondary structure
prediction, or the construction of 3D protein structures in atomic
detail. The initial step in any protein structure prediction is to
establish, by sequence similarity search, whether a protein sequence or
a section of it has any structural homologs in the Protein Data Bank
(PDB) (Gromiha, 2010, p. 67). Experimentally, protein structures are
usually determined and classified at the domain level. Comparative
molecular modeling is the most accurate and successful technique for
predicting a protein structure. Comparative modeling methods can tell
whether part or all of a newly determined sequence will adopt a known
fold that exists in the Protein Data Bank. This assignment will analyze
a protein sequence using bioinformatics methods and will then discuss
the likely function of the protein.
In this assignment, several steps will be used to analyze the protein
sequence under consideration. The first step will use BLAST to identify
the sequences that produce significant alignments. The BLAST results
will also report the specific domain hits and superfamilies. The protein
sequence will be submitted as the query and the resulting hits recorded.
The domain hits will help in analyzing the protein sequence. The second
step will use PROSITE to identify domain hits in the protein sequence.
The third step will use PSI-BLAST to search for sequence similarity.
Finally, the domain structure of the sequence will be deduced from the
results of the sequence searches. The analysis is interesting because
submitting the query to these tools returns the matching sequences and
hits automatically, and because it helps in understanding the structure
of the protein and its function.
Methods
Several investigations will be carried out in the protein sequence
analysis. The first will entail finding matches for the protein sequence
under study using BLAST, PROSITE and PSI-BLAST. The domain structure of
the sequence will then be deduced from the results of these sequence
searches.
Results and Discussion
Using the bioinformatics methods, different results were obtained from
the different sequence searches. The BLAST searches (NCBI and UniProt)
gave the following result:
Entry      Entry name   Protein names               Organism               Length   Gene names
P48960-2   CD97_HUMAN   Isoform 2 of CD97 antigen   Homo sapiens (Human)   742      CD97
Alignment (query against P48960-2, http://www.uniprot.org/uniprot/P48960-2)
Query       1  DMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDSKTSSAEVTIQNVIKLVDELMEAPGDVEA  60
               DMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDSKTSSAEVTIQNVIKLVDELMEAPGDVEA
P48960-2  165  DMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDSKTSSAEVTIQNVIKLVDELMEAPGDVEA  224

Query      61  LAPPVRHLIATQLLSNLEDIMRILAKSLPKGPFTYISPSNTELTLMIQERGDKNVTMGQS  120
               LAPPVRHLIATQLLSNLEDIMRILAKSLPKGPFTYISPSNTELTLMIQERGDKNVTMGQS
P48960-2  225  LAPPVRHLIATQLLSNLEDIMRILAKSLPKGPFTYISPSNTELTLMIQERGDKNVTMGQS  284

Query     121  SARMKLNWAVAAGAEDPGPAVAGILSIQNMTTLLANASLNLHSKKQAELEEIYESSIRGV  180
               SARMKLNWAVAAGAEDPGPAVAGILSIQNMTTLLANASLNLHSKKQAELEEIYESSIRGV
P48960-2  285  SARMKLNWAVAAGAEDPGPAVAGILSIQNMTTLLANASLNLHSKKQAELEEIYESSIRGV  344

Query     181  QLRRLSAVNSIFLSHNNTKELNSPILFAFSHLESSDGEAGRDPPAKDVMPGPRQELLCAF  240
               QLRRLSAVNSIFLSHNNTKELNSPILFAFSHLESSDGEAGRDPPAKDVMPGPRQELLCAF
P48960-2  345  QLRRLSAVNSIFLSHNNTKELNSPILFAFSHLESSDGEAGRDPPAKDVMPGPRQELLCAF  404

Query     241  WKSDSDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHYDVEDWKL  290
               WKSDSDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHYDVEDWKL
P48960-2  405  WKSDSDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHYDVEDWKL  454
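The coordinates in the alignment above can be cross-checked with a short
script. The following is a minimal Python sketch and was not part of the
original analysis. The function name locate_fragment is hypothetical, and
the two strings are short excerpts copied from this report (residues 161
to 180 of isoform 2, and the first seven residues of the query).

```python
def locate_fragment(fragment: str, subject: str, subject_start: int = 1) -> int:
    """Return the 1-based subject position where `fragment` first occurs.

    `subject_start` is the 1-based position of subject[0] in the full protein.
    Raises ValueError if the fragment is not found.
    """
    index = subject.find(fragment)
    if index < 0:
        raise ValueError("fragment not found in subject")
    return subject_start + index


# Residues 161-180 of P48960-2, copied from the sequence listing in this report.
subject_excerpt = "TVCEDMTFSTWTPPPGVHSQ"
# First residues of the query, copied from the alignment.
query_prefix = "DMTFSTW"

start = locate_fragment(query_prefix, subject_excerpt, subject_start=161)
offset = start - 1  # query position q maps to subject position q + offset
print(start, offset)
```

Because the alignment contains no gaps, the same offset applies to every
query position, so query residue 290 corresponds to isoform residue
290 + 164 = 454, in agreement with the last alignment row.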
P48960-2, Isoform 2 of CD97 antigen, Homo sapiens Sequence
10 20 30 40 50 60
MGGRVFLAFC VWLTLPGAET QDSRGCARWC PQNSSCVNAT ACRCNPGFSS FSEIITTPTE
70 80 90 100 110 120
TCDDINECAT PSKVSCGKFS DCWNTEGSYD CVCSPGYEPV SGAKTFKNES ENTCQDVDEC
130 140 150 160 170 180
SSGQHQCDSS TVCFNTVGSY SCRCRPGWKP RHGIPNNQKD TVCEDMTFST WTPPPGVHSQ
190 200 210 220 230 240
TLSRFFDKVQ DLGRDSKTSS AEVTIQNVIK LVDELMEAPG DVEALAPPVR HLIATQLLSN
250 260 270 280 290 300
LEDIMRILAK SLPKGPFTYI SPSNTELTLM IQERGDKNVT MGQSSARMKL NWAVAAGAED
310 320 330 340 350 360
PGPAVAGILS IQNMTTLLAN ASLNLHSKKQ AELEEIYESS IRGVQLRRLS AVNSIFLSHN
370 380 390 400 410 420
NTKELNSPIL FAFSHLESSD GEAGRDPPAK DVMPGPRQEL LCAFWKSDSD RGGHWATEGC
430 440 450 460 470 480
QVLGSKNGST TCQCSHLSSF AILMAHYDVE DWKLTLITRV GLALSLFCLL LCILTFLLVR
490 500 510 520 530 540
PIQGSRTTIH LHLCICLFVG STIFLAGIEN EGGQVGLRCR LVAGLLHYCF LAAFCWMSLE
550 560 570 580 590 600
GLELYFLVVR VFQGQGLSTR WLCLIGYGVP LLIVGVSAAI YSKGYGRPRY CWLDFEQGFL
610 620 630 640 650 660
WSFLGPVTFI ILCNAVIFVT TVWKLTQKFS EINPDMKKLK KARALTITAI AQLFLLGCTW
670 680 690 700 710 720
VFGLFIFDDR SLVLTYVFTI LNCLQGAFLY LLHCLLNKKV REEYRKWACL VAGGSKYSEF
730 740
TSTTSGTGHN QTRALRASES GI
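A listing in this layout (ruler lines of position numbers above rows of
ten-residue blocks) has to be joined into one continuous string before it
can be submitted to a search tool. The following is a minimal Python
sketch of that step under the assumption that ruler lines contain only
numbers. The function name parse_numbered_sequence is hypothetical, and
only the first two rows of the listing are used as sample input.

```python
def parse_numbered_sequence(text: str) -> str:
    """Join residue blocks, skipping ruler lines that contain only numbers."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue  # blank line or position ruler
        residues.extend(tokens)
    return "".join(residues)


# First two rows of the listing above (positions 1-120).
listing = """\
10 20 30 40 50 60
MGGRVFLAFC VWLTLPGAET QDSRGCARWC PQNSSCVNAT ACRCNPGFSS FSEIITTPTE
70 80 90 100 110 120
TCDDINECAT PSKVSCGKFS DCWNTEGSYD CVCSPGYEPV SGAKTFKNES ENTCQDVDEC
"""

sequence = parse_numbered_sequence(listing)
print(len(sequence))      # 120
print(sequence[117:120])  # residues 118-120: DEC
```

Applied to the complete listing, the same function should return a
string of 742 residues, the length reported for isoform 2.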
Discussion
By similarity, the CD97 antigen is likely to act as a receptor involved
in adhesion and signaling processes early after leukocyte activation.
The protein plays an essential role in leukocyte migration. In terms of
domain structure, the first two EGF domains mediate the interaction with
DAF, and a third, tandemly arranged EGF domain is necessary for the
structural integrity of the binding region. The sequence similarity
shows that the protein has five EGF-like domains and one GPS domain; it
therefore belongs to the G-protein coupled receptor 2 family, LN-TM7
subfamily. In addition, the protein forms a heterodimer consisting of a
large extracellular region non-covalently linked to a seven-transmembrane
moiety. It also interacts with complement decay-accelerating factor
(DAF).
Multiple Alignments of the Sequence
P48960, P48960-2 and P48960-3 are isoforms of CD97_HUMAN
(http://www.uniprot.org/uniprot/P48960). An asterisk marks a column that
is identical in all three sequences.

P48960      1  MGGRVFLAFCVWLTLPGAETQDSRGCARWCPQNSSCVNATACRCNPGFSSFSEIITTPTE  60
P48960-2    1  MGGRVFLAFCVWLTLPGAETQDSRGCARWCPQNSSCVNATACRCNPGFSSFSEIITTPTE  60
P48960-3    1  MGGRVFLAFCVWLTLPGAETQDSRGCARWCPQNSSCVNATACRCNPGFSSFSEIITTPTE  60
               ************************************************************

P48960     61  TCDDINECATPSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDVDEC  120
P48960-2   61  TCDDINECATPSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDV---  117
P48960-3   61  TCDDINECATPSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDVDEC  120
               *********************************************************

P48960    121  QQNPRLCKSYGTCVNTLGSYTCQCLPGFKFIPEDPKVCTDVNECTSGQNPCHSSTHCLNN  180
P48960-2  118  ------------------------------------------------------------  117
P48960-3  121  QQNPRLCKSYGTCVNTLGSYTCQCLPGFKFIPEDPKVCTDV-------------------  161

P48960    181  VGSYQCRCRPGWQPIPGSPNGPNNTVCEDVDECSSGQHQCDSSTVCFNTVGSYSCRCRPG  240
P48960-2  118  ------------------------------DECSSGQHQCDSSTVCFNTVGSYSCRCRPG  147
P48960-3  162  ------------------------------DECSSGQHQCDSSTVCFNTVGSYSCRCRPG  191
                                             ******************************

P48960    241  WKPRHGIPNNQKDTVCEDMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDSKTSSAEVTIQN  300
P48960-2  148  WKPRHGIPNNQKDTVCEDMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDSKTSSAEVTIQN  207
P48960-3  192  WKPRHGIPNNQKDTVCEDMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDSKTSSAEVTIQN  251
               ************************************************************

P48960    301  VIKLVDELMEAPGDVEALAPPVRHLIATQLLSNLEDIMRILAKSLPKGPFTYISPSNTEL  360
P48960-2  208  VIKLVDELMEAPGDVEALAPPVRHLIATQLLSNLEDIMRILAKSLPKGPFTYISPSNTEL  267
P48960-3  252  VIKLVDELMEAPGDVEALAPPVRHLIATQLLSNLEDIMRILAKSLPKGPFTYISPSNTEL  311
               ************************************************************

P48960    361  TLMIQERGDKNVTMGQSSARMKLNWAVAAGAEDPGPAVAGILSIQNMTTLLANASLNLHS  420
P48960-2  268  TLMIQERGDKNVTMGQSSARMKLNWAVAAGAEDPGPAVAGILSIQNMTTLLANASLNLHS  327
P48960-3  312  TLMIQERGDKNVTMGQSSARMKLNWAVAAGAEDPGPAVAGILSIQNMTTLLANASLNLHS  371
               ************************************************************

P48960    421  KKQAELEEIYESSIRGVQLRRLSAVNSIFLSHNNTKELNSPILFAFSHLESSDGEAGRDP  480
P48960-2  328  KKQAELEEIYESSIRGVQLRRLSAVNSIFLSHNNTKELNSPILFAFSHLESSDGEAGRDP  387
P48960-3  372  KKQAELEEIYESSIRGVQLRRLSAVNSIFLSHNNTKELNSPILFAFSHLESSDGEAGRDP  431
               ************************************************************

P48960    481  PAKDVMPGPRQELLCAFWKSDSDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHY  540
P48960-2  388  PAKDVMPGPRQELLCAFWKSDSDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHY  447
P48960-3  432  PAKDVMPGPRQELLCAFWKSDSDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHY  491
               ************************************************************

P48960    541  DVEDWKLTLITRVGLALSLFCLLLCILTFLLVRPIQGSRTTIHLHLCICLFVGSTIFLAG  600
P48960-2  448  DVEDWKLTLITRVGLALSLFCLLLCILTFLLVRPIQGSRTTIHLHLCICLFVGSTIFLAG  507
P48960-3  492  DVEDWKLTLITRVGLALSLFCLLLCILTFLLVRPIQGSRTTIHLHLCICLFVGSTIFLAG  551
               ************************************************************

P48960    601  IENEGGQVGLRCRLVAGLLHYCFLAAFCWMSLEGLELYFLVVRVFQGQGLSTRWLCLIGY  660
P48960-2  508  IENEGGQVGLRCRLVAGLLHYCFLAAFCWMSLEGLELYFLVVRVFQGQGLSTRWLCLIGY  567
P48960-3  552  IENEGGQVGLRCRLVAGLLHYCFLAAFCWMSLEGLELYFLVVRVFQGQGLSTRWLCLIGY  611
               ************************************************************

P48960    661  GVPLLIVGVSAAIYSKGYGRPRYCWLDFEQGFLWSFLGPVTFIILCNAVIFVTTVWKLTQ  720
P48960-2  568  GVPLLIVGVSAAIYSKGYGRPRYCWLDFEQGFLWSFLGPVTFIILCNAVIFVTTVWKLTQ  627
P48960-3  612  GVPLLIVGVSAAIYSKGYGRPRYCWLDFEQGFLWSFLGPVTFIILCNAVIFVTTVWKLTQ  671
               ************************************************************

P48960    721  KFSEINPDMKKLKKARALTITAIAQLFLLGCTWVFGLFIFDDRSLVLTYVFTILNCLQGA  780
P48960-2  628  KFSEINPDMKKLKKARALTITAIAQLFLLGCTWVFGLFIFDDRSLVLTYVFTILNCLQGA  687
P48960-3  672  KFSEINPDMKKLKKARALTITAIAQLFLLGCTWVFGLFIFDDRSLVLTYVFTILNCLQGA  731
               ************************************************************

P48960    781  FLYLLHCLLNKKVREEYRKWACLVAGGSKYSEFTSTTSGTGHNQTRALRASESGI  835
P48960-2  688  FLYLLHCLLNKKVREEYRKWACLVAGGSKYSEFTSTTSGTGHNQTRALRASESGI  742
P48960-3  732  FLYLLHCLLNKKVREEYRKWACLVAGGSKYSEFTSTTSGTGHNQTRALRASESGI  786
               *******************************************************
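The asterisk line of a multiple alignment marks the columns in which
every sequence carries the same residue. The following is a minimal
Python sketch of how such a line can be derived. The function name
conservation_line is hypothetical, and the input is the second block of
the alignment above (columns 61 to 120), with the three-residue gap in
isoform 2 written as hyphens.

```python
def conservation_line(rows: list[str]) -> str:
    """Return '*' for columns where all rows share one residue, else ' '."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    marks = []
    for column in zip(*rows):
        # A column of gaps is never reported as conserved.
        same = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if same else " ")
    return "".join(marks)


# Alignment columns 61-120 for the three isoforms.
canonical = "TCDDINECATPSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDVDEC"
isoform_2 = "TCDDINECATPSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDV---"
isoform_3 = "TCDDINECATPSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDVDEC"

line = conservation_line([canonical, isoform_2, isoform_3])
print(line.count("*"))  # 57
```

The three unmarked columns at the end of the block are the positions
where isoform 2 has a gap.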
1. Chain A, Crystal Structure Of The Gain And Hormr Domains Of Cirl
1/Latrophilin 1 (Cl1)
Sequence: pdb|4DLQ|A   Length: 381   Number of Matches: 1
Score: 48.1 bits (113)   Expect: 3e-06   Method: Compositional matrix adjust.
Identities: 28/92 (30%)   Positives: 44/92 (47%)   Gaps: 22/92 (23%)
VNSIFLSHNNTKE-LNSPILFAFSHLESSDGEAGRDPPAKDVMPGPRQELLCAFWK 242
VNS ++ + KE L P++F +HLE+ + C+FW
VNSQVIAASINKESSRVFLMDPVIFTVAHLEAKNHFNANCSFWN 349
-SDSDRGGHWATEGCQVLGSKNGSTTCQCSHL 273
S+ G+W+T+GC+++ S TTC CSHL
YSERSMLGYWSTQGCRLVESNKTHTTCACSHL 381
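The identity, positive and gap figures in a BLAST hit are simple ratios
over the number of alignment columns. The following is a minimal Python
sketch of that arithmetic for the hit against pdb|4DLQ|A. The function
name percent is hypothetical, and the one-decimal rounding is a choice
made here for illustration (BLAST itself prints whole-number
percentages).

```python
def percent(count: int, aligned_length: int) -> float:
    """Percentage of alignment columns, rounded to one decimal place."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return round(100.0 * count / aligned_length, 1)


# Counts reported for the hit against pdb|4DLQ|A.
aligned_length = 92
identities, positives, gaps = 28, 44, 22

print(percent(identities, aligned_length))  # 30.4
print(percent(positives, aligned_length))   # 47.8
print(percent(gaps, aligned_length))        # 23.9

# Identity over the columns that contain no gap (92 - 22 = 70).
print(percent(identities, aligned_length - gaps))  # 40.0
```

Almost a quarter of the columns are gaps, so the identity over the
residues that are actually paired (40%) is noticeably higher than the
30% reported over the whole alignment.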
Related structure
Regions of Similarity Between the sequences and Other Sequences Using
BLAST
Score: 79.7 bits (195)   Expect: 3e-16   Method: Compositional matrix adjust.
Identities: 102/416 (25%)   Positives: 165/416 (39%)   Gaps: 89/416 (21%)
ADPEVRRCSEQRCPAP-YEICPEDY-LMSMVWKRTPAGDLAFNQCPLNATGTTSRR 54
ADP S +R PAP + PE + + + W T G L CP G S +
ADPPAP–STRRPPAPNLHVSPELFCEPREVRRVQWPATQQGMLVERPCPKGTRGIASFQ 58
CSLSLHGVAFW–EQPSFARCISNEYRHLQHSIKEHLAKGQRMLAGDGMSQVTKTLLDLT 112
C L + W P + C S + IK +G+ + + L T
C—LPALGLWNPRGPDLSNCTSPWVNQVAQKIK–SGENAANIASELARHT 105
QRKNFYAGDLLMSVEILRNVTDTF–KRASY—IPASDGVQNF 151
R + YAGD+ SV+++ + D +R S D ++
RGSIYAGDVSSSVKLMEQLLDILDAQLQALRPIERESAGKNYNKMHKRERTCKDYIKAV 164
FQIVSNLLDEENKEKWED—AQQIYPGSIELMQVIEDFIHIVGMGMMDFQNSYLMTGNV 208
+ V NLL E E W+D +Q++ ++ L+ V+E+ ++ + + NV
VETVDNLLRPEALESWKDMNATEQVHTATM-LLDVLEEGAFLLADNVREPARFLAAKQNV 223
VASIQKLPAASVLTDINFPMKGRKGMVDWARNSEDRVVIPKSIFTPVSSKELDESSVFVL 268
V + L + ++ FP ++A SE + + + K+ + V +
VLEVTVLSTEGQVQELVFPQ—EYA–SESSIQLSANTI-KQNSRNGVVKV 269
GAVLYKNLDLILPTLRNYTV-INSKIIVVTIRPEPKTTDSFLE- 310
+LY NL L L T N TV +NS++I +I E ++ FL
VFILYNNLGLFLST-ENATVKLAGEAGTGGPGGASLVVNSQVIAASINKE–SSRVFLMD 326
—IELAHL-ANGTLNPYCVLWDDSKTNESLGTWSTQGCKTVLTDASHTKCLCDRL 362
+AHL A N C W+ S+ + LG WSTQGC+ V ++ +HT C C L
PVIFTVAHLEAKNHFNANCSFWNYSERS-MLGYWSTQGCRLVESNKTHTTCACSHL 381
2. Chain A, Crystal Structure Of The Gain And Hormr Domains Of Brain
Angiogenesis Inhibitor 3 (Bai3)
Sequence: pdb|4DLO|A   Length: 382   Number of Matches: 1
Score: 38.5 bits (88)   Expect: 0.004   Method: Compositional matrix adjust.
Identities: 16/44 (36%)   Positives: 24/44 (54%)   Gaps: 2/44 (4%)
CAFWKSD–SDRGGHWATEGCQVLGSKNGSTTCQCSHLSSFAIL 279
C W ++ G W+T+GC+ + + T C C LS+FAIL
CVLWDDSKTNESLGTWSTQGCKTVLTDASHTKCLCDRLSTFAIL 368
Related Structure
Regions of Similarity Between the sequences and Other Sequences Using
BLAST
Score: 806 bits (2081)   Expect: 0.0   Method: Compositional matrix adjust.
Identities: 382/382 (100%)   Positives: 382/382 (100%)   Gaps: 0/382 (0%)
ADPEVRRCSEQRCPAPYEICPEDYLMSMVWKRTPAGDLAFNQCPLNATGTTSRRCSLSLH 60
ADPEVRRCSEQRCPAPYEICPEDYLMSMVWKRTPAGDLAFNQCPLNATGTTSRRCSLSLH
ADPEVRRCSEQRCPAPYEICPEDYLMSMVWKRTPAGDLAFNQCPLNATGTTSRRCSLSLH 60
GVAFWEQPSFARCISNEYRHLQHSIKEHLAKGQRMLAGDGMSQVTKTLLDLTQRKNFYAG 120
GVAFWEQPSFARCISNEYRHLQHSIKEHLAKGQRMLAGDGMSQVTKTLLDLTQRKNFYAG
GVAFWEQPSFARCISNEYRHLQHSIKEHLAKGQRMLAGDGMSQVTKTLLDLTQRKNFYAG 120
DLLMSVEILRNVTDTFKRASYIPASDGVQNFFQIVSNLLDEENKEKWEDAQQIYPGSIEL 180
DLLMSVEILRNVTDTFKRASYIPASDGVQNFFQIVSNLLDEENKEKWEDAQQIYPGSIEL
DLLMSVEILRNVTDTFKRASYIPASDGVQNFFQIVSNLLDEENKEKWEDAQQIYPGSIEL 180
MQVIEDFIHIVGMGMMDFQNSYLMTGNVVASIQKLPAASVLTDINFPMKGRKGMVDWARN 240
MQVIEDFIHIVGMGMMDFQNSYLMTGNVVASIQKLPAASVLTDINFPMKGRKGMVDWARN
MQVIEDFIHIVGMGMMDFQNSYLMTGNVVASIQKLPAASVLTDINFPMKGRKGMVDWARN 240
SEDRVVIPKSIFTPVSSKELDESSVFVLGAVLYKNLDLILPTLRNYTVINSKIIVVTIRP 300
SEDRVVIPKSIFTPVSSKELDESSVFVLGAVLYKNLDLILPTLRNYTVINSKIIVVTIRP
SEDRVVIPKSIFTPVSSKELDESSVFVLGAVLYKNLDLILPTLRNYTVINSKIIVVTIRP 300
EPKTTDSFLEIELAHLANGTLNPYCVLWDDSKTNESLGTWSTQGCKTVLTDASHTKCLCD 360
EPKTTDSFLEIELAHLANGTLNPYCVLWDDSKTNESLGTWSTQGCKTVLTDASHTKCLCD
EPKTTDSFLEIELAHLANGTLNPYCVLWDDSKTNESLGTWSTQGCKTVLTDASHTKCLCD 360
RLSTFAILAQQPREHHHHHHHH 382
RLSTFAILAQQPREHHHHHHHH
RLSTFAILAQQPREHHHHHHHH 382
Through the Conserved Domain (CD) search, conserved domains in the
protein sequence can be identified by comparing the sequence against
conserved domain models. The specific hits include smart00008 (HormR), a
domain present in hormone receptors. This extracellular domain has four
conserved cysteines that probably form disulphide bridges. It is found
in a variety of hormone receptors (Marchler-Bauer et al., 2009, p. 208)
and may be a ligand-binding domain. Another conserved domain is
pfam12003 (DUF3497), a domain of unknown function. This domain is
functionally uncharacterized and is usually found in eukaryotes. It is
between 213 and 257 amino acids in length and has a single completely
conserved residue, W, which may be functionally important. It is usually
associated with pfam00002, pfam01825 and pfam02793.
Using PROSITE, the results indicate that there is only one hit in the
sequence. The matched region is
ELLCAFWKSDSDrGGHWATEGCQVLGSKNGSTTCQCSHLSSFAILMAHYDV. The alignments
include:
Chain A, Crystal Structure Of The Gain And Hormr Domains Of Cirl
1/Latrophilin 1 (Cl1)
Chain A, Crystal Structure Of The Gain And Hormr Domains Of Brain
Angiogenesis Inhibitor 3 (Bai3)
Chain C, Carbon Monoxide Dehydrogenase From Hydrogenophaga Pseudoflava
Chain C, Carbon Monoxide Dehydrogenase From Hydrogenophaga Pseudoflava
Which Lacks The Mo-Pyranopterin Moiety Of The Molybdenum Cofactor
Chain A, Complex Of Arl2 And Bart, Crystal Form 1
Chain X, Reductive Activator For Corrinoid,Iron-Sulfur Protein
The related structures for the alignments are listed below.
3D structure of Chain C, Carbon Monoxide Dehydrogenase From
Hydrogenophaga Pseudoflava
3D structure of Chain C, Carbon Monoxide Dehydrogenase From
Hydrogenophaga Pseudoflava Which Lacks The Mo-Pyranopterin Moiety Of The
Molybdenum Cofactor
3D structure of Chain A, Complex Of Arl2 And Bart
3D structure of Chain X, Reductive Activator For Corrinoid,Iron-Sulfur
Protein
In addition, the Conserved Domain search, which compares the protein
sequence against conserved domain models, finds smart00303, the
G-protein-coupled receptor proteolytic site (GPS) domain. This domain is
present in latrophilin/CL-1, sea urchin REJ and polycystin.
A PSI-BLAST search for sequence similarity returned a range of protein
sequences. The table below lists the PSI-BLAST hits, that is, the
proteins with sequence similarity to the query.
DB:ID    Source    Length
TR:G7NLZ6_MACMU Leukocyte antigen CD97 OS=Macaca mulatta GN=EGK_10213
PE=4 SV=1 835
TR:F7DQH2_HORSE Uncharacterized protein (Fragment) OS=Equus caballus
GN=LOC100064396 PE=4 SV=1 773
TR:F6TY23_MACMU Uncharacterized protein (Fragment) OS=Macaca mulatta
GN=CD97 PE=4 SV=1 737
TR:F6TY46_MACMU Uncharacterized protein (Fragment) OS=Macaca mulatta
GN=CD97 PE=4 SV=1 786
TR:F6TY05_MACMU Uncharacterized protein (Fragment) OS=Macaca mulatta
GN=CD97 PE=4 SV=1 693
TR:H9ZFU9_MACMU CD97 antigen isoform 3 preproprotein OS=Macaca mulatta
GN=CD97 PE=2 SV=1 786
TR:H9ZFU8_MACMU CD97 antigen isoform 2 preproprotein OS=Macaca mulatta
GN=CD97 PE=2 SV=1 742
TR:G1M4E9_AILME Uncharacterized protein OS=Ailuropoda melanoleuca PE=4
SV=1 828
TR:K7CQL5_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 786
TR:K7CNM3_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 835
TR:K7CHK7_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 742
TR:K7C1Z3_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 835
TR:K7BJE5_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 742
TR:K7B6B1_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 786
TR:K6ZUD7_PANTR CD97 molecule OS=Pan troglodytes GN=CD97 PE=2 SV=1 835
TR:H2R6B9_PANTR Uncharacterized protein OS=Pan troglodytes GN=CD97 PE=4
SV=1 719
TR:B4DTS6_HUMAN cDNA FLJ54117, highly similar to CD97 antigen OS=Homo
sapiens PE=2 SV=1 760
TR:G7PZL5_MACFA Putative uncharacterized protein (Fragment) OS=Macaca
fascicularis GN=EGM_09357 PE=4 SV=1 559
TR:F6T1C0_HORSE Uncharacterized protein (Fragment) OS=Equus caballus
GN=LOC100064396 PE=4 SV=1 768
TR:B3KUI0_HUMAN cDNA FLJ39945 fis, clone SPLEN2023977, highly similar to
Homo sapiens CD97 antigen (CD97), transcript variant 2, mRNA OS=Homo
sapiens PE=2 SV=1 707
SP:CD97_HUMAN CD97 antigen OS=Homo sapiens GN=CD97 PE=1 SV=4 835
SP:P48960-3 Isoform 3 of CD97 antigen OS=Homo sapiens GN=CD97 786
SP:P48960-2 Isoform 2 of CD97 antigen OS=Homo sapiens GN=CD97 742
TR:B4E336_HUMAN cDNA FLJ53287, highly similar to CD97 antigen OS=Homo
sapiens PE=2 SV=1 655
TR:S7N1Z0_MYOBR EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Myotis brandtii GN=D623_10001203 PE=4 SV=1 523
TR:F7DBB4_HORSE Uncharacterized protein (Fragment) OS=Equus caballus
GN=LOC100064396 PE=4 SV=1 755
TR:G1QFE0_MYOLU Uncharacterized protein (Fragment) OS=Myotis lucifugus
PE=4 SV=1 834
TR:H2NXT0_PONAB Uncharacterized protein OS=Pongo abelii GN=CD97 PE=4
SV=1 835
SP:EMR2_HUMAN EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 PE=1 SV=2 823
SP:Q9UHX3-5 Isoform 5 of EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 681
SP:Q9UHX3-3 Isoform 3 of EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 774
SP:Q9UHX3-4 Isoform 4 of EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 730
SP:EMR2_CANFA EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Canis familiaris GN=EMR2 PE=2 SV=2 830
TR:L7N0H3_CANFA EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Canis familiaris GN=EMR2 PE=4 SV=1 652
TR:F6XAF8_CANFA EGF-like module-containing mucin-like hormone
receptor-like 2 (Fragment) OS=Canis familiaris GN=EMR2 PE=4 SV=1 831
TR:F1PV88_CANFA EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Canis familiaris GN=EMR2 PE=4 SV=2 841
TR:E2QV86_CANFA Uncharacterized protein OS=Canis familiaris PE=4 SV=1
847
TR:A8K6W1_HUMAN cDNA FLJ77470, highly similar to Homo sapiens egf-like
module containing, mucin-like, hormone receptor-like 2 (EMR2),
transcript variant 1, mRNA OS=Homo sapiens PE=2 SV=1 823
TR:A0JNV7_HUMAN Egf-like module containing, mucin-like, hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 PE=2 SV=1 823
TR:F7IHK9_CALJA Uncharacterized protein OS=Callithrix jacchus GN=CD97
PE=4 SV=1 837
TR:G3RLY6_GORGO Uncharacterized protein OS=Gorilla gorilla gorilla
GN=101140763 PE=4 SV=1 819
TR:G3RBI2_GORGO Uncharacterized protein OS=Gorilla gorilla gorilla
GN=101140763 PE=4 SV=1 826
TR:G1M481_AILME Uncharacterized protein (Fragment) OS=Ailuropoda
melanoleuca PE=4 SV=1 570
TR:F7IHI8_CALJA Uncharacterized protein (Fragment) OS=Callithrix jacchus
GN=CD97 PE=4 SV=1 836
TR:F1PZQ0_CANFA Uncharacterized protein (Fragment) OS=Canis familiaris
PE=4 SV=2 810
TR:H0X3J3_OTOGA Uncharacterized protein (Fragment) OS=Otolemur garnettii
GN=EMR2 PE=4 SV=1 612
TR:G1P9G8_MYOLU Uncharacterized protein (Fragment) OS=Myotis lucifugus
PE=4 SV=1 831
TR:G1LYI7_AILME Uncharacterized protein (Fragment) OS=Ailuropoda
melanoleuca PE=4 SV=1 579
TR:G3T5R9_LOXAF Uncharacterized protein (Fragment) OS=Loxodonta africana
PE=4 SV=1 676
TR:G1QCM5_MYOLU Uncharacterized protein (Fragment) OS=Myotis lucifugus
PE=4 SV=1 826
TR:Q2Q423_CANFA CD97 large isoform OS=Canis familiaris GN=CD97 PE=2 SV=1
831
TR:K9J617_DESRO Putative g protein-coupled receptor (Fragment)
OS=Desmodus rotundus PE=2 SV=1 821
TR:I3NFE0_SPETR Uncharacterized protein (Fragment) OS=Spermophilus
tridecemlineatus PE=4 SV=1 804
TR:H0X2B6_OTOGA Uncharacterized protein (Fragment) OS=Otolemur garnettii
GN=CD97 PE=4 SV=1 834
TR:M3W4G3_FELCA Uncharacterized protein (Fragment) OS=Felis catus PE=4
SV=1 827
TR:I3MLM1_SPETR Uncharacterized protein (Fragment) OS=Spermophilus
tridecemlineatus GN=CD97 PE=4 SV=1 818
TR:F6XAC4_CANFA Uncharacterized protein (Fragment) OS=Canis familiaris
PE=4 SV=1 737
TR:F1PZP9_CANFA Uncharacterized protein OS=Canis familiaris PE=4 SV=2
712
TR:E2QV82_CANFA Uncharacterized protein OS=Canis familiaris PE=4 SV=1
750
TR:F6RMT8_HORSE Uncharacterized protein (Fragment) OS=Equus caballus
GN=LOC100064454 PE=4 SV=1 820
TR:D2HFZ3_AILME Putative uncharacterized protein (Fragment)
OS=Ailuropoda melanoleuca GN=PANDA_009903 PE=4 SV=1 806
SP:Q9UHX3-2 Isoform 2 of EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 812
TR:I3N3J8_SPETR Uncharacterized protein (Fragment) OS=Spermophilus
tridecemlineatus PE=4 SV=1 806
TR:Q95L62_PIG CD97 antigen OS=Sus scrofa PE=2 SV=1 732
TR:F1SD42_PIG Uncharacterized protein OS=Sus scrofa GN=CD97 PE=2 SV=2
744
TR:M0R0K5_HUMAN EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 PE=2 SV=1 831
TR:F7IHJ1_CALJA Uncharacterized protein (Fragment) OS=Callithrix jacchus
GN=CD97 PE=4 SV=1 777
TR:Q53GP0_HUMAN Egf-like module containing, mucin-like, hormone
receptor-like sequence 2 isoform e variant (Fragment) OS=Homo sapiens
PE=2 SV=1 810
SP:EMR2_MACMU EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Macaca mulatta GN=EMR2 PE=2 SV=1 822
TR:L5K7J5_PTEAL CD97 antigen OS=Pteropus alecto GN=PAL_GLEAN10002828
PE=4 SV=1 754
TR:L8HP58_BOSMU EGF-like module-containing mucin-like hormone
receptor-like 2 (Fragment) OS=Bos grunniens mutus GN=M91_20249 PE=4 SV=1
677
SP:CD97_BOVIN CD97 antigen OS=Bos taurus GN=CD97 PE=2 SV=1 734
TR:Q702I4_BOVIN CD97 antigen transcript variant (Precursor) OS=Bos
taurus GN=cd97 PE=2 SV=1 827
TR:L8J1B1_BOSMU CD97 antigen OS=Bos grunniens mutus GN=M91_00844 PE=4
SV=1 827
TR:F1MCN3_BOVIN CD97 antigen OS=Bos taurus GN=CD97 PE=2 SV=1 827
TR:A6QNW8_BOVIN CD97 antigen OS=Bos taurus GN=CD97 PE=2 SV=1 734
TR:L5LJM9_MYODS CD97 antigen OS=Myotis davidii GN=MDA_GLEAN10004118 PE=4
SV=1 658
TR:S7PBI0_MYOBR EGF-like module-containing mucin-like hormone
receptor-like 2 (Fragment) OS=Myotis brandtii GN=D623_10002688 PE=4 SV=1
664
TR:L5LRW1_MYODS EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Myotis davidii GN=MDA_GLEAN10000428 PE=4 SV=1 461
TR:S7MSX6_MYOBR CD97 antigen OS=Myotis brandtii GN=D623_10002701 PE=4
SV=1 613
TR:G5BYS9_HETGA CD97 antigen OS=Heterocephalus glaber GN=GW7_16049 PE=4
SV=1 907
SP:Q9UHX3-6 Isoform 6 of EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Homo sapiens GN=EMR2 765
TR:L8YGX4_TUPCH EGF-like module-containing mucin-like hormone
receptor-like 2 OS=Tupaia chinensis GN=TREES_T100001162 PE=3 SV=1 1071
TR:Q9DC42_MOUSE CD97 antigen OS=Mus musculus GN=Cd97 PE=2 SV=1 722
TR:Q4FJS6_MOUSE Cd97 protein OS=Mus musculus GN=Cd97 PE=2 SV=1 724
TR:E9QMJ5_MOUSE CD97 antigen OS=Mus musculus GN=Cd97 PE=2 SV=1 773
TR:E9QJS7_MOUSE CD97 antigen OS=Mus musculus GN=Cd97 PE=2 SV=1 818
SP:CD97_MOUSE CD97 antigen OS=Mus musculus GN=Cd97 PE=1 SV=2 818
SP:Q9Z0M6-3 Isoform 3 of CD97 antigen OS=Mus musculus GN=Cd97 773
SP:Q9Z0M6-2 Isoform 2 of CD97 antigen OS=Mus musculus GN=Cd97 724
TR:Q5XI36_RAT CD97 molecule OS=Rattus norvegicus GN=Cd97 PE=2 SV=1 825
TR:S9XQ27_9CETA CD97 antigen (Fragment) OS=Camelus ferus
GN=CB1_000077011 PE=4 SV=1 546
SP:LPHN3_RAT Latrophilin-3 OS=Rattus norvegicus GN=Lphn3 PE=2 SV=1 1550
SP:LPHN3_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=3 1537
SP:LPHN3_BOVIN Latrophilin-3 OS=Bos taurus GN=LPHN3 PE=2 SV=1 1580
SP:Q9Z173-7 Isoform 7 of Latrophilin-3 OS=Rattus norvegicus GN=Lphn3
1273
SP:Q9Z173-6 Isoform 6 of Latrophilin-3 OS=Rattus norvegicus GN=Lphn3
1341
SP:Q9Z173-5 Isoform 5 of Latrophilin-3 OS=Rattus norvegicus GN=Lphn3
1230
SP:Q9Z173-4 Isoform 4 of Latrophilin-3 OS=Rattus norvegicus GN=Lphn3
1298
SP:Q9Z173-3 Isoform 3 of Latrophilin-3 OS=Rattus norvegicus GN=Lphn3
1527
SP:Q9Z173-2 Isoform 2 of Latrophilin-3 OS=Rattus norvegicus GN=Lphn3
1459
SP:Q80TS3-5 Isoform 5 of Latrophilin-3 OS=Mus musculus GN=Lphn3 1528
SP:Q80TS3-4 Isoform 4 of Latrophilin-3 OS=Mus musculus GN=Lphn3 1298
SP:Q80TS3-3 Isoform 3 of Latrophilin-3 OS=Mus musculus GN=Lphn3 1543
SP:Q80TS3-2 Isoform 2 of Latrophilin-3 OS=Mus musculus GN=Lphn3 1342
SP:O97827-12 Isoform 12 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1308
SP:O97827-11 Isoform 11 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1299
SP:O97827-10 Isoform 10 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1351
SP:O97827-9 Isoform 9 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1342
SP:O97827-8 Isoform 8 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1283
SP:O97827-7 Isoform 7 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1571
SP:O97827-6 Isoform 6 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1240
SP:O97827-5 Isoform 5 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1231
SP:O97827-4 Isoform 4 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1274
SP:O97827-3 Isoform 3 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1503
SP:O97827-2 Isoform 2 of Latrophilin-3 OS=Bos taurus GN=LPHN3 1512
TR:K3W4M8_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1543
TR:F1LRG7_RAT Latrophilin-3 OS=Rattus norvegicus GN=Lphn3 PE=4 SV=2 1550
TR:D4AAL4_RAT Latrophilin-3 OS=Rattus norvegicus GN=Lphn3 PE=4 SV=2 1527
TR:D3ZH59_RAT Latrophilin-3 OS=Rattus norvegicus GN=Lphn3 PE=4 SV=2 1341
TR:D3Z6J2_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1240
TR:D3Z6H9_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1299
TR:D3Z6H7_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1351
TR:D3Z634_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1460
TR:D3Z5M6_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1580
TR:D3Z593_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1503
TR:D3Z4V0_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1274
TR:D3Z4S7_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1571
TR:D3Z3X6_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1308
TR:D3Z3G4_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1512
TR:D3YWR3_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1268
TR:D3YWB1_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1283
TR:D3YVT9_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1550
TR:D3YU23_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1231
TR:D3YTW7_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1534
TR:S7MXZ8_MYOBR Latrophilin-3 OS=Myotis brandtii GN=D623_10028067 PE=4 SV=1 1543
TR:L8IIK9_BOSMU Latrophilin-3 (Fragment) OS=Bos grunniens mutus GN=M91_07240 PE=4 SV=1 1381
TR:G3N0B0_BOVIN Latrophilin-3 (Fragment) OS=Bos taurus GN=LPHN3 PE=2 SV=1 1072
TR:F1N7Z7_BOVIN Latrophilin-3 (Fragment) OS=Bos taurus GN=LPHN3 PE=2 SV=2 1152
TR:F1MMK6_BOVIN Latrophilin-3 (Fragment) OS=Bos taurus GN=LPHN3 PE=2 SV=2 1372
TR:F1MGL8_BOVIN Latrophilin-3 (Fragment) OS=Bos taurus GN=LPHN3 PE=2 SV=1 1381
TR:R0J8K0_ANAPL Latrophilin-2 (Fragment) OS=Anas platyrhynchos GN=Anapl_17421 PE=4 SV=1 1458
SP:Q9HAR2-2 Isoform 2 of Latrophilin-3 OS=Homo sapiens GN=LPHN3 1240
TR:E9PE04_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1469
TR:E9PBG4_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1503
TR:E7EX52_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1342
TR:E7EW95_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1512
TR:E7EVD6_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1528
TR:E7EUW2_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1580
TR:E7EUP0_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1308
TR:E7ETE3_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1571
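A hit list of this kind is easier to summarize once each entry is split into its fields. The following is a minimal Python sketch, assuming one hit per line in the layout shown above: a database prefix (SP or TR), an accession, a description, OS= (organism), GN= (gene), optional PE= and SV= fields, and a trailing integer. The names HIT_RE and parse_hit are hypothetical helpers introduced only for illustration, and the meaning of the trailing integer is not stated in the list, so the sketch stores it under the neutral key "value".

```python
import re
from collections import Counter

# Hypothetical pattern for one hit line, based only on the layout of the
# list above: prefix, accession, description, OS=, GN=, optional PE=/SV=,
# and a trailing integer whose meaning is an assumption left to the reader.
HIT_RE = re.compile(
    r"^(?P<db>SP|TR):(?P<acc>\S+)\s+"
    r"(?P<desc>.+?)\s+OS=(?P<organism>.+?)\s+GN=(?P<gene>\S+)"
    r"(?:\s+PE=\d+)?(?:\s+SV=\d+)?\s+(?P<value>\d+)$"
)


def parse_hit(line):
    """Return a dict of fields for one hit line, or None if it does not match."""
    match = HIT_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["value"] = int(record["value"])
    return record


# Three lines copied from the hit list above.
sample = [
    "SP:Q80TS3-5 Isoform 5 of Latrophilin-3 OS=Mus musculus GN=Lphn3 1528",
    "TR:K3W4M8_MOUSE Latrophilin-3 OS=Mus musculus GN=Lphn3 PE=2 SV=1 1543",
    "TR:E7ETE3_HUMAN Latrophilin-3 OS=Homo sapiens GN=LPHN3 PE=2 SV=1 1571",
]

# Keep only the lines that match, then count hits per organism.
hits = [rec for rec in map(parse_hit, sample) if rec is not None]
per_organism = Counter(rec["organism"] for rec in hits)

for organism, count in per_organism.most_common():
    print(organism, count)
```

Grouping the parsed hits by organism or by gene name makes it straightforward to see that the matches are dominated by Latrophilin-3 entries from a small number of mammalian species.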
The protein sequence analysis indicates that polycystin is present in
the protein sequence. The likely function of polycystin is to act as a
matrix receptor that links the extracellular matrix to the actin
cytoskeleton through focal adhesion proteins (Smith, 2002. Pp. 86). The
results also indicate that a PKD and REJ homolog protein is present in
the sequence. The likely function of this protein is involvement in
fertilization: it is believed that it can directly form a calcium
ion-transporting channel, which is involved in initiating the acrosome
reaction of sperm. Of particular importance, the analysis using BLAST
and PSI-BLAST indicates that CD97 antigen isoform 2 preproprotein is
also present in the sequence. The likely function of the CD97 antigen,
by similarity, is to act as a receptor. It is probably engaged in
adhesion and signaling processes early after leukocyte activation, and
it plays an essential role in leukocyte migration.
Conclusion
The prediction of protein structure using Bioinformatics can entail
sequence similarity search, multiple sequence alignments, domain
identification and characterization, secondary structure prediction, or
constructing 3D protein structures to atomic detail. Any protein
structure prediction begins by establishing, through a sequence
similarity search, whether a protein sequence or a section of it has
any structural homologs in the Protein Data Bank (PDB). Experimentally,
protein structures are usually determined and grouped at the domain
level (Smith, 2002. Pp. 152). Comparative molecular modeling emerges as
the most accurate and successful technique for predicting a protein
structure; it can tell whether a section or all of a newly established
sequence will adopt a recognized fold that exists in the Protein Data
Bank (Smith, 2002. Pp. 187). From the protein sequence analysis,
smart00008: HormR has been identified. This domain is usually found in
a variety of hormone receptors. Another conserved domain that has been
identified is pfam12003 (PSSMID), a domain of unknown function
(DUF3497). The domain is functionally uncharacterized and is usually
found in eukaryotes. It is between 213 and 257 amino acids in length and
has a single entirely conserved residue, W, which may be functionally
essential. It is usually associated with pfam00002, pfam01825 and
pfam02793. The results of the protein sequence analysis also indicate
that polycystin is present in the protein sequence. The likely function
of polycystin is to act as a matrix receptor that links the
extracellular matrix to the actin cytoskeleton through focal adhesion
proteins. In addition, CD97 antigen isoform 2 preproprotein is present
in the sequence and is identified at the domain level. The protein
sequence has 5 EGF-like domains and 1 GPS domain; therefore, it belongs
to the G-protein coupled receptor 2 family and the LN-TM7 subfamily. By
similarity, the probable function of the CD97 antigen is to act as a
receptor. It is likely engaged in adhesion and signaling processes early
after leukocyte activation, and it plays an essential role in leukocyte
migration.
Reference List
Bishop, M. J., & Rawlings, C. J. 1997, DNA and protein sequence
analysis: A practical approach, IRL Press at Oxford University, Oxford.
Gromiha, M. M. 2010, Protein bioinformatics: From sequence to function,
Academic Press/Elsevier, Amsterdam.
Marchler-Bauer, A. et al. 2009, “CDD: specific functional annotation
with the Conserved Domain Database”, Nucleic Acids Res., 37(Database
issue), D205-10.
Marchler-Bauer, A. et al. 2011, “CDD: a Conserved Domain Database for
the functional annotation of proteins”, Nucleic Acids Res., 39(Database
issue), D225-9.
Smith, B. J. 2002, Protein sequencing protocols, Humana Press, Totowa,
N.J.