Analysis of Mutual Information Between a Protein Sequence and a Secondary Structure

Julian Lee

Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743

(Received 6 January 2010, in final form 21 January 2010)

In order to predict the three-dimensional structure of a protein from its amino-acid sequence, it is important to analyze the correlation between the sequence and the structure and to extract from the sequence the features that play a crucial role in determining the protein structure. In this work, we used mutual information to analyze the correlation between the protein sequence and the secondary structure. The protein sequence was expressed in terms of a reduced set of three amino-acid classes (hydrophobic, hydrophilic, and neutral), and the secondary structure was classified into eight classes.

PACS numbers: 87.14.Ee, 87.15.Cc, 02.50.-r

Keywords: Protein sequence-secondary structure correlation, Mutual information, Protein structure, Protein sequence

I. Introduction

Proteins are the substances at the root of all biological phenomena. Because the native three-dimensional structure of a protein determines its function, predicting that structure from the amino-acid sequence is an important problem in computational biophysics. However, determining the native structure from sequence information alone remains an unsolved problem. For this reason, prediction of the secondary structure, which is the local three-dimensional structure of a protein, has been studied actively [1–9]. In particular, when the sequence of the protein to be predicted has no global sequence similarity to any protein in the structure database, the so-called fragment assembly method [10–19] is widely used: only local structures are extracted from the database, and the global structure is obtained by minimizing a free-energy function. Prediction of local structure such as the secondary structure is therefore becoming an even more important problem.

For successful secondary structure prediction, it is very important to extract from the sequence the local features that play an important role in determining the secondary structure. Hydrophobicity and hydrophilicity are among the most important of these features. When a protein folds into its native structure in water, hydrophobic amino acids tend to be buried inside the protein and hydrophilic amino acids tend to be exposed on the outside, so hydrophilicity and hydrophobicity strongly influence the native structure. Accordingly, many papers have studied the properties of protein structure through simple models in which the amino acids are classified into only two classes, hydrophilic and hydrophobic, or into three classes by adding a neutral one [20–36].

In this paper, we examine the correlation between a protein sequence, expressed in terms of three amino-acid classes (hydrophilic, hydrophobic, and neutral), and the secondary structure of the protein, using mutual information as the measure.

II. Methods

1. Protein Structure Data

Experimentally determined protein structures are collected in the Protein Data Bank (PDB) (http://www.rcsb.org/pdb/home/home.do), a database that stores close to 50000 protein structures. However, it contains many redundant structures and many low-resolution structures. For this reason, the Structural Classification of Proteins (SCOP) classifies these structures into 44000 domains [37], and the ASTRAL compendium [38] filters them further to provide sets from which structures with similar sequences, low-resolution structures, and the like have been removed. In this paper, the proteins of ASTRAL SCOP (version 1.63) were clustered again with BLASTCLUST (NCBI BLAST 2.2.5, http://www.ncbi.nlm.nih.gov/BLAST/), and a representative protein was taken from each cluster so that the sequence similarity between any two proteins is below 25%. The resulting set contains 4362 proteins consisting of 921195 amino acids in total.

2. Classification of Protein Sequences

The 20 amino-acid types that make up proteins were classified into only three classes: hydrophobic (H), hydrophilic (P), and neutral (N). Serine, threonine, aspartic acid, asparagine, glutamic acid, glutamine, histidine, arginine, lysine, tyrosine, and cysteine (11 types) were classified as P; leucine, isoleucine, valine, methionine, phenylalanine, and tryptophan (6 types) as H; and alanine, proline, and glycine (3 types) as N. Of the 921195 amino acids that make up the proteins under analysis, 469422 are P, 269049 are H, and 182724 are N.
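The three-class reduction described above can be sketched in Python as follows. This is an illustrative sketch rather than the author's code: the standard one-letter amino-acid codes and the names `REDUCED` and `reduce_sequence` are assumptions introduced here, while the class assignments and the residue counts are those quoted in the text.

```python
# Reduced three-letter alphabet as listed in the text: 11 hydrophilic (P),
# 6 hydrophobic (H), and 3 neutral (N) residue types.
# The one-letter residue codes are an assumption; the text uses full names.
POLAR = "STDNEQHRKYC"
HYDROPHOBIC = "LIVMFW"
NEUTRAL = "APG"

REDUCED = {aa: "P" for aa in POLAR}
REDUCED.update({aa: "H" for aa in HYDROPHOBIC})
REDUCED.update({aa: "N" for aa in NEUTRAL})


def reduce_sequence(seq: str) -> str:
    """Map a one-letter amino-acid sequence to the H/P/N alphabet."""
    return "".join(REDUCED[aa] for aa in seq.upper())


# Residue counts quoted in the text for the 4362-protein data set.
COUNTS = {"P": 469422, "H": 269049, "N": 182724}
TOTAL = sum(COUNTS.values())
FRACTIONS = {label: count / TOTAL for label, count in COUNTS.items()}

print(reduce_sequence("MKVLAG"))
print(TOTAL, {label: round(f, 3) for label, f in FRACTIONS.items()})
```

The three class counts add up to the 921195 residues quoted for the data set, and roughly half of all residues fall into the P class, which is why the single-site sequence entropy reported later is below its maximum.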

3. Classification of Protein Secondary Structures

The secondary structure of a protein is computed from its three-dimensional coordinates according to the Definition of Secondary Structure of Proteins (DSSP) [39]. DSSP defines eight secondary structure classes: regular helix, π helix, 3_10 helix, extended, β-bridge, turn, bend, and coil. When the 921195 amino acids under analysis are classified by secondary structure, the classes contain, in the order listed above, 300466, 247, 33369, 187075, 11032, 105476, 88737, and 194793 residues.

4. Mutual Information Between Sequence and Secondary Structure

Most secondary structure prediction algorithms divide the protein sequence into windows of fixed length and predict, from the sequence features of each window, the secondary structure of the residue at the center of the window. In this paper, we therefore analyzed the correlation between the sequence inside a window of length N = 1, ..., 5 and the secondary structure at the position a distance d away from the center of the window. The C-terminal direction is taken as the + direction of d and the N-terminal direction as the − direction. For a window of odd length, the central position is d = 0; for a window of even length, the two central positions are d = ±0.5 (Fig. 1(a)). For N = 1, we also examined the correlation between sequence and sequence, and between structure and structure.
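The offset convention described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the function name `window_site_pairs` and the 0-based indexing are hypothetical, and the two example strings are the rows drawn in Fig. 1. Doubling the coordinates keeps the half-integer offsets of even-length windows in integer arithmetic.

```python
from fractions import Fraction


def window_site_pairs(seq, ss, n, d):
    """Pair every length-n window of seq with the structure label at offset d.

    d is measured from the window centre, positive towards the C-terminus.
    It must be an integer for odd n and a half-integer for even n, so that
    centre + d is always a residue index.
    """
    if len(seq) != len(ss):
        raise ValueError("sequence and structure must have equal length")
    two_d = Fraction(d) * 2
    if two_d.denominator != 1 or (int(two_d) + n - 1) % 2 != 0:
        raise ValueError("d must be integer for odd n, half-integer for even n")
    pairs = []
    for start in range(len(seq) - n + 1):
        # Twice the window centre is 2*start + n - 1; adding 2*d stays integral.
        site = (2 * start + n - 1 + int(two_d)) // 2
        if 0 <= site < len(ss):
            pairs.append((seq[start:start + n], ss[site]))
    return pairs


# Example rows taken from Fig. 1.
seq = "HHPPHHPHP"
ss = "CCHHHHCCC"
print(window_site_pairs(seq, ss, 3, 0))
print(window_site_pairs(seq, ss, 2, 0.5))
```

Windows whose target position falls outside the chain are simply skipped in this sketch; how the original analysis treats chain ends is not stated in the text.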

In three-dimensional structure prediction methods based on fragment assembly [10–19], the local structure corresponding to a sequence fragment is taken over as a whole, so the secondary structure of the entire fragment matters, not only that of the central position. With this in mind, we also analyzed the correlation between the sequence pattern inside a window of length N = 1, ..., 5 and the secondary structure pattern inside the same window (Fig. 1(b)).

Fig. 1. (a) Computation of mutual information between a sequence segment of length N and the secondary structure at the residue position d. (b) Computation of mutual information between the sequence segment and the secondary structure segment of length N. (Both panels use the example sequence H H P P H H P H P; panel (b) aligns it with the structure string C C H H H H C C C.)

The correlation between sequence and structure was measured with mutual information. Let P(x) and P(y) be the probabilities that the sequence pattern and the structure pattern take the particular values x and y, respectively, and let P(x, y) be the probability that both events occur together. The mutual information is then defined as

$$I(X, Y) \equiv \sum_{x,y} P(x, y) \log_2 \frac{P(x, y)}{P(x)P(y)} \qquad (1)$$

and is measured in bits. The mutual information can be interpreted as the average amount of information gained about Y when X is known. When there is no correlation, P(x, y) = P(x)P(y) and therefore I(X, Y) = 0. The probabilities P(x), P(y), and P(x, y) are estimated from the frequencies with which the events actually occur in the data. When the length of the sequence is N, the sum over x in Eq. (1) runs over 3^N terms. The sum over y runs over 8 terms when the secondary structure at a single position is considered (Fig. 1(a)), and over 8^N terms when the entire secondary structure inside the window is considered (Fig. 1(b)).
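Eq. (1), with the probabilities estimated from observed frequencies, can be sketched in Python as follows. This is an illustrative plug-in estimator under stated assumptions: the function name `mutual_information` and the representation of the data as a list of (x, y) pairs are hypothetical, and the text does not say whether any finite-sample correction was applied.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(X, Y) in bits from a list of (x, y) observations."""
    pairs = list(pairs)
    total = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    info = 0.0
    for (x, y), n_xy in joint.items():
        # P(x, y) / (P(x) P(y)) equals n_xy * total / (n_x * n_y).
        info += (n_xy / total) * log2(n_xy * total / (px[x] * py[y]))
    return info


# Independent variables carry no mutual information.
independent = [(x, y) for x in "HPN" for y in "ab"]
print(mutual_information(independent))

# A variable paired with itself returns its Shannon entropy (1 bit here).
print(mutual_information([("H", "H"), ("P", "P")]))
```

With short windows the number of pattern combinations (3^N times 8 or 8^N) stays far below the 921195 residues in the data set, which is what makes a direct frequency estimate of this kind plausible for N up to 5.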

The mutual information of a quantity with itself is expressed as

$$I(X, X) = -\sum_{x} P(x) \log_2 P(x) \qquad (2)$$

This quantity is called the Shannon entropy. It is the average amount of information gained about the sequence (or the secondary structure) itself when it is measured. For example, if only one pattern ever occurs, P(x) takes only the values 1 or 0 and the Shannon entropy is 0, which means that observing the pattern yields no new information at all. When two values each occur with probability 0.5, the Shannon entropy is 1 bit.

Fig. 2. Mutual information between a sequence segment of length N (= 1, ..., 5) and the secondary structure at the residue position d (= -5, ..., 5).

III. Results

The mutual information between a sequence segment of window length N = 1, ..., 5 and the secondary structure at position d = -5, ..., 5 is shown in Table 1 and Fig. 2. For N = 1, the mutual information between sequence positions a distance d apart, and between structure positions a distance d apart, is shown in the last two columns of Table 1 and in Fig. 3. Because these two quantities are by definition symmetric in d, Fig. 3 shows them only for d ≥ 0.

Overall, the mutual information increases as the structure position approaches the center of the sequence pattern. Toward the edges of the window, the position is increasingly influenced by the sequence lying outside the window, so this result is to be expected.

Table 1. Mutual information, in bits, between a sequence segment of length N and the secondary structure at the residue position d (= -5, ..., 5). The last two columns give the mutual information between sequence segments, and between structure segments, for N = 1.

d      N=1       N=2       N=3       N=4       N=5       sequence-sequence  structure-structure
-5     0.002452  -         0.010235  -         0.024250  0.001208           0.187318
-4.5   -         0.005251  -         0.017597  -         -                  -
-4     0.002248  -         0.011519  -         0.033869  0.000530           0.264281
-3.5   -         0.005308  -         0.025147  -         -                  -
-3     0.002465  -         0.017836  -         0.060447  0.000398           0.386820
-2.5   -         0.010538  -         0.050075  -         -                  -
-2     0.007265  -         0.041490  -         0.104618  0.001992           0.605520
-1.5   -         0.031491  -         0.091824  -         -                  -
-1     0.022681  -         0.079683  -         0.130419  0.000586           1.056072
-0.5   -         0.062145  -         0.113482  -         -                  -
0      0.035734  -         0.092027  -         0.135136  1.477159           2.404460
0.5    -         0.058062  -         0.108445  -         -                  -
1      0.019074  -         0.071226  -         0.122001  0.000586           1.056072
1.5    -         0.026272  -         0.079697  -         -                  -
2      0.006131  -         0.032913  -         0.089418  0.001992           0.605520
2.5    -         0.008371  -         0.039427  -         -                  -
3      0.001448  -         0.013495  -         0.047788  0.000398           0.386820
3.5    -         0.003958  -         0.019392  -         -                  -
4      0.002040  -         0.008631  -         0.026369  0.000530           0.264281
4.5    -         0.005165  -         0.013482  -         -                  -
5      0.002702  -         0.008954  -         0.018690  0.001208           0.187318

Fig. 3. Mutual information between sequence segments of length 1, and between structure segments of length 1, for separation d (= 0, ..., 5).

In the mutual information between sequence positions and between structure positions, the d = 0 values are the Shannon entropies of the sequence and of the secondary structure, respectively. The Shannon entropy of the sequence at a single position is 1.48 bits, slightly less than $\log_2 3 \simeq 1.58$ bits, because H, P, and N do not occur in equal proportions. Likewise, the Shannon entropy of the secondary structure at a single position is 2.40 bits, smaller than the maximum possible value of $\log_2 8 = 3$ bits for the same reason. As Table 1 and Fig. 3 show, the mutual information between sequence positions is already only about 0.0006 bits at d = 1, far smaller than at d = 0, so the hydrophilicity and hydrophobicity of neighboring positions can be regarded as nearly independent. In contrast, the secondary structure retains a considerable amount of mutual information for d > 0. This is to be expected, because a secondary structure element does not appear isolated at a single position along the sequence but as a continuous structure of some length.

The mutual information between a sequence of length 1, ..., 5 and the secondary structure at a single position reaches a maximum of 0.1351 bits (N = 5, d = 0). This is far smaller than the Shannon entropy of the secondary structure at a single position, 2.40 bits (about 5.6% of it). We can therefore conclude that the hydrophilic-hydrophobic-neutral pattern of a sequence fragment of length 5 is not sufficient by itself to predict the secondary structure at the central position.

Table 2. Shannon entropy of the sequence segments and the structure segments of length N, and their mutual information, in bits.

window length (N)     1         2         3         4         5
sequence-sequence     1.477159  2.953536  4.428144  5.902314  7.375734
structure-structure   2.404460  3.754270  5.019555  6.246476  7.457867
sequence-structure    0.035734  0.097584  0.193236  0.339890  0.665515

Fig. 4. Shannon entropy of the sequence and the structure segments of length N.

If, instead of the secondary structure at a particular position, we consider the entire secondary structure inside a window of length N and compute its mutual information with the sequence, we obtain the values shown in Table 2 and Fig. 5. The mutual information between sequence and sequence, and between structure and structure, that is, the Shannon entropies, were also computed and are shown in Table 2 and Fig. 4. The entropy values for N = 1 correspond to the d = 0 case of the mutual information between sequence positions and between structure positions obtained above.

The Shannon entropy of the sequence is almost proportional to the length, which shows once again that the hydrophilicity or hydrophobicity at one position is nearly independent of that at neighboring positions. As seen above, the correlation between neighboring positions cannot be neglected for the secondary structure, so the Shannon entropy of the structure pattern is roughly a linear function of the length but is not proportional to it. Fig. 5 shows that the mutual information between the sequence pattern and the structure pattern increases faster than linearly with the length. For length 5 it is 0.666 bits, roughly 8.9% of the Shannon entropy of the secondary structure pattern, 7.46 bits.
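The trends described above can be checked directly from the Table 2 values. The following Python sketch is illustrative: the variable names are assumptions, and the numbers are copied from Table 2.

```python
# Values copied from Table 2 (bits), indexed by window length N = 1..5.
seq_entropy = [1.477159, 2.953536, 4.428144, 5.902314, 7.375734]
str_entropy = [2.404460, 3.754270, 5.019555, 6.246476, 7.457867]
mutual_info = [0.035734, 0.097584, 0.193236, 0.339890, 0.665515]

# Share of the structure-pattern entropy accounted for by the sequence pattern.
fractions = [mi / ht for mi, ht in zip(mutual_info, str_entropy)]

for n, (hs, ht, frac) in enumerate(zip(seq_entropy, str_entropy, fractions), 1):
    # hs / n stays nearly constant when sequence positions are independent;
    # ht / n decreases when neighbouring structure positions are correlated.
    print(f"N={n}: H_seq/N={hs / n:.4f}  H_str/N={ht / n:.4f}  I/H_str={frac:.2%}")
```

The per-residue sequence entropy stays close to its N = 1 value, the per-residue structure entropy falls with N, and the ratio of mutual information to structure entropy rises with N, reaching about 8.9% at N = 5.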

Fig. 5. Mutual information between the sequence and the structure segments of length N.

IV. Conclusion

In this paper, we used mutual information to examine the correlation between a protein sequence, expressed by classifying the amino acids into three classes (hydrophilic, hydrophobic, and neutral), and the secondary structure of the protein, classified into eight classes. Specifically, we computed the mutual information between sequence positions and between structure positions a distance d apart, the mutual information between a sequence of length N and the secondary structure at position d from its center, and the Shannon entropies and mutual information of sequences and secondary structures of length N. We found that each sequence position is practically independent of its neighbors, whereas the secondary structure, as expected, shows considerable correlation between neighboring positions.

When the secondary structure at a single position is considered, the mutual information between sequence and secondary structure is far smaller than the Shannon entropy of the secondary structure. This shows that the hydrophilic-hydrophobic-neutral pattern of the sequence alone is not sufficient to predict the secondary structure. On the other hand, for a window of length 5, the mutual information with the entire secondary structure inside the window approaches 10% of the Shannon entropy of the secondary structure.

Many secondary structure prediction algorithms [1–9] feed the so-called sequence profile, obtained from a multiple alignment of sequences, into a pattern recognition algorithm, and they obtain good results. Examining in detail what important information the sequence profile contains beyond the degree of hydrophilicity of the amino acids is a very interesting research topic for future work.

Acknowledgments

This work was supported by a Korea Research Foundation grant funded by the Korean Government in 2006 (Ministry of Education and Human Resources Development, Basic Research Promotion Fund) (KRF-2006-003-C00133).

References

[1] B. Rost and C. Sander, J. Mol. Biol. 232, 584 (1993).

[2] D. Jones, J. Mol. Biol. 292, 195 (1999).

[3] M. Ouali and R. King, Protein Science 9, 1162 (1999).

[4] R. Adamczak, A. Porollo and J. Meller, Proteins 59, 467 (2005).

[5] S. Hua and Z. Sun, J. Mol. Biol. 308, 397 (2001).

[6] K. Kim and H. Park, Protein Eng. 16, 553 (2003).

[7] K. Joo, Julian Lee, S.-Y. Kim, I. Kim, S. J. Lee and J. Lee, J. Korean Phys. Soc. 44, 599 (2004).

[8] K. Joo, I. Kim, Julian Lee, S.-Y. Kim, S. J. Lee and J. Lee, J. Korean Phys. Soc. 45, 1441 (2004).

[9] G. Pollastri and A. McLysaght, Bioinformatics 21, 1719 (2004).

[10] D. Baker and A. Sali, Science 294, 93 (2001).

[11] A. M. Lesk et al., Proteins 45 (S5), 98 (2001); P. Aloy et al., Proteins 53, 436 (2003); J. J. Vincent et al., Proteins 61 (S7), 67 (2005).

[12] K. T. Simons, C. Kooperberg, E. Huang and D. Baker, J. Mol. Biol. 268, 209 (1997); C. Rohl, C. Strauss, K. Misura and D. Baker, Methods Enzymol. 383, 66 (2004).

[13] D. T. Jones, Proteins 45 (S5), 127 (2001); Proteins 61 (S7), 143 (2005).

[14] G. Chikenji, Y. Fujitsuka and S. Takada, J. Chem. Phys. 119, 6895 (2003); Y. Fujitsuka, G. Chikenji and S. Takada, Proteins 62, 381 (2006).

[15] G. Chikenji, Y. Fujitsuka and S. Takada, Proc. Natl. Acad. Sci. U.S.A. 103, 3141 (2006).

[16] J. Lee, S.-Y. Kim, K. Joo, I. Kim and J. Lee, Proteins 56, 704 (2004).

[17] J. Lee, S.-Y. Kim and J. Lee, Biophys. Chem. 115, 209 (2005).

[18] J. Lee, S.-Y. Kim and J. Lee, J. Korean Phys. Soc. 46, 707 (2005).

[19] S.-Y. Kim, W. Lee and J. Lee, J. Chem. Phys. 125, 194908 (2006).

[20] K. A. Dill, Biochemistry 24, 1501 (1985).

[21] K. F. Lau and K. A. Dill, Macromolecules 22, 3986 (1989).

[22] C. J. Camacho and D. Thirumalai, Phys. Rev. Lett. 71, 2505 (1993).

[23] H. S. Chan and K. A. Dill, J. Chem. Phys. 100, 9238 (1994).

[24] H. Li, R. Helling and C. Tang, Science 273, 666 (1996); C. Tang, Physica A 288, 31 (2000).

[25] R. Mélin, H. Li, N. S. Wingreen and C. Tang, J. Chem. Phys. 110, 1252 (1999).

[26] T. Wang, J. Miller, N. S. Wingreen, C. Tang and K. A. Dill, J. Chem. Phys. 113, 8329 (2000).

[27] N. D. Socci and J. N. Onuchic, J. Chem. Phys. 101, 1519 (1994).

[28] N. D. Socci and J. N. Onuchic, J. Chem. Phys. 103, 4732 (1995).

[29] J. D. Honeycutt and D. Thirumalai, Proc. Natl. Acad. Sci. U.S.A. 87, 3526 (1990).

[30] J. D. Honeycutt and D. Thirumalai, Biopolymers 32, 695 (1992).

[31] F. H. Stillinger, T. Head-Gordon and C. L. Hirshfeld, Phys. Rev. E 48, 1469 (1993).

[32] F. H. Stillinger and T. Head-Gordon, Phys. Rev. E 52, 2872 (1995).

[33] S.-Y. Kim, S. J. Lee and J. Lee, J. Chem. Phys. 119, 10274 (2003).

[34] S.-Y. Kim, S. J. Lee and J. Lee, J. Korean Phys. Soc. 44, 589 (2004).

[35] J. Lee, J. Korean Phys. Soc. 45, 1450 (2004).

[36] S.-Y. Kim, S. B. Lee and J. Lee, Phys. Rev. E 72, 011916 (2005).

[37] A. G. Murzin, S. E. Brenner, T. Hubbard and C. Chothia, J. Mol. Biol. 247, 247 (1995); A. Andreeva, D. Howorth, S. E. Brenner, T. J. P. Hubbard, C. Chothia and A. G. Murzin, Nucleic Acids Res. 32, 226 (2004).

[38] J.-M. Chandonia, G. Hon, N. S. Walker, L. Lo Conte, P. Koehl, M. Levitt and S. E. Brenner, Nucleic Acids Res. 32, 189 (2004).

[39] W. Kabsch and C. Sander, Biopolymers 22, 2577 (1983).
