
B. Comparison of gene co-expression association measures

IV. Discussion

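This study examined how well condition-specific PPIs can be predicted from microarray data. To this end, we first constructed a condition-specific PPI dataset and then comprehensively compared a range of microarray preprocessing methods and gene co-expression measures.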

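Indeed, for ESC, gene co-expression among condition-specific PPIs was higher than among general PPIs, random pairs, and NIPs, and the difference was significant relative to both random pairs and general PPIs. In particular, the comparison of ESC-specific PPIs against random pairs was significant for more method combinations (normalization and association measure) than the comparison against general PPIs, and the most significant P-values were also obtained against random pairs (mouse: 29 combinations, P = 4.2E-15; human: 20 combinations, P = 2.7E-06).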
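The comparison against random pairs can be illustrated with a minimal sketch. It assumes absolute Pearson correlation as the co-expression measure and an empirical resampling test; the statistical test actually used in this study is not stated in this section, and the function names and toy expression values below are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treat as no co-expression
    return sxy / (sxx * syy) ** 0.5

def mean_coexpression(expr, pairs):
    """Mean absolute correlation over a list of gene pairs."""
    return sum(abs(pearson(expr[a], expr[b])) for a, b in pairs) / len(pairs)

def random_pair_pvalue(expr, ppi_pairs, n_draws=1000, seed=0):
    """Empirical P-value: fraction of random pair sets (same size as the
    PPI set) whose mean co-expression reaches the observed value."""
    rng = random.Random(seed)
    genes = sorted(expr)
    observed = mean_coexpression(expr, ppi_pairs)
    hits = 0
    for _ in range(n_draws):
        sample = [tuple(rng.sample(genes, 2)) for _ in ppi_pairs]
        if mean_coexpression(expr, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_draws + 1)

# Hypothetical toy data: gene -> expression across five samples.
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "g3": [5.0, 3.0, 4.0, 1.0, 2.0],
    "g4": [1.0, 5.0, 2.0, 4.0, 3.0],
}
print(random_pair_pvalue(toy_expr, [("g1", "g2")], n_draws=200))
```

The same function could be run once per normalization and association combination to count how many combinations reach significance.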

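These results are plausible, and at the same time they indicate that general and condition-specific PPIs remain similar, which suggests that general PPIs are a mixture of PPIs from many specific conditions. In fact, the condition-specific PPI dataset built in this study is a subset derived from the general PPIs.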

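Interestingly, in the comparison of association measures and normalization methods, MIk was the best-performing co-expression measure (Reshef et al., 2011; Song et al., 2012), and for mouse the best performance was obtained on raw data without additional normalization. This suggests that between-sample error had already been sufficiently corrected in the data used in this study. The next most significant normalization method was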

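data, interolog mapping could be applied to carry out further studies in ESC and other conditions. In addition, applying various discretization methods beyond the normalization of microarray data, and integrating multiple microarray datasets for a more comprehensive comparison, remain as future work.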
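To show how the choice of discretization feeds into a mutual-information-based measure, the following sketch uses equal-width binning and a plug-in MI estimate. This is an illustrative assumption, not the MIk measure evaluated in this study; the function names and the default bin count are hypothetical.

```python
import math
from collections import Counter

def discretize(values, n_bins=3):
    """Equal-width binning; returns one bin index per sample."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant profile falls into one bin
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y, n_bins=3):
    """Plug-in mutual information estimate (in bits) of two profiles."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in cxy.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) )
        mi += (c / n) * math.log2(c * n / (cx[i] * cy[j]))
    return mi
```

Replacing `discretize` with another scheme (for example equal-frequency bins) changes the estimate, which is why the discretization method is worth comparing alongside normalization.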

V. Conclusion

This study compared, across multiple methods, how well condition-specific PPIs can be predicted from microarray data, and provides reference information for predicting novel PPIs.

Consistent with previously reported gene co-expression among general PPIs, condition-specific PPIs showed stronger association than general PPIs. On this basis, appropriate methods can be recommended for predicting novel condition-specific PPIs.

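Further work should integrate PPI data from other cellular conditions and multiple GEP datasets to obtain more consistent results. Building on those results, condition-specific PPIs should be predicted by measuring co-expression for all gene pairs in the GEP data, and the predictions should then be validated experimentally.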
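The genome-wide step described above can be sketched as scoring every gene pair that is not already a known PPI and keeping the highest-scoring pairs as candidates for experimental validation. The sketch assumes absolute Pearson correlation as the score; the function names, the `top_k` cutoff, and the toy data are hypothetical, and any other co-expression measure could be substituted.

```python
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5

def rank_candidate_pairs(expr, known_pairs, top_k=3):
    """Score all gene pairs not in known_pairs; return the top_k by |r|."""
    known = {frozenset(p) for p in known_pairs}
    scored = []
    for a, b in combinations(sorted(expr), 2):
        if frozenset((a, b)) in known:
            continue  # already an annotated PPI, not a new prediction
        scored.append((abs(pearson(expr[a], expr[b])), a, b))
    scored.sort(reverse=True)
    return [(a, b, round(s, 3)) for s, a, b in scored[:top_k]]

# Hypothetical toy data: gene -> expression across four samples.
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 2.0, 4.0],
}
print(rank_candidate_pairs(toy_expr, [("g1", "g2")], top_k=2))
```

The number of pairs grows quadratically with the number of genes, so a real run over a full GEP dataset would likely need a vectorized or chunked implementation.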

References

1. Lefebvre C et al.: A context-specific network of protein-DNA and protein-protein interactions reveals new regulatory motifs in human B cells. Syst Biol Comput Proteomics, 2007

2. Song L, Langfelder P, Horvath S: Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13:328, 2012

3. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25(1):25-29, 2000

archive for functional genomics data sets--update. Nucleic Acids Res 41(Database issue):D991-995, 2013

8. Bhardwaj N, Lu H: Correlation between gene expression profiles and protein-protein interactions within and across genomes. Bioinformatics 21(11):2730-2738, 2005

9. Bolstad BM, Irizarry RA, Astrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19(2):185-193, 2003

10. Bossi A, Lehner B: Tissue specificity and the human protein interaction network. Mol Syst Biol 5:260, 2009

11. Brown KR, Jurisica I: Unequal evolutionary conservation of human protein interactions in interologous networks. Genome Biol 8(5):R95, 2007

12. Cai J, Xie D, Fan Z, Chipperfield H, Marden J et al.: Modeling co-expression across species for complex traits: insights to the difference of human and mouse embryonic stem cells. PLoS Comput Biol 6(3):e1000707, 2010

13. Chikina MD, Huttenhower C, Murphy CT, Troyanskaya OG: Global prediction of tissue-specific gene expression and context-dependent gene networks in Caenorhabditis elegans. PLoS Comput Biol 5(6):e1000417, 2009

14. Das J, Mohammed J, Yu H: Genome-scale analysis of interaction dynamics reveals organization of biological networks. Bioinformatics 28(14):1873-1878, 2012

15. Duggan DJ, Bittner M, Chen Y, Meltzer P, Trent JM: Expression profiling using cDNA microarrays. Nat Genet 21(1 Suppl):10-14, 1999

16. Ewing RM, Chu P, Elisma F, Li H, Taylor P et al.: Large-scale mapping of human protein-protein interactions by mass spectrometry. Mol Syst Biol 3:89, 2007

17. Gitter A, Carmi M, Barkai N, Bar-Joseph Z: Linking the signaling cascades and dynamic regulatory networks controlling stress responses. Genome Res 23(2):365-376, 2013

18. Guan Y, Gorenshteyn D, Burmeister M, Wong AK, Schimenti JC et al.: Tissue-specific functional networks for prioritizing phenotype and disease genes. PLoS Comput Biol 8(9):e1002694, 2012

19. Ideker T, Krogan NJ: Differential network biology. Mol Syst Biol 8:565, 2012

20. Jansen R, Greenbaum D, Gerstein M: Relating whole-genome expression data with protein-protein interactions. Genome Res 12(1):37-46, 2002

21. Lee HK, Hsu AK, Sajdak J, Qin J, Pavlidis P: Coexpression analysis of human genes across many microarray data sets. Genome Res 14(6):1085-1094, 2004

22. Lee K, Byun K, Hong W, Chuang HY, Pack CG et al.: Proteome-wide discovery of mislocated proteins in cancer. Genome Res, 2013

23. Lefebvre C, Rajbhandari P, Alvarez MJ, Bandaru P, Lim WK et al.: A human B-cell interactome identifies MYB and FOXM1 as master regulators of proliferation in germinal centers. Mol Syst Biol 6:377, 2010

24. Lim WK, Wang K, Lefebvre C, Califano A: Comparative analysis of microarray normalization procedures: effects on reverse engineering gene networks. Bioinformatics 23(13):i282-288, 2007

25. Lin CC, Hsiang JT, Wu CY, Oyang YJ, Juan HF et al.: Dynamic functional modules in co-expressed protein interaction networks of dilated cardiomyopathy. BMC Syst Biol 4:138, 2010

26. Lin J, Wilbur WJ: PubMed related articles: a probabilistic topic-based model for content similarity. BMC Bioinformatics 8:423, 2007

27. Lipscomb CE: Medical Subject Headings (MeSH). Bull Med Libr Assoc 88(3):265-266, 2000

28. Mani KM, Lefebvre C, Wang K, Lim WK, Basso K et al.: A systems biology approach to prediction of oncogenes and molecular perturbation targets in B-cell lymphomas. Mol Syst Biol 4:169, 2008

29. Perissi V, Aggarwal A, Glass CK, Rose DW, Rosenfeld MG: A corepressor/coactivator exchange complex required for transcriptional activation by nuclear receptors and other regulated transcription factors. Cell 116(4):511-526, 2004

30. Pop A, Huttenhower C, Iyer-Pascuzzi A, Benfey PN, Troyanskaya OG: Integrated functional networks of process, tissue, and developmental stage specific interactions in Arabidopsis thaliana. BMC Syst Biol 4:180, 2010

31. Priness I, Maimon O, Ben-Gal I: Evaluation of gene-expression clustering via mutual information distance measure. BMC Bioinformatics 8:111, 2007

32. Przytycka TM, Singh M, Slonim DK: Toward the dynamic interactome: it's about time. Brief Bioinform 11(1):15-29, 2010

33. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G et al.: Detecting novel associations in large data sets. Science 334(6062):1518-1524, 2011

34. Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A et al.: Towards a proteome-scale map of the human protein-protein interaction network. Nature 437(7062):1173-1178, 2005

35. Shoemaker BA, Panchenko AR: Deciphering protein-protein interactions. Part II. Computational methods to predict protein and domain interaction partners. PLoS Comput Biol 3(4):e43, 2007

36. Skrabanek L, Saini HK, Bader GD, Enright AJ: Computational prediction of protein-protein interactions. Mol Biotechnol 38(1):1-17, 2008

37. Smialowski P, Pagel P, Wong P, Brauner B, Dunger I et al.: The Negatome database: a reference set of non-interacting protein pairs. Nucleic Acids Res 38(Database issue):D540-544, 2010

38. Song L, Langfelder P, Horvath S: Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13:328, 2012

39. Sprinzak E, Altuvia Y, Margalit H: Characterization and prediction of protein-protein interactions within and between complexes. Proc Natl Acad Sci U S A 103(40):14718-14723, 2006

40. Sung MK, Lim G, Yi DG, Chang YJ, Yang EB et al.: Genome-wide bimolecular fluorescence complementation analysis of SUMO interactome in yeast. Genome Res 23(4):736-746, 2013

41. Torkamani A, Dean B, Schork NJ, Thomas EA: Coexpression network analysis of neural tissue reveals perturbations in developmental processes in schizophrenia. Genome Res 20(4):403-412, 2010

42. Turner B, Razick S, Turinsky AL, Vlasblom J, Crowdy EK et al.: iRefWeb: interactive analysis of consolidated protein interaction data and their supporting evidence. Database (Oxford) 2010:baq023, 2010

43. von Mering C, Krause R, Snel B, Cornell M, Oliver SG et al.: Comparative assessment of large-scale data sets of protein-protein interactions. Nature 417(6887):399-403, 2002

44. Wang J, Huang Q, Liu ZP, Wang Y, Wu LY et al.: NOA: a novel Network Ontology Analysis method. Nucleic Acids Res 39(13):e87, 2011

45. Wang K, Saito M, Bisikirska BC, Alvarez MJ, Lim WK et al.: Genome-wide identification of post-translational modulators of transcription factor activity in human B cells. Nat Biotechnol 27(9):829-839, 2009

46. Xulvi-Brunet R, Li H: Co-expression networks: graph properties and topological comparisons. Bioinformatics 26(2):205-214, 2010

47. Yoon D, Kim H, Suh-Kim H, Park RW, Lee K: Differentially co-expressed interacting protein pairs discriminate samples under distinct stages of HIV type 1 infection. BMC Syst Biol 5 Suppl 2:S1, 2011

48. Yousefi M, Hajihoseini V, Jung W, Hosseinpour B, Rassouli H et al.: Embryonic stem cell interactomics: the beginning of a long road to biological function. Stem Cell Rev 8(4):1138-1154, 2012

49. Zhao W, Langfelder P, Fuller T, Dong J, Li A et al.: Weighted gene coexpression network analysis: state of the art. J Biopharm Stat 20(2):281-300, 2010

- ABSTRACT -

Protein-protein interactions (PPIs) are core functional units within cells that form vast networks and carry out specific functions. The functional state of each cell depends on the configuration of its PPI network. Most publicly available PPI databases to date do not annotate the cellular condition, even though each experimentally assayed PPI was observed under its own cellular condition. The absence of condition annotation for specific PPIs limits the accurate prediction of near-physiological PPIs.

On the other hand, many gene expression profile (GEP) datasets are available for various cellular conditions, but there have been few attempts to predict condition-specific PPIs from GEP data.

In this study, condition-specific PPIs (for stem cells and tumor cells) were built by integrating public PPI databases and curating the literature. Using these condition-specific PPIs, several normalization methods and gene co-expression measures were comprehensively compared in terms of predicting embryonic stem cell (ESC)-specific PPIs. This study aims to propose a proper methodology for predicting condition-specific PPIs from GEP data.

Abstracts of the individual papers from which generally known PPIs were derived were searched using stem cell and cancer keywords and manually curated to annotate stem cell and cancer subtype details. For GEP data, we collected profiles of human and mouse stem cells from NCBI GEO. We then preprocessed the collected GEP data with six different normalization methods and calculated the degree of co-expression of ESC-specific and general PPIs accordingly. Finally, we compared the calculated co-expression in an unbiased manner and identified the co-expression measures that most enlarged the difference between ESC-specific and general PPIs.

The literature-curated condition-specific PPI database comprises 4,161 proteins and 8,347 interactions in total, including 371 proteins and 603 interactions for ESC, the largest set among the stem cell types. For human, quantile normalization combined with the MIk measure, and for mouse, data without further normalization combined with the MIk measure, were the most significant methods for ESC-specific PPIs. We also found that Boundary-normalized data and the MI-based co-expression measures (MI, MIk) showed significant differences relative to general PPIs.

In this paper, we first constructed condition-specific PPIs and then comprehensively compared the co-expression measures and normalization methods, which could assist in predicting genome-wide, condition-specific PPIs directly from GEP data. Our data could also identify PPIs particular to a given condition when compared with general and other conditional PPIs, and such a network may reveal in-depth mechanisms of the cellular condition of interest. The constructed database of condition-specific PPIs can be used as a reference for predicting novel PPIs from GEP data of stem cells and cancers.

Keywords: Protein-protein interactions (PPIs), Context/Condition-specific PPIs, Gene expression profiles, Gene co-expression, Pair association, Mutual information, Quantile normalization, Embryonic stem cell (ESC), ESC-specific PPIs
