Block-Constraint Robust Principal Component Analysis and its Application to Integrated Analysis of TCGA Data

Vol. 15, No. 6

The Cancer Genome Atlas (TCGA) dataset offers new opportunities to study, systematically and comprehensively, the biological mechanisms of cancer formation, growth, and metastasis. Because TCGA includes heterogeneous data, mining meaningful information from it is one of the bottlenecks in bioinformatics. In this paper, we propose Block-Constraint Robust Principal Component Analysis (BCRPCA), a modified RPCA-based method, to improve the performance of Robust Principal Component Analysis (RPCA) on these heterogeneous data. Because different categories of data have different characteristics, BCRPCA enforces a different constraint intensity on each category. First, BCRPCA decomposes the observation matrix of TCGA data into the sum of two matrices, A and S. Second, a ranking scheme scores every feature, and the features are projected onto genes. The genes with high scores are then identified as differentially expressed. The main contributions of this paper are as follows: first, it proposes, for the first time, the idea and method of BCRPCA for modeling TCGA data; second, it provides a BCRPCA-based framework for the integrated analysis of TCGA data. The results show that the method is effective and well suited to analyzing these data.
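
The decompose-then-rank pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the BCRPCA objective, so everything below is an assumption: A is taken to be low-rank and S sparse, as is usual in RPCA; each data category is assumed to occupy a contiguous range of columns (`blocks`); the "constraint intensity" is modeled as a per-block weight (`lambdas`) on the L1 penalty of S; the solver is a generic inexact augmented Lagrangian iteration; and the ranking score is simply the column sum of |S|. All function and parameter names are hypothetical.

```python
import numpy as np

def soft_threshold(X, tau):
    """Elementwise shrinkage: sign(x) * max(|x| - tau, 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def block_weighted_rpca(D, blocks, lambdas, tol=1e-7, max_iter=500):
    """Split D into A + S with a different L1 weight per column block.

    Assumed model (not taken from the paper):
        minimize ||A||_* + sum_b lambdas[b] * ||S[:, block b]||_1
        subject to D = A + S
    blocks  : list of (start, stop) column ranges, one per data category
    lambdas : one sparsity weight per block
    """
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    # Columns outside every block fall back to the common RPCA default weight.
    Lam = np.full((m, n), 1.0 / np.sqrt(max(m, n)))
    for (start, stop), lam in zip(blocks, lambdas):
        Lam[:, start:stop] = lam

    norm_D = np.linalg.norm(D)           # Frobenius norm
    mu = 1.25 / np.linalg.norm(D, 2)     # penalty parameter, from the spectral norm
    mu_max, rho = mu * 1e10, 1.5
    A = np.zeros_like(D)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)                 # Lagrange multipliers

    for _ in range(max_iter):
        # Low-rank update: shrink the singular values by 1/mu.
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        A = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: shrink each entry by its block's weight over mu.
        S = soft_threshold(D - A + Y / mu, Lam / mu)
        R = D - A - S                    # constraint residual
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R) <= tol * norm_D:
            break
    return A, S

def rank_features(S, top_k):
    """Score each column (feature) by the sum of |S| over rows (samples)."""
    scores = np.abs(S).sum(axis=0)
    order = np.argsort(scores)[::-1]     # highest score first
    return order[:top_k], scores

# Synthetic demo: 40 samples, two hypothetical categories of 15 features each.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
sparse = np.zeros((40, 30))
sparse[:, [3, 20]] = 4.0 * rng.standard_normal((40, 2))  # two perturbed features
D = low_rank + sparse

A, S = block_weighted_rpca(D, blocks=[(0, 15), (15, 30)], lambdas=[0.15, 0.25])
top, scores = rank_features(S, top_k=5)
print("top-ranked feature indices:", top)
```

In this sketch a larger weight for a block pushes more of that block's signal into A and leaves S sparser there, which is one plausible reading of applying different constraint intensities to different categories. Mapping the ranked feature indices back to gene identifiers would need an annotation table, which is omitted here.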

