Abstract
Background: High-throughput gene expression profiles have allowed the discovery of potential biomarkers, enabling early diagnosis, prognosis, and the development of individualized treatment. However, it remains a challenge to identify a set of reliable and reproducible biomarkers across various gene expression platforms and laboratories for single-sample diagnosis and prognosis. We address this need with our Data-Driven Reference (DDR) approach, which employs stably expressed housekeeping genes as references to eliminate platform-specific biases and non-biological variability.
Results: Our method identifies biomarkers with “built-in” features, and these features can be interpreted consistently regardless of profiling technology, which enables classification of single samples independent of platform. Validation with RNA-seq data of blood platelets shows that DDR achieves superior performance in the classification of six different tumor types as well as molecular target statuses (such as MET- or HER2-positive, and mutant KRAS, EGFR or PIK3CA) with smaller sets of biomarkers. We demonstrate on three microarray datasets that our method is capable of identifying robust biomarkers for subgrouping medulloblastoma samples despite data perturbation due to different microarray platforms. In addition to identifying the majority of subgroup-specific biomarkers in the nanoString CodeSet, our method detected some potential new biomarkers for subgrouping medulloblastoma.
Conclusions: In this study, we present a simple yet powerful data-driven method which contributes significantly to the identification of robust cross-platform gene signatures for single-patient disease classification to facilitate precision medicine. In addition, our method provides a new strategy for transcriptome analysis.
Original language | English |
---|---|
Journal | BMC Bioinformatics |
Volume | 20 |
Issue number | 601 |
DOIs | |
Publication status | Published - 2019 |