35
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      A novel approach for data integration and disease subtyping

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Advances in high-throughput technologies allow for measurements of many types of omics data, yet the meaningful integration of several different data types remains a significant challenge. Another important and difficult problem is the discovery of molecular disease subtypes characterized by relevant clinical differences, such as survival. Here we present a novel approach, called perturbation clustering for data integration and disease subtyping (PINS), which is able to address both challenges. The framework has been validated on thousands of cancer samples, using gene expression, DNA methylation, noncoding microRNA, and copy number variation data available from the Gene Expression Omnibus, the Broad Institute, The Cancer Genome Atlas (TCGA), and the European Genome-Phenome Archive. This simultaneous subtyping approach accurately identifies known cancer subtypes and novel subgroups of patients with significantly different survival profiles. The results were obtained from genome-scale molecular data without any other type of prior knowledge. The approach is sufficiently general to replace existing unsupervised clustering approaches outside the scope of bio-medical research, with the additional ability to integrate multiple types of data.

          Related collections

          Most cited references102

          • Record: found
          • Abstract: not found
          • Article: not found

          Regression Shrinkage and Selection Via the Lasso

            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking

            Summary: Unsupervised class discovery is a highly useful technique in cancer research, where intrinsic groups sharing biological characteristics may exist but are unknown. The consensus clustering (CC) method provides quantitative and visual stability evidence for estimating the number of unsupervised classes in a dataset. ConsensusClusterPlus implements the CC method in R and extends it with new functionality and visualizations including item tracking, item-consensus and cluster-consensus plots. These new features provide users with detailed information that enable more specific decisions in unsupervised class discovery. Availability: ConsensusClusterPlus is open source software, written in R, under GPL-2, and available through the Bioconductor project (http://www.bioconductor.org/). Contact: mwilkers@med.unc.edu Supplementary Information: Supplementary data are available at Bioinformatics online.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found
              Is Open Access

              Comprehensive molecular portraits of human breast tumors

              Summary We analyzed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, mRNA arrays, microRNA sequencing and reverse phase protein arrays. Our ability to integrate information across platforms provided key insights into previously-defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at > 10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the Luminal A subtype. We identified two novel protein expression-defined subgroups, possibly contributed by stromal/microenvironmental elements, and integrated analyses identified specific signaling pathways dominant in each molecular subtype including a HER2/p-HER2/HER1/p-HER1 signature within the HER2-Enriched expression subtype. Comparison of Basal-like breast tumors with high-grade Serous Ovarian tumors showed many molecular commonalities, suggesting a related etiology and similar therapeutic opportunities. The biologic finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biologic subtypes of breast cancer.
                Bookmark

                Author and article information

                Journal
                Genome Res
                Genome Res
                genome
                genome
                GENOME
                Genome Research
                Cold Spring Harbor Laboratory Press
                1088-9051
                1549-5469
                December 2017
                : 27
                : 12
                : 2025-2039
                Affiliations
                [1 ]Department of Computer Science and Engineering, University of Nevada, Reno, Nevada 89557, USA;
                [2 ]Department of Computer Science, Wayne State University, Detroit, Michigan 48202, USA;
                [3 ]Department of Obstetrics and Gynecology, Wayne State University, Detroit, Michigan 48201, USA
                Author notes
                Corresponding author: sorin@ 123456wayne.edu
                Author information
                http://orcid.org/0000-0001-8001-9470
                http://orcid.org/0000-0002-0786-8377
                Article
                9509184
                10.1101/gr.215129.116
                5741060
                29066617
                abec0f56-7e8a-4b73-9aa0-072da90188e5
                © 2017 Nguyen et al.; Published by Cold Spring Harbor Laboratory Press

                This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

                History
                : 24 August 2016
                : 13 September 2017
                Page count
                Pages: 15
                Funding
                Funded by: National Institutes of Health , open-funder-registry 10.13039/100000002;
                Award ID: R01 DK089167
                Award ID: R42 GM087013
                Funded by: National Science Foundation , open-funder-registry 10.13039/100000001;
                Award ID: DBI-0965741
                Funded by: Wayne State University , open-funder-registry 10.13039/100006710;
                Categories
                Method

                Comments

                Comment on this article

                scite_
                0
                0
                0
                0
                Smart Citations
                0
                0
                0
                0
                Citing PublicationsSupportingMentioningContrasting
                View Citations

                See how this article has been cited at scite.ai

                scite shows how a scientific paper has been cited by providing the context of the citation, a classification describing whether it supports, mentions, or contrasts the cited claim, and a label indicating in which section the citation was made.

                Similar content175

                Cited by70

                Most referenced authors3,560