      Regularized gene selection in cancer microarray meta-analysis

      BMC Bioinformatics
      BioMed Central


          Abstract

          Background

          In cancer studies, multiple microarray experiments are often conducted to measure the same clinical outcome and the expression of the same set of genes. An important goal of such experiments is to identify a subset of genes that can potentially serve as predictive markers for cancer development and progression. Analyses of individual experiments may lead to unreliable gene selection because sample sizes are small. Meta-analysis can pool multiple experiments, increase statistical power, and achieve more reliable gene selection. However, the meta-analysis of cancer microarray data is challenging because gene expression data are high-dimensional and experimental settings differ among experiments.

          Results

          We propose a Meta Threshold Gradient Descent Regularization (MTGDR) approach for gene selection in the meta-analysis of cancer microarray data. The MTGDR has several advantages over existing approaches: it allows different experiments to have different experimental settings, it can account for the joint effects of multiple genes on cancer, and it can select the same set of cancer-associated genes across multiple experiments. Simulation studies and analyses of multiple pancreatic and liver cancer experiments demonstrate the superior performance of the MTGDR.
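The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of a meta threshold gradient descent scheme. It assumes a binary outcome with a logistic likelihood, one coefficient vector per experiment (so effect sizes may differ across experiments), and a single pooled gradient that decides which genes are updated in every experiment at once. The function names, the toy data, and the parameters `tau`, `step`, and `n_iter` are illustrative assumptions, not taken from the article.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def loglik_gradient(X, y, beta):
    """Gradient of the logistic log-likelihood for one study."""
    grad = [0.0] * len(beta)
    for row, label in zip(X, y):
        resid = label - sigmoid(sum(b * x for b, x in zip(beta, row)))
        for j, x in enumerate(row):
            grad[j] += resid * x
    return grad

def mtgdr_sketch(studies, tau=0.9, step=0.01, n_iter=200):
    """studies: list of (X, y) pairs that share the same gene columns.
    Returns one coefficient vector per study (hypothetical interface)."""
    d = len(studies[0][0][0])
    betas = [[0.0] * d for _ in studies]
    for _ in range(n_iter):
        grads = [loglik_gradient(X, y, b) for (X, y), b in zip(studies, betas)]
        # Pool evidence across studies: sum per-study gradients gene by gene.
        meta = [sum(g[j] for g in grads) for j in range(d)]
        cutoff = tau * max(abs(v) for v in meta)
        # Threshold step: only genes whose pooled gradient is close to the
        # largest one are updated, and they are updated in every study.
        active = [abs(v) >= cutoff for v in meta]
        for beta, g in zip(betas, grads):
            for j in range(d):
                if active[j]:
                    beta[j] += step * g[j]
    return betas

# Toy data (made up): gene 0 drives the outcome in both studies,
# but study B is measured on a different scale.
study_a = ([[1.0, 0.2, -0.1], [0.8, -0.3, 0.2],
            [-1.0, 0.1, 0.1], [-0.9, -0.2, -0.2]], [1, 1, 0, 0])
study_b = ([[2.0, 0.1, 0.3], [1.5, -0.1, -0.2],
            [-2.2, 0.2, 0.1], [-1.8, -0.2, -0.1]], [1, 1, 0, 0])

betas = mtgdr_sketch([study_a, study_b], n_iter=50)
selected = [{j for j, b in enumerate(beta) if b != 0.0} for beta in betas]
print(selected)
```

In this sketch the threshold `tau` controls sparsity (values near 1 update only the genes with the strongest pooled signal), and the number of iterations acts as the regularization parameter. In practice both would presumably be tuned, for example by cross-validation.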

          Conclusion

          The MTGDR provides an effective way of analyzing multiple cancer microarray studies and selecting reliable cancer-associated genes.

          Most cited references (31)

          RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis.

          While meta-analysis provides a powerful tool for analyzing microarray experiments by combining data from multiple studies, it presents unique computational challenges. The Bioconductor package RankProd provides a new and intuitive tool for this purpose in detecting differentially expressed genes under two experimental conditions. The package modifies and extends the rank product method proposed by Breitling et al., [(2004) FEBS Lett., 573, 83-92] to integrate multiple microarray studies from different laboratories and/or platforms. It offers several advantages over t-test based methods and accepts pre-processed expression datasets produced from a wide variety of platforms. The significance of the detection is assessed by a non-parametric permutation test, and the associated P-value and false discovery rate (FDR) are included in the output alongside the genes that are detected by user-defined criteria. A visualization plot is provided to view actual expression levels for each gene with estimated significance measurements. RankProd is available at Bioconductor http://www.bioconductor.org. A web-based interface will soon be available at http://cactus.salk.edu/RankProd
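The rank product statistic mentioned above can be illustrated with a short sketch. This is not the RankProd package or its API (the package itself is distributed through Bioconductor); it is a minimal Python illustration of the underlying idea, assuming one list of log fold changes per study. The function names, the toy values, and the simple pooled permutation scheme are assumptions made for illustration.

```python
import math
import random

def ranks_descending(values):
    """Rank 1 = largest value (ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def rank_product(fold_changes):
    """fold_changes[s][g]: log fold change of gene g in study s.
    Returns the geometric mean of each gene's ranks across studies."""
    k = len(fold_changes)
    per_study = [ranks_descending(fc) for fc in fold_changes]
    n_genes = len(fold_changes[0])
    return [math.exp(sum(math.log(r[g]) for r in per_study) / k)
            for g in range(n_genes)]

def permutation_pvalues(fold_changes, n_perm=1000, seed=0):
    """Non-parametric p-values: shuffle gene labels within each study and
    count how often a null rank product is at least as small as observed."""
    rng = random.Random(seed)
    observed = rank_product(fold_changes)
    n_genes = len(observed)
    counts = [0] * n_genes
    for _ in range(n_perm):
        shuffled = [rng.sample(fc, len(fc)) for fc in fold_changes]
        null = rank_product(shuffled)
        for g, obs in enumerate(observed):
            counts[g] += sum(1 for v in null if v <= obs)
    return [c / (n_perm * n_genes) for c in counts]

# Toy data (made up): gene 0 is consistently up-regulated in all studies.
fc = [[2.0, 0.1, -0.5, 0.3],
      [1.8, -0.2, 0.4, 0.0],
      [2.5, 0.3, -0.1, 0.2]]
rp = rank_product(fc)
pvals = permutation_pvalues(fc, n_perm=200)
print(rp, pvals)
```

Because the statistic uses only within-study ranks, studies measured on different platforms or scales can be combined without putting their raw expression values on a common scale.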

            Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data.

            An important application of microarray technology is to relate gene expression profiles to various clinical phenotypes of patients. Success has been demonstrated in molecular classification of cancer in which the gene expression data serve as predictors and different types of cancer serve as a categorical outcome variable. However, there has been less research in linking gene expression profiles to the censored survival data such as patients' overall survival time or time to cancer relapse. It would be desirable to have models with good prediction accuracy and parsimony property. We propose to use the L(1) penalized estimation for the Cox model to select genes that are relevant to patients' survival and to build a predictive model for future prediction. The computational difficulty associated with the estimation in the high-dimensional and low-sample size settings can be efficiently solved by using the recently developed least-angle regression (LARS) method. Our simulation studies and application to real datasets on predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed procedure, which we call the LARS-Cox procedure, can be used for identifying important genes that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. The LARS-Cox regression gives better predictive performance than the L(2) penalized regression and a few other dimension-reduction based methods. We conclude that the proposed LARS-Cox procedure can be very useful in identifying genes relevant to survival phenotypes and in building a parsimonious predictive model that can be used for classifying future patients into clinically relevant high- and low-risk groups based on the gene expression profile and survival times of previous patients.
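For orientation, the L1-penalized Cox estimator described above can be written in its standard constrained form. The notation is an assumption introduced here, not taken from the abstract: $x_i$ is the expression vector of patient $i$, $\delta_i$ the event indicator, $R(t_i)$ the set of patients still at risk at time $t_i$, and $s$ the tuning parameter that controls how many genes enter the model.

```latex
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Bigl[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\bigl(x_j^{\top}\beta\bigr) \Bigr],
\qquad
\hat{\beta} = \arg\max_{\beta}\, \ell(\beta)
\quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le s .
```

Smaller values of $s$ force more coefficients to be exactly zero, which is what gives the parsimony property mentioned in the abstract.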

              Molecular profiling of pancreatic adenocarcinoma and chronic pancreatitis identifies multiple genes differentially regulated in pancreatic cancer.

              The molecular basis of pancreatic cancer is not understood. Previous attempts to determine the specific genes expressed in pancreatic cancer have been hampered by similarities between adenocarcinoma and chronic pancreatitis. In the current study, microarrays (Affymetrix) were used to profile gene expression in pancreatic adenocarcinoma (10), pancreatic cancer cell lines (7), chronic pancreatitis (5), and normal pancreas (5). Molecular profiling indicated a large number of genes differentially expressed between pancreatic cancer and normal pancreas but many fewer differences between pancreatic cancer and chronic pancreatitis, likely because of the shared stromal influences in the two diseases. To specifically identify genes expressed in neoplastic epithelium, we selected genes more highly expressed (>2-fold, p < 0.01) in adenocarcinoma compared with both normal pancreas and chronic pancreatitis and which were also highly expressed in pancreatic cancer cell lines. This strategy yielded 158 genes, of which 124 were not previously associated with pancreatic cancer. Quantitative-reverse transcription-PCR for two molecules, S100P and 14-3-3sigma, validated the microarray data. Support for the success of the neoplastic cell gene expression identification strategy was obtained by immunocytochemical localization of four representative genes, 14-3-3sigma, S100P, S100A6, and beta4 integrin, to neoplastic cells in pancreatic tumors. Thus, comparisons between pancreatic adenocarcinoma, pancreatic cancer cell lines, normal pancreas, and chronic pancreatitis have identified genes that are selectively expressed in the neoplastic epithelium of pancreatic adenocarcinoma. These data provide new insights into the molecular pathology of pancreatic cancer that may be useful for detection, diagnosis, and treatment.

                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central
                ISSN: 1471-2105
                Published: 1 January 2009
                Citation: BMC Bioinformatics 2009, 10:1
                Affiliations
                [1 ]Department of Epidemiology and Public Health, Yale University, New Haven, CT 06520, USA
                [2 ]Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242, USA
                Article
                Publisher ID: 1471-2105-10-1
                DOI: 10.1186/1471-2105-10-1
                PMC ID: 2631520
                PubMed ID: 19118496
                ScienceOpen ID: 1c366b28-cb71-49df-915f-43f459dff429
                Copyright © 2009 Ma and Huang; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 18 September 2008
                Accepted: 1 January 2009
                Categories
                Methodology Article

                Bioinformatics & Computational biology