Prediction of condition-specific regulatory genes using machine learning


Abstract

Recent advances in genomic technologies have generated data on large-scale protein–DNA interactions and open chromatin regions for many eukaryotic species. How to identify condition-specific functions of transcription factors using these data has become a major challenge in genomic research. To solve this problem, we have developed a method called ConSReg, which provides a novel approach to integrate regulatory genomic data into predictive machine learning models of key regulatory genes. Using Arabidopsis as a model system, we tested our approach to identify regulatory genes in data sets from single cell gene expression and from abiotic stress treatments. Our results showed that ConSReg accurately predicted transcription factors that regulate differentially expressed genes with an average auROC of 0.84, which is 23.5–25% better than enrichment-based approaches. To further validate the performance of ConSReg, we analyzed an independent data set related to plant nitrogen responses. ConSReg provided better rankings of the correct transcription factors in 61.7% of cases, which is three times better than other plant tools. We applied ConSReg to Arabidopsis single cell RNA-seq data, successfully identifying candidate regulatory genes that control cell wall formation. Our methods provide a new approach to define candidate regulatory genes using integrated genomic data in plants.
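The auROC figure quoted above is a ranking metric: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The abstract does not describe how the evaluation was set up, so the following is only a minimal sketch of the metric itself. The function name `auroc` and the toy scores and labels are hypothetical and are not taken from ConSReg or from the article's data.

```python
def auroc(scores, labels):
    """Area under the ROC curve, computed as the Mann-Whitney pair statistic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six model scores, label 1 = positive, 0 = negative.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
print(round(toy_auroc, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives some context for the reported average of 0.84.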


Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 19 June 2020

Publication date (Electronic): 24 April 2020

Publication date PMC-release: 24 April 2020

Volume: 48

Issue: 11

Page: e62

Affiliations

[1] Graduate program in Genetics, Bioinformatics and Computational Biology, Virginia Tech, Blacksburg, VA 24061, USA

[2] School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA

[3] Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA

Author notes

To whom correspondence should be addressed. Tel: +1 540 231 2756; Email: songli@vt.edu

Author information

Song Li http://orcid.org/0000-0002-8133-3944

Article

Publisher ID: gkaa264

DOI: 10.1093/nar/gkaa264

PMC ID: 7293043

PubMed ID: 32329779

SO-VID: f019ae71-2679-4ae7-8b53-f0d49e956378

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

History

Date accepted : 20 April 2020

Date revision received : 19 February 2020

Date received : 06 November 2019

Page count

Pages: 17

Funding

Funded by: Jeffress Trust, DOI 10.13039/100006990;

Funded by: United States Department of Energy;

Award ID: DE-SC0020358

Funded by: United States Department of Agriculture, DOI 10.13039/100000199;

