HISAT: a fast spliced aligner with low memory requirements

Abstract

HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ∼64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.
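The two-level idea in the abstract can be illustrated with a toy sketch. This is not HISAT's implementation: the class names `FMIndex` and `HierarchicalIndex`, the method `align_spliced`, and the naive suffix sorting are assumptions made for illustration and suit only very short sequences. The sketch builds one FM index over a whole sequence to anchor the first part of a read, then searches for the rest of the read only in the small local index covering the region where the anchor ends.

```python
class FMIndex:
    """Toy FM index built by naive suffix sorting (illustration only)."""

    def __init__(self, text):
        assert "$" not in text
        self.text = text + "$"  # "$" sorts before A, C, G, T
        self.sa = sorted(range(len(self.text)), key=lambda i: self.text[i:])
        # Burrows-Wheeler transform: the character preceding each sorted suffix
        self.bwt = "".join(self.text[i - 1] for i in self.sa)
        counts = {}
        for ch in self.text:
            counts[ch] = counts.get(ch, 0) + 1
        # first[c]: number of characters in the text strictly smaller than c
        self.first = {}
        total = 0
        for ch in sorted(counts):
            self.first[ch] = total
            total += counts[ch]
        # occ[c][i]: occurrences of c in bwt[:i]
        self.occ = {ch: [0] * (len(self.bwt) + 1) for ch in counts}
        for i, b in enumerate(self.bwt):
            for ch in counts:
                self.occ[ch][i + 1] = self.occ[ch][i] + (1 if b == ch else 0)

    def find(self, pattern):
        """Return sorted start positions of exact matches (backward search)."""
        lo, hi = 0, len(self.bwt)
        for ch in reversed(pattern):
            if ch not in self.first:
                return []
            lo = self.first[ch] + self.occ[ch][lo]
            hi = self.first[ch] + self.occ[ch][hi]
            if lo >= hi:
                return []
        return sorted(self.sa[i] for i in range(lo, hi))


class HierarchicalIndex:
    """Hypothetical two-level index: one global FM index plus local ones."""

    def __init__(self, genome, local_size):
        self.local_size = local_size
        self.global_index = FMIndex(genome)
        self.local_indexes = [
            FMIndex(genome[start:start + local_size])
            for start in range(0, len(genome), local_size)
        ]

    def align_spliced(self, anchor, extension):
        """Anchor globally, then extend within the local region only."""
        hits = []
        for pos in self.global_index.find(anchor):
            anchor_end = pos + len(anchor)
            region = anchor_end // self.local_size
            if region >= len(self.local_indexes):
                continue
            offset = region * self.local_size
            for local_pos in self.local_indexes[region].find(extension):
                ext_pos = offset + local_pos
                if ext_pos >= anchor_end:  # extension lies downstream
                    hits.append((pos, ext_pos))
        return hits
```

For example, with a 32-base filler followed by two short "exons" separated by a 12-base "intron", `HierarchicalIndex(genome, local_size=32).align_spliced("CCGTTGCA", "TGACTAGG")` anchors the first exon through the global index and finds the second exon through a single local index. This toy version misses extensions that fall outside the anchor's own region, so it should be read only as a picture of the global-anchor, local-extend division of labour. As a scale check on the figures in the abstract, 48,000 local indexes of about 64,000 bp each cover roughly 3.07 billion bases, which is on the order of the human genome.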

Most cited references 7

Computational methods for transcriptome annotation and quantification using RNA-seq.

Manuel Garber, Manfred Grabherr, Mitchell Guttman … (2011)

High-throughput RNA sequencing (RNA-seq) promises a comprehensive picture of the transcriptome, allowing for the complete annotation and quantification of all genes and their isoforms across samples. Realizing this promise requires increasingly complex computational methods. These computational challenges fall into three main categories: (i) read mapping, (ii) transcriptome reconstruction and (iii) expression quantification. Here we explain the major conceptual and practical challenges, and the general classes of solutions for each category. Finally, we highlight the interdependence between these categories and discuss the benefits for different biological applications.

Cited 408 times

Is Open Access

Systematic evaluation of spliced alignment programs for RNA-seq data

Pär G. Engström, Tamara Steijger, Botond Sipos … (2013)

High-throughput RNA sequencing is an increasingly accessible method for studying gene structure and activity on a genome-wide scale. A critical step in RNA-seq data analysis is the alignment of partial transcript reads to a reference genome sequence. To assess the performance of current mapping software, we invited developers of RNA-seq aligners to process four large human and mouse RNA-seq data sets. In total, we compared 26 mapping protocols based on 11 programs and pipelines and found major performance differences between methods on numerous benchmarks, including alignment yield, basewise accuracy, mismatch and gap placement, exon junction discovery and suitability of alignments for transcript reconstruction. We observed concordant results on real and simulated RNA-seq data, confirming the relevance of the metrics employed. Future developments in RNA-seq alignment methods would benefit from improved placement of multimapped reads, balanced utilization of existing gene annotation and a reduced false discovery rate for splice junctions.

Cited 243 times

Post-transcriptional processing generates a diversity of 5'-modified long and short RNAs.

(2009)

The transcriptomes of eukaryotic cells are incredibly complex. Individual non-coding RNAs dwarf the number of protein-coding genes, and include classes that are well understood as well as classes for which the nature, extent and functional roles are obscure. Deep sequencing of small RNAs (<200 nucleotides) from human HeLa and HepG2 cells revealed a remarkable breadth of species. These arose both from within annotated genes and from unannotated intergenic regions. Overall, small RNAs tended to align with CAGE (cap-analysis of gene expression) tags, which mark the 5' ends of capped, long RNA transcripts. Many small RNAs, including the previously described promoter-associated small RNAs, appeared to possess cap structures. Members of an extensive class of both small RNAs and CAGE tags were distributed across internal exons of annotated protein coding and non-coding genes, sometimes crossing exon-exon junctions. Here we show that processing of mature mRNAs through an as yet unknown mechanism may generate complex populations of both long and short RNAs whose apparently capped 5' ends coincide. Supplying synthetic promoter-associated small RNAs corresponding to the c-MYC transcriptional start site reduced MYC messenger RNA abundance. The studies presented here expand the catalogue of cellular small RNAs and demonstrate a biological impact for at least one class of non-canonical small RNAs.

Cited 152 times

Author and article information

Journal

Title: Nature Methods

Abbreviated Title: Nat Methods

Publisher: Springer Science and Business Media LLC

ISSN (Print): 1548-7091

ISSN (Electronic): 1548-7105

Publication date Created: April 2015

Publication date (Electronic): March 9 2015

Publication date (Print): April 2015

Volume: 12

Issue: 4

Pages: 357-360

Article

DOI: 10.1038/nmeth.3317

PubMed ID: 25751142

SO-VID: ec171455-72e6-4800-a90b-ae3ba37bc797

License:

http://www.springer.com/tdm

Cited by 8,024
