Extreme genome reduction in symbiotic bacteria

Most cited references 118

RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.

Alexandros Stamatakis (2006)

RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML). Low-level technical optimizations, a modification of the search algorithm, and the use of the GTR+CAT approximation as replacement for GTR+Gamma yield a program that is between 2.7 and 52 times faster than the previous version of RAxML. A large-scale performance comparison with GARLI, PHYML, IQPNNI and MrBayes on real data containing 1000 up to 6722 taxa shows that RAxML requires at least 5.6 times less main memory and yields better trees in similar times than the best competing program (GARLI) on datasets up to 2500 taxa. On datasets ≥4000 taxa it also runs 2-3 times faster than GARLI. RAxML has been parallelized with MPI to conduct parallel multiple bootstraps and inferences on distinct starting trees. The program has been used to compute ML trees on two of the largest alignments to date containing 25,057 (1463 bp) and 2182 (51,089 bp) taxa, respectively. icwww.epfl.ch/~stamatak

Cited 1500 times

The COG database: a tool for genome-scale analysis of protein functions and evolution.

R. L. Tatusov (2000)

Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56-83% of the gene products from each of the complete bacterial and archaeal genomes and approximately 35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program that is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.

Cited 587 times

MAFFT version 5: improvement in accuracy of multiple sequence alignment

Kazutaka Katoh, Kei-ichi Kuma … (2005)

The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective function. These new options of MAFFT showed higher accuracy than currently available methods including TCoffee version 2 and CLUSTAL W in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options of MAFFT can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues included in an alignment. For a multiple alignment consisting of ∼8 sequences with low similarity, the accuracy was improved (2–10 percentage points) when the sequences were aligned together with dozens of their close homologues (E-value < 10⁻⁵–10⁻²⁰) collected from a database. Such improvement was generally observed for most methods, but was remarkably large for the new options of MAFFT proposed here. Thus, we made a Ruby script, mafftE.rb, which aligns the input sequences together with their close homologues collected from SwissProt using NCBI-BLAST.

Cited 532 times

Author and article information

Journal

Title: Nature Reviews Microbiology

Abbreviated Title: Nat Rev Microbiol

Publisher: Springer Science and Business Media LLC

ISSN (Print): 1740-1526

ISSN (Electronic): 1740-1534

Publication date Created: January 2012

Publication date (Electronic): November 8 2011

Publication date (Print): January 2012

Volume: 10

Issue: 1

Pages: 13-26

Article

DOI: 10.1038/nrmicro2670

PubMed ID: 22064560

SO-VID: 1740d024-24d9-4ced-9f96-66f186cc703c

License: http://www.springer.com/tdm

Cited by 523

