      Fast and accurate protein structure search with Foldseek

      research-article


          Abstract

          As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively.
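The idea of searching structures as sequences over a structural alphabet can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the four-letter alphabet, the distance thresholds, the sequence-separation cutoff and the scoring values are hypothetical and are not Foldseek's actual alphabet or parameters. Each residue is encoded by the binned distance from its C-alpha atom to the nearest C-alpha atom that is not close in sequence, and the resulting strings are compared with a standard Smith-Waterman local alignment score.

```python
import math

# Toy structural alphabet (hypothetical, NOT Foldseek's alphabet):
# a residue's letter is the bin of the distance from its C-alpha atom to the
# nearest C-alpha atom at least MIN_SEQ_SEP positions away in sequence.
MIN_SEQ_SEP = 3
BIN_EDGES = [5.0, 7.0, 10.0]  # angstrom thresholds, chosen arbitrarily
LETTERS = "ABCD"              # one letter per distance bin


def encode_structure(ca_coords):
    """Map a list of (x, y, z) C-alpha coordinates to a string over LETTERS."""
    letters = []
    for i, a in enumerate(ca_coords):
        dists = [
            math.dist(a, b)
            for j, b in enumerate(ca_coords)
            if abs(i - j) >= MIN_SEQ_SEP
        ]
        nearest = min(dists) if dists else math.inf
        # Count how many thresholds the distance reaches; that is the bin index.
        bin_index = sum(nearest >= edge for edge in BIN_EDGES)
        letters.append(LETTERS[bin_index])
    return "".join(letters)


def local_alignment_score(s, t, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def structure_similarity(query_coords, target_coords):
    """Encode both structures as strings, then score them as sequences."""
    return local_alignment_score(
        encode_structure(query_coords), encode_structure(target_coords)
    )
```

Once structures are reduced to strings, comparing them needs only string operations rather than 3D superposition, which is the general reason this kind of encoding lets fast sequence-search machinery be reused. A real structural alphabet would need to be far more informative than this single distance feature.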

          Abstract

          Foldseek speeds up protein structural search by four to five orders of magnitude.


                Author and article information

                Contributors
                soeding@mpinat.mpg.de
                martin.steinegger@snu.ac.kr
                Journal
                Nat Biotechnol
                Nature Biotechnology
                Nature Publishing Group US (New York )
                1087-0156
                1546-1696
                8 May 2023
                2024; volume 42, issue 2, pages 243-246
                Affiliations
                [1] Quantitative and Computational Biology Group, Max Planck Institute for Multidisciplinary Sciences (https://ror.org/03av75f26), Göttingen, Germany
                [2] School of Biological Sciences, Seoul National University (https://ror.org/04h9pn542), Seoul, South Korea
                [3] Campus Institute Data Science (CIDAS), Göttingen, Germany
                [4] Artificial Intelligence Institute, Seoul National University (https://ror.org/04h9pn542), Seoul, South Korea
                [5] Institute of Molecular Biology and Genetics, Seoul National University (https://ror.org/04h9pn542), Seoul, South Korea
                Author information
                http://orcid.org/0000-0002-2720-5714
                http://orcid.org/0000-0001-8637-6719
                http://orcid.org/0000-0002-5732-3009
                http://orcid.org/0000-0001-9642-8244
                http://orcid.org/0000-0001-8781-9753
                Article
                1773
                DOI: 10.1038/s41587-023-01773-0
                PMCID: PMC10869269
                PMID: 37156916
                © The Author(s) 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 17 February 2022
                Accepted: 30 March 2023
                Funding
                Funded by: FundRef https://doi.org/10.13039/501100003725, National Research Foundation of Korea (NRF);
                Award IDs: 2019R1A6A1A10073437, 2020M3A9G7103933, 2021R1C1C102065, 2021M3A9I4021220
                Funded by: German ministry for education and research (BMBF) (horizontal4meta)
                Funded by: Samsung DS research fund, Creative-Pioneering Researchers Program through Seoul National University
                Categories
                Brief Communication
                Custom metadata
                © Springer Nature America, Inc. 2024

                Biotechnology
                computational biology and bioinformatics,structural biology,software
