      Is Open Access

      mlf-core: a framework for deterministic machine learning

      research-article


          Abstract

          Motivation

          Machine learning has shown extensive growth in recent years and is now routinely applied to sensitive areas. To allow appropriate verification of predictive models before deployment, models must be deterministic. Solely fixing all random seeds is not sufficient for deterministic machine learning, as major machine learning libraries default to the usage of nondeterministic algorithms based on atomic operations.
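The reason fixed seeds are not enough can be illustrated with floating-point addition, which is not associative. The sketch below is only an illustration and is not taken from mlf-core: it uses plain Python instead of GPU atomics, and the helper name `accumulate` is hypothetical. It sums the same four values in two different orders, much as atomic additions may be applied in a different order from one run to the next.

```python
def accumulate(values, order):
    """Add the values one at a time in the given order, as a reduction would."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


# The same four numbers; only the order of accumulation differs.
values = [1e16, 1.0, -1e16, 1.0]

# Order A: the first 1.0 is absorbed by 1e16 and lost to rounding.
forward = accumulate(values, [0, 1, 2, 3])

# Order B: the large terms cancel first, so both 1.0 terms survive.
reordered = accumulate(values, [0, 2, 1, 3])

print(forward, reordered)  # 1.0 2.0
```

No random number generator is involved here, so no seed can remove the discrepancy. Only a fixed accumulation order can.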

          Results

          Various machine learning libraries released deterministic counterparts to the nondeterministic algorithms. We evaluated the effect of these algorithms on determinism and runtime. Based on these results, we formulated a set of requirements for deterministic machine learning and developed a new software solution, the mlf-core ecosystem, which aids machine learning projects to meet and keep these requirements. We applied mlf-core to develop deterministic models in various biomedical fields including a single-cell autoencoder with TensorFlow, a PyTorch-based U-Net model for liver-tumor segmentation in computed tomography scans, and a liver cancer classifier based on gene expression profiles with XGBoost.
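One way to check determinism in practice is to repeat a training run with the same seed and compare the resulting parameters bit for bit. The following is a minimal sketch under stated assumptions, not the evaluation procedure used in the article: `train_toy_model`, `fingerprint`, and `is_deterministic` are hypothetical names, and the toy model is a stand-in for a real TensorFlow, PyTorch, or XGBoost training run.

```python
import hashlib
import random
import struct
from typing import Callable, List


def train_toy_model(seed: int, steps: int = 100) -> List[float]:
    # Hypothetical stand-in for a real training run: seeded SGD that
    # pulls a single weight toward noisy samples.
    rng = random.Random(seed)
    weight = rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        sample = rng.gauss(0.0, 1.0)
        weight -= 0.01 * 2.0 * (weight - sample)
    return [weight]


def fingerprint(params: List[float]) -> str:
    # Hash the exact IEEE 754 bit patterns, so that even a difference in
    # the last bit of one parameter changes the digest.
    payload = b"".join(struct.pack("<d", p) for p in params)
    return hashlib.sha256(payload).hexdigest()


def is_deterministic(
    train_fn: Callable[[int], List[float]], seed: int, runs: int = 3
) -> bool:
    # A run counts as deterministic only if every repetition yields the
    # same digest.
    digests = {fingerprint(train_fn(seed)) for _ in range(runs)}
    return len(digests) == 1


print(is_deterministic(train_toy_model, seed=0))  # True
```

Comparing digests rather than rounded metrics is a deliberate choice in this sketch: two runs can report the same accuracy to several decimal places while their weights still differ.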

          Availability and implementation

          The complete data together with the implementations of the mlf-core ecosystem and use case models are available at https://github.com/mlf-core.


                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                ISSN: 1367-4803, 1367-4811
                April 2023
                02 April 2023
                Volume: 39
                Issue: 4
                Article: btad164
                Affiliations
                Quantitative Biology Center (QBiC), Eberhard Karls University of Tübingen, Tübingen 72076, Germany
                Institute of Computational Biology, Helmholtz Zentrum München, Munich 85764, Germany
                Institute of Lung Biology and Disease and Comprehensive Pneumology Center, Helmholtz Zentrum München, Member of the German Center for Lung Research (DZL), Munich 81377, Germany
                TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising 85354, Germany
                Department of Informatics, University of Hamburg, Hamburg 20146, Germany
                Department of Biological Sciences and Center for Systems Biology, The University of Texas at Dallas, Richardson, TX 75205, United States
                Biomedical Data Science, Department for Computer Science, Eberhard Karls University of Tübingen, Tübingen 72074, Germany
                Institute of Bioinformatics and Medical Informatics, Eberhard Karls University of Tübingen, Tübingen 72074, Germany
                Faculty of Medicine, Eberhard Karls University of Tübingen, Tübingen 72016, Germany
                Author notes
                Corresponding authors. Quantitative Biology Center (QBiC), Eberhard Karls University of Tübingen, Auf der Morgenstelle 10, Tübingen 72076, Germany. E-mail: lukas.heumos@helmholtz-munich.de (L.H.), gisela.gabernet@gmail.com (G.G.), sven.nahnsen@qbic.uni-tuebingen.de (S.N.)

                Gisela Gabernet and Sven Nahnsen contributed equally and share the senior authorship.

                Author information
                https://orcid.org/0000-0002-6950-6929
                https://orcid.org/0000-0001-7049-9474
                Article
                Article ID: btad164
                DOI: 10.1093/bioinformatics/btad164
                PMCID: PMC10089676
                PMID: 37004171
                © The Author(s) 2023. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 10 August 2022
                : 24 January 2023
                : 27 February 2023
                : 27 February 2023
                : 11 April 2023
                Page count
                Pages: 8
                Funding
                Funded by: Deutsche Forschungsgemeinschaft
                Award ID: 398967434
                Categories
                Original Paper
                Data and Text Mining
                AcademicSubjects/SCI01060

                Bioinformatics & Computational biology