      Is Open Access

      A hybrid human and machine resource curation pipeline for the Neuroscience Information Framework

      research-article


          Abstract

          The breadth of information resources available to researchers on the Internet continues to expand, particularly in light of recently implemented data-sharing policies required by funding agencies. However, the nature of dense, multifaceted neuroscience data and the design of contemporary search engine systems make efficient, reliable and relevant discovery of such information a significant challenge. This challenge is especially pertinent for online databases, whose dynamic content is ‘hidden’ from search engines. The Neuroscience Information Framework (NIF; http://www.neuinfo.org) was funded by the NIH Blueprint for Neuroscience Research to address the problem of finding and utilizing neuroscience-relevant resources such as software tools, data sets, experimental animals and antibodies across the Internet. From the outset, NIF sought to provide an accounting of available resources while developing technical solutions for finding, accessing and utilizing them. The curators, therefore, are tasked with identifying and registering resources, examining data, writing configuration files to index and display data and keeping the contents current. In the initial phases of the project, all aspects of the registration and curation processes were manual. However, as the number of resources grew, manual curation became impractical. This report describes our experiences and successes with developing automated resource discovery and semi-automated type characterization with text-mining scripts that facilitate curation team efforts to discover, integrate and display new content. We also describe the DISCO framework, a suite of automated web services that significantly reduce the manual curation effort needed to periodically check for resource updates. Lastly, we discuss DOMEO, a semi-automated annotation tool that improves the discovery and curation of resources that are not necessarily website-based (e.g. reagents, software tools). Although the ultimate goal of automation was to reduce the workload of the curators, it has resulted in valuable analytic by-products that address accessibility, use and citation of resources that can now be shared with resource owners and the larger scientific community.

          Database URL: http://neuinfo.org

          Most cited references (15)


          The neuroscience information framework: a data and knowledge environment for neuroscience.

          With support from the Institutes and Centers forming the NIH Blueprint for Neuroscience Research, we have designed and implemented a new initiative for integrating access to and use of Web-based neuroscience resources: the Neuroscience Information Framework. The Framework arises from the expressed need of the neuroscience community for neuroinformatic tools and resources to aid scientific inquiry, builds upon prior development of neuroinformatics by the Human Brain Project and others, and directly derives from the Society for Neuroscience's Neuroscience Database Gateway. Partnered with the Society, its Neuroinformatics Committee, and volunteer consultant-collaborators, our multi-site consortium has developed: (1) a comprehensive, dynamic inventory of Web-accessible neuroscience resources, (2) an extended and integrated terminology describing resources and contents, and (3) a framework accepting and aiding concept-based queries. Evolving instantiations of the Framework may be viewed at http://nif.nih.gov, http://neurogateway.org, and other sites as they come online.

            Challenges and opportunities in mining neuroscience data.

            Understanding the brain requires a broad range of approaches and methods from the domains of biology, psychology, chemistry, physics, and mathematics. The fundamental challenge is to decipher the "neural choreography" associated with complex behaviors and functions, including thoughts, memories, actions, and emotions. This demands the acquisition and integration of vast amounts of data of many types, at multiple scales in time and in space. Here we discuss the need for neuroinformatics approaches to accelerate progress, using several illustrative examples. The nascent field of "connectomics" aims to comprehensively describe neuronal connectivity at either a macroscopic level (in long-distance pathways for the entire brain) or a microscopic level (among axons, dendrites, and synapses in a small brain region). The Neuroscience Information Framework (NIF) encompasses all of neuroscience and facilitates the integration of existing knowledge and databases of many types. These examples illustrate the opportunities and challenges of data mining across multiple tiers of neuroscience information and underscore the need for cultural and infrastructure changes if neuroinformatics is to fulfill its potential to advance our understanding of the brain.

              The NIFSTD and BIRNLex vocabularies: building comprehensive ontologies for neuroscience.

              A critical component of the Neuroscience Information Framework (NIF) project is a consistent, flexible terminology for describing and retrieving neuroscience-relevant resources. Although the original NIF specification called for a loosely structured controlled vocabulary for describing neuroscience resources, as the NIF system evolved, the requirement for a formally structured ontology for neuroscience with sufficient granularity to describe and access a diverse collection of information became obvious. This requirement led to the NIF standardized (NIFSTD) ontology, a comprehensive collection of common neuroscience domain terminologies woven into an ontologically consistent, unified representation of the biomedical domains typically used to describe neuroscience data (e.g., anatomy, cell types, techniques), as well as digital resources (tools, databases) being created throughout the neuroscience community. NIFSTD builds upon a structure established by BIRNLex, a lexicon of concepts covering clinical neuroimaging research developed by the Biomedical Informatics Research Network (BIRN) project. Each distinct domain module is represented using the Web Ontology Language (OWL). As much as has been practical, NIFSTD reuses existing community ontologies that cover the required biomedical domains, building the more specific concepts required to annotate NIF resources. By following this principle, an extensive vocabulary was assembled in a relatively short period of time for NIF information annotation, organization, and retrieval, in a form that promotes easy extension and modification. We report here on the structure of NIFSTD and its predecessor BIRNLex and on the principles followed in its construction, and we provide examples of its use within NIF.

                Author and article information

                Journal
                Database (Oxford)
                Database: The Journal of Biological Databases and Curation
                Oxford University Press
                ISSN: 1758-0463
                Published: 13 February 2012
                Volume: 2012
                Article ID: bas005
                Affiliations
                1. Center for Research in Biological Systems, University of California San Diego
                2. Division of Biology, California Institute of Technology, Pasadena, CA 91125
                3. Massachusetts General Hospital and Harvard Medical School
                4. Center for Medical Informatics, Yale University School of Medicine
                Author notes
                *Corresponding author: Tel: +858 822 3629; Email: abandrowski@ucsd.edu
                Article
                Article ID: bas005
                DOI: 10.1093/database/bas005
                PMCID: PMC3308161
                PMID: 22434839
                © The Author(s) 2012. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 19 October 2011
                Revised: 6 January 2012
                Accepted: 9 January 2012
                Page count
                Pages: 11
                Categories
                Original Articles

                Bioinformatics & Computational biology