      The Opportunities and Shortcomings of Using Big Data and National Databases for Sarcoma Research

      research-article
      1 , 1 , 2 , 1 , 3
      Cancer
      Sarcoma, database, big data, NCDB, SEER


          Abstract

          The rarity and heterogeneity of sarcomas make performing appropriately powered studies challenging and magnify the significance of large databases in sarcoma research. Established large tumor registries and population-based databases have become increasingly relevant for answering clinical questions regarding sarcoma incidence, treatment patterns, and outcomes. However, the validity of large databases has been questioned because of inaccurate and widely variable coding practices and the absence of clinically relevant variables. Additionally, using large databases to study rare cancers like sarcoma may be particularly challenging because of known limitations of administrative data and poor overall data quality. Currently, there are several large national cancer databases, including the Surveillance, Epidemiology, and End Results (SEER) database, the American College of Surgeons’ and American Cancer Society’s National Cancer Database (NCDB), and the Centers for Disease Control and Prevention (CDC) National Program of Cancer Registries (NPCR). These are often used for sarcoma research, but they are limited by a dependence on administrative or billing data, a lack of agreement between chart abstractors on diagnosis codes, and the use of preexisting documented hospital diagnosis codes for tumor registries, leading to significant underestimation of sarcomas in large datasets. Current and future initiatives to improve databases and big data applications for sarcoma research include increasing the use of sarcoma-specific registries and encouraging national initiatives to expand real-world evidence-based datasets.

          Precis:

          The main aim of this article is to demonstrate the limitations of these databases specifically for sarcoma research. We also describe current initiatives formed to improve the application of big data for rare malignancies.

          Author and article information

          Journal
          0374236
          2771
          Cancer
          0008-543X
          1097-0142
          11 April 2019
          15 May 2019
          01 September 2019
          01 September 2020
          Volume: 125
          Issue: 17
          Pages: 2926-2934
          Affiliations
          [1 ]Department of Surgery, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115
          [2 ]Department of Emergency Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115
          [3 ]Center for Sarcoma and Bone Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115
          Author notes
          Corresponding Author: Heather Lyu, MD, Brigham and Women’s Hospital, Department of Surgery, 75 Francis St, Boston, MA 02115, Phone: 703-965-9392, hlyu@bwh.harvard.edu
          Author information
          http://orcid.org/0000-0001-7759-0799
          Article
          PMC6690764 nihpa1018980
          10.1002/cncr.32118
          6690764
          31090929
          89df7aaa-3ddf-49ca-96c8-8e27036d5b4c

          database, big data, Sarcoma, SEER, NCDB
