      Is Open Access

      Automated systems for the de-identification of longitudinal clinical narratives: Overview of 2014 i2b2/UTHealth shared task Track 1.


          Abstract

          The 2014 i2b2/UTHealth Natural Language Processing (NLP) shared task featured four tracks. The first of these was the de-identification track, which focused on identifying protected health information (PHI) in longitudinal clinical narratives. The longitudinal nature of clinical narratives calls particular attention to details that, while benign on their own in separate records, can in combination lead to the identification of patients across longitudinal records. Accordingly, the 2014 de-identification track addressed a broader set of entities and PHI than is covered by the Health Insurance Portability and Accountability Act (HIPAA), which was the focus of the de-identification shared task organized in 2006. Ten teams tackled the 2014 de-identification task and submitted 22 system outputs for evaluation. Each team was evaluated on its best-performing system output. Three of the 10 systems achieved F1 scores over .90, and seven of the 10 scored over .75. The most successful systems combined conditional random fields and hand-written rules. Our findings indicate that automated systems can be very effective for this task, but that de-identification is not yet a solved problem.
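
          The F1 scores quoted in the abstract are the harmonic mean of precision and recall. The sketch below shows one way such a score could be computed for PHI annotations. It assumes exact-match, entity-level scoring, which the abstract does not specify; the span format, the function name, and the toy annotations are all hypothetical and are not taken from the shared task's evaluation script.

```python
from typing import Set, Tuple

# Hypothetical span format: (start offset, end offset, PHI category).
Span = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted PHI spans against gold spans using exact matching (an assumption)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy annotations, invented purely for illustration.
gold = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 62, "HOSPITAL"), (70, 82, "PHONE")}
predicted = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 62, "HOSPITAL"), (90, 95, "AGE")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

          In this toy case three of the four predicted spans match gold spans and three of the four gold spans are recovered, so precision, recall, and F1 all equal .75. A relaxed (overlap-based) matching criterion would generally give different numbers.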

          Author and article information

          Journal
          J Biomed Inform
          Journal of biomedical informatics
          Elsevier BV
          1532-0480
          1532-0464
          Dec 2015; 58 Suppl
          Affiliations
          [1] School of Library and Information Science, Simmons College, Boston, MA, USA. Electronic address: stubbs@simmons.edu.
          [2] Department of Information Studies, State University of New York at Albany, Albany, NY, USA.
          Article
          PII: S1532-0464(15)00117-3; NIHMS806512
          DOI: 10.1016/j.jbi.2015.06.007
          PMCID: PMC4989908
          PMID: 26225918
          1b66ac4a-1c09-4b0c-b063-4c3e0efddc0b
          Keywords: Machine learning, Medical records, Natural language processing, Shared task
