The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool

Bibliographic Details
Title: The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool
Authors: Kuo, Tsung-Ting; Huh, Jina; Kim, Jihoon; El-Kareh, Robert; Singh, Siddharth; Feupe, Stephanie Feudjio; Kuri, Vincent; Lin, Gordon; Day, Michele E.; Ohno-Machado, Lucila; Hsu, Chun-Nan
Publication Year: 2018
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language
More Details: Objective. Annotation is expensive but essential for clinical note review and clinical natural language processing (cNLP). However, the extent to which computer-generated pre-annotation benefits human annotation is still an open question. Our study introduces CLEAN (CLinical note rEview and ANnotation), a pre-annotation-based cNLP annotation system to improve clinical note annotation of data elements, and comprehensively compares CLEAN with the widely used annotation system Brat Rapid Annotation Tool (BRAT). Materials and Methods. CLEAN includes an ensemble pipeline (CLEAN-EP) with a newly developed annotation tool (CLEAN-AT). A domain expert and a novice user/annotator participated in a comparative usability test by tagging 87 data elements related to Congestive Heart Failure (CHF) and Kawasaki Disease (KD) cohorts in 84 public notes. Results. CLEAN achieved a higher note-level F1-score (0.896) than BRAT (0.820), with a significant difference in correctness (P-value < 0.001), and the most strongly related factor being system/software (P-value < 0.001). No significant difference (P-value 0.188) in annotation time was observed between CLEAN (7.262 minutes/note) and BRAT (8.286 minutes/note). The difference in annotation time was mostly associated with note length (P-value < 0.001) and system/software (P-value 0.013). The expert reported CLEAN to be useful/satisfactory, while the novice reported slight improvements. Discussion. CLEAN improves the correctness of annotation and increases usefulness/satisfaction with the same level of efficiency. Limitations include the untested impact of the pre-annotation correctness rate, a small sample size, a small number of users, and a gold standard with limited validation. Conclusion. CLEAN with pre-annotation can be beneficial for an expert dealing with complex annotation tasks involving numerous and diverse target data elements.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/1808.03806
Accession Number: edsarx.1808.03806
Database: arXiv
FullText Text:
  Availability: 0
CustomLinks:
  – Url: http://arxiv.org/abs/1808.03806
    Name: EDS - Arxiv
    Category: fullText
    Text: View this record from Arxiv
    MouseOverText: View this record from Arxiv
Header DbId: edsarx
DbLabel: arXiv
An: edsarx.1808.03806
RelevancyScore: 981
AccessLevel: 3
PubType: Report
PubTypeId: report
PreciseRelevancyScore: 981.428771972656
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool
– Name: Author
  Label: Authors
  Group: Au
  Data: Kuo, Tsung-Ting; Huh, Jina; Kim, Jihoon; El-Kareh, Robert; Singh, Siddharth; Feupe, Stephanie Feudjio; Kuri, Vincent; Lin, Gordon; Day, Michele E.; Ohno-Machado, Lucila; Hsu, Chun-Nan
– Name: DatePubCY
  Label: Publication Year
  Group: Date
  Data: 2018
– Name: Subset
  Label: Collection
  Group: HoldingsInfo
  Data: Computer Science
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: Computer Science - Computation and Language
– Name: Abstract
  Label: Description
  Group: Ab
  Data: Objective. Annotation is expensive but essential for clinical note review and clinical natural language processing (cNLP). However, the extent to which computer-generated pre-annotation benefits human annotation is still an open question. Our study introduces CLEAN (CLinical note rEview and ANnotation), a pre-annotation-based cNLP annotation system to improve clinical note annotation of data elements, and comprehensively compares CLEAN with the widely used annotation system Brat Rapid Annotation Tool (BRAT). Materials and Methods. CLEAN includes an ensemble pipeline (CLEAN-EP) with a newly developed annotation tool (CLEAN-AT). A domain expert and a novice user/annotator participated in a comparative usability test by tagging 87 data elements related to Congestive Heart Failure (CHF) and Kawasaki Disease (KD) cohorts in 84 public notes. Results. CLEAN achieved a higher note-level F1-score (0.896) than BRAT (0.820), with a significant difference in correctness (P-value < 0.001), and the most strongly related factor being system/software (P-value < 0.001). No significant difference (P-value 0.188) in annotation time was observed between CLEAN (7.262 minutes/note) and BRAT (8.286 minutes/note). The difference in annotation time was mostly associated with note length (P-value < 0.001) and system/software (P-value 0.013). The expert reported CLEAN to be useful/satisfactory, while the novice reported slight improvements. Discussion. CLEAN improves the correctness of annotation and increases usefulness/satisfaction with the same level of efficiency. Limitations include the untested impact of the pre-annotation correctness rate, a small sample size, a small number of users, and a gold standard with limited validation. Conclusion. CLEAN with pre-annotation can be beneficial for an expert dealing with complex annotation tasks involving numerous and diverse target data elements.
– Name: TypeDocument
  Label: Document Type
  Group: TypDoc
  Data: Working Paper
– Name: URL
  Label: Access URL
  Group: URL
  Data: http://arxiv.org/abs/1808.03806
– Name: AN
  Label: Accession Number
  Group: ID
  Data: edsarx.1808.03806
RecordInfo BibRecord:
  BibEntity:
    Subjects:
      – SubjectFull: Computer Science - Computation and Language
        Type: general
    Titles:
      – TitleFull: The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Kuo, Tsung-Ting
      – PersonEntity:
          Name:
            NameFull: Huh, Jina
      – PersonEntity:
          Name:
            NameFull: Kim, Jihoon
      – PersonEntity:
          Name:
            NameFull: El-Kareh, Robert
      – PersonEntity:
          Name:
            NameFull: Singh, Siddharth
      – PersonEntity:
          Name:
            NameFull: Feupe, Stephanie Feudjio
      – PersonEntity:
          Name:
            NameFull: Kuri, Vincent
      – PersonEntity:
          Name:
            NameFull: Lin, Gordon
      – PersonEntity:
          Name:
            NameFull: Day, Michele E.
      – PersonEntity:
          Name:
            NameFull: Ohno-Machado, Lucila
      – PersonEntity:
          Name:
            NameFull: Hsu, Chun-Nan
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 11
              M: 08
              Type: published
              Y: 2018
ResultId 1