The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool

Bibliographic Details
Title:	The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool
Authors:	Kuo, Tsung-Ting; Huh, Jina; Kim, Jihoon; El-Kareh, Robert; Singh, Siddharth; Feupe, Stephanie Feudjio; Kuri, Vincent; Lin, Gordon; Day, Michele E.; Ohno-Machado, Lucila; Hsu, Chun-Nan
Publication Year:	2018
Collection:	Computer Science
Subject Terms:	Computer Science - Computation and Language
More Details:	Objective. Annotation is expensive but essential for clinical note review and clinical natural language processing (cNLP). However, the extent to which computer-generated pre-annotation benefits human annotation is still an open question. Our study introduces CLEAN (CLinical note rEview and ANnotation), a pre-annotation-based cNLP annotation system designed to improve clinical note annotation of data elements, and comprehensively compares CLEAN with the widely used Brat Rapid Annotation Tool (BRAT). Materials and Methods. CLEAN combines an ensemble pipeline (CLEAN-EP) with a newly developed annotation tool (CLEAN-AT). A domain expert and a novice annotator participated in a comparative usability test, tagging 87 data elements related to Congestive Heart Failure (CHF) and Kawasaki Disease (KD) cohorts in 84 public notes. Results. CLEAN achieved a higher note-level F1-score (0.896) than BRAT (0.820), with a significant difference in correctness (P-value < 0.001); the most strongly associated factor was system/software (P-value < 0.001). No significant difference (P-value = 0.188) in annotation time was observed between CLEAN (7.262 minutes/note) and BRAT (8.286 minutes/note). The difference in annotation time was mostly associated with note length (P-value < 0.001) and system/software (P-value = 0.013). The expert reported CLEAN to be useful/satisfactory, while the novice reported slight improvements. Discussion. CLEAN improves the correctness of annotation and increases usefulness/satisfaction at the same level of efficiency. Limitations include the untested impact of the pre-annotation correctness rate, the small sample size, the small number of users, and a gold standard with limited validation. Conclusion. CLEAN with pre-annotation can help an expert handle complex annotation tasks involving numerous and diverse target data elements.
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/1808.03806
Accession Number:	edsarx.1808.03806
Database:	arXiv

Publication Date:	2018-08-11
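
The abstract above reports correctness as a note-level F1-score. As a rough illustration only, the sketch below shows one common way such a score can be computed: per-note precision and recall of the tagged data elements against a gold standard, averaged over notes. The record does not state how the authors defined or aggregated the score, so the averaging scheme, the function names (`note_f1`, `mean_note_level_f1`), and the example data element names are all assumptions.

```python
from typing import Dict, Set


def note_f1(predicted: Set[str], gold: Set[str]) -> float:
    """F1 between the data elements tagged in one note and that note's gold standard."""
    if not predicted and not gold:
        # Assumption: a note with nothing to tag and nothing tagged counts as perfect.
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_note_level_f1(
    annotations: Dict[str, Set[str]], gold_standard: Dict[str, Set[str]]
) -> float:
    """Average the per-note F1 over every note in the gold standard (assumed aggregation)."""
    scores = [
        note_f1(annotations.get(note_id, set()), gold)
        for note_id, gold in gold_standard.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical data: two notes with made-up data element names.
gold_standard = {
    "note1": {"ejection_fraction", "bnp", "edema"},
    "note2": {"fever", "rash"},
}
annotations = {
    "note1": {"ejection_fraction", "bnp"},         # one element missed
    "note2": {"fever", "rash", "conjunctivitis"},  # one spurious element
}
score = mean_note_level_f1(annotations, gold_standard)
```

In this made-up example, the first note has perfect precision but misses one element, and the second has perfect recall but one spurious element; both kinds of error lower the per-note F1 in the same way.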