Bibliographic Details
Title: |
A method for optimizing text preprocessing and text classification using multiple cycles of learning with an application on shipbrokers emails. |
Authors: |
Papageorgiou, Grigorios (AUTHOR); Economou, Polychronis (AUTHOR) peconom@upatras.gr; Bersimis, Sotirios (AUTHOR) |
Source: |
Journal of Applied Statistics. Oct2024, Vol. 51 Issue 13, p2592-2626. 35p. |
Subject Terms: |
Classification algorithms, Machine performance, Machine learning, Classification, Acronyms |
Abstract: |
Optimizing text preprocessing and text classification algorithms is an important, everyday task in large organizations and companies and it usually involves a labor-intensive and time-consuming effort. For example, the filtering and sorting of a large number of electronic mails (emails) are crucial to keeping track of the received information and converting it automatically into useful and profitable knowledge. Business emails are often unstructured, noisy, and with many abbreviations and acronyms, which makes their handling a challenging procedure. To overcome those challenges, a two-step classification approach is proposed, along with a two-cycle labeling procedure in order to speed up the labeling process. Every step incorporates a heuristic classification approach to assign emails to predefined classes by comparing several classification and text vectorization algorithms. These algorithms are compared and evaluated using the F1 score and balanced accuracy. The implementation of the proposed algorithm is demonstrated in a shipbroker agent operating in Greece with excellent performance, improving organization and administration while reducing expenses. [ABSTRACT FROM AUTHOR] |
|
Copyright of Journal of Applied Statistics is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
Database: |
Business Source Complete |