Averaging and boosting methods in ensemble-based classifiers for text readability

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12128/21729

Title:	Averaging and boosting methods in ensemble-based classifiers for text readability
Authors:	Korniichuk, Ruslan; Boryczka, Mariusz
Keywords:	Ensemble Methods; Averaging Methods; Boosting Methods; Classification; Explainable Prediction; Readability Indices
Issue Date:	2021
Citation:	"Procedia Computer Science" Vol. 192 (2021), pp. 3677-3685
Abstract:	The purpose of this paper is to investigate whether text readability can be predicted with ensemble-based classifiers. The authors first calculated and analyzed readability indices. They then defined additional features for each text and determined the relationships between readability and those features. Among the various machine learning tasks, they chose classification, and they calculated and compared the accuracy of different machine learning models. After building the models, they interpreted the random decision forest model using the SHAP method. The authors show that machine learning models based on only three features are capable of predicting text readability. Long sentences and a low percentage of stop words can cause low readability. The machine learning model presented in this paper classifies texts by readability with an accuracy of 0.9.
URI:	http://hdl.handle.net/20.500.12128/21729
DOI:	10.1016/j.procs.2021.09.141
ISSN:	1877-0509
Appears in Collections:	Articles (WNŚiT)
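
The abstract ties low readability to long sentences and a low percentage of stop words. A minimal sketch of how such features might be computed is given below. It is not the authors' code: the stop-word list, the `text_features` helper, and the choice of the Automated Readability Index (ARI) as the example index are all assumptions made for illustration, since the record does not say which indices or which third feature the paper uses.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the paper's list is not given in this record.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "is", "and", "it", "on", "that", "this"}


def text_features(text):
    """Return average sentence length (in words), stop-word percentage, and ARI."""
    # Naive sentence split on runs of ., ! or ?; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and digits, so "don't" counts as two tokens.
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    chars = sum(len(w) for w in words)
    avg_sentence_len = len(words) / len(sentences)
    stop_word_pct = 100.0 * sum(w.lower() in STOP_WORDS for w in words) / len(words)
    # Automated Readability Index: higher values indicate harder text.
    ari = 4.71 * chars / len(words) + 0.5 * avg_sentence_len - 21.43
    return {
        "avg_sentence_len": avg_sentence_len,
        "stop_word_pct": stop_word_pct,
        "ari": ari,
    }


easy = text_features("The cat is on the mat. It is a cat.")
hard = text_features(
    "Multidimensional heteroscedasticity complicates econometric inference considerably."
)
print(easy)
print(hard)
```

Feature vectors of this kind could then be passed to an averaging ensemble (such as a random forest) or a boosting ensemble; that step is omitted here because the record does not describe the model configuration.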

Files in This Item:

File	Description	Size	Format
Korniichuk_Averaging_and_boosting_methods_in_ensemble-based_classifiers.pdf		931.43 kB	Adobe PDF

Attribution-NonCommercial-NoDerivs 3.0 Poland Creative Commons License