Performance of different KNN models in prediction english language readability


Altay O.

2nd International Conference on Computing and Machine Intelligence, ICMI 2022, İstanbul, Türkiye, 15-16 July 2022 (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/icmi55296.2022.9873670
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Keywords: ensemble, random subspace, KNN, English language sentences readability.
  • Affiliated with Manisa Celal Bayar University: Yes

Abstract

Assessing the readability of English, a universal language, is important for matching readers at different reading levels with texts suited to their level. Presenting readers with texts at their own level helps them develop their learning, comprehension and reading capacities. In this study, a data set collected from BBC news was used to predict the readability of English sentences. The data set consists of 17,724 different sentences. Several k-nearest neighbor (KNN) models were compared: basic KNN, two different weighted KNN models, and KNN-based random subspace ensembles. The KNN-based random subspace ensemble outperformed the other KNN models, with an accuracy of 0.9749 and an F1-score of 0.9692.
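
The three model families named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the author's implementation. The toy feature vectors (hypothetically: average word length, sentence length in words, syllables per word), the "easy"/"hard" labels, the inverse-distance weighting, and all parameter values are assumptions made for illustration; the abstract does not state which features, weighting schemes, or hyperparameters were used.

```python
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3, weighted=False, features=None):
    """Classify x from its k nearest training rows (Euclidean distance).

    weighted: if True, each neighbor votes with weight 1/distance
              (one common weighting scheme, assumed here).
    features: optional list of column indices; only these columns are
              used in the distance (needed for random subspaces).
    """
    idx = list(features) if features is not None else list(range(len(x)))
    dists = []
    for row, label in zip(train_X, train_y):
        d = sum((row[j] - x[j]) ** 2 for j in idx) ** 0.5
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])

    votes = defaultdict(float)
    for d, label in dists[:k]:
        # Small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


def random_subspace_knn(train_X, train_y, x, n_estimators=10,
                        subspace_size=2, k=3, seed=0):
    """Majority vote over KNN models, each seeing a random feature subset."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    votes = Counter()
    for _ in range(n_estimators):
        feats = rng.sample(range(n_features), subspace_size)
        votes[knn_predict(train_X, train_y, x, k=k, features=feats)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy data, NOT the BBC sentence data set from the study.
train_X = [
    [4.0, 8.0, 1.2],
    [4.2, 9.0, 1.3],
    [3.9, 7.0, 1.1],
    [6.5, 25.0, 2.1],
    [6.8, 28.0, 2.3],
    [6.2, 24.0, 2.0],
]
train_y = ["easy", "easy", "easy", "hard", "hard", "hard"]

query = [4.1, 8.5, 1.2]
print("basic KNN:      ", knn_predict(train_X, train_y, query))
print("weighted KNN:   ", knn_predict(train_X, train_y, query, weighted=True))
print("random subspace:", random_subspace_knn(train_X, train_y, query))
```

In the random subspace approach, each base KNN measures distance on a different random subset of the features, and the ensemble takes a majority vote over the base predictions. This can reduce sensitivity to noisy or irrelevant features, which is one plausible reason such an ensemble might outperform a single KNN.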