
Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12128/23591
Title: How the Outliers Influence the Quality of Clustering?
Authors: Nowak-Brzezińska, Agnieszka
Gaibei, Igor
Keywords: clustering; outlier detection; clustering quality indexes; AHC; k-Means
Issue date: 2022
Source: "Entropy" 2022, iss. 7, art. no. 917
Abstract: In this article, we evaluate the efficiency and performance of two clustering algorithms: AHC (Agglomerative Hierarchical Clustering) and k-Means. We are aware that various linkage options and distance measures influence the clustering results. We assess the quality of clustering using the Davies–Bouldin and Dunn cluster validity indexes. The main contribution of this research is to verify whether clusters obtained from data without outliers are of higher quality than clusters obtained from data with outliers. To do this, we compare and analyze outlier detection algorithms depending on the applied clustering algorithm. In our research, we use the LOF (Local Outlier Factor) and COF (Connectivity-based Outlier Factor) algorithms to detect outliers, and we compare the results before and after removing 1%, 5%, and 10% of outliers. Next, we analyze how much the quality of clustering has improved. In the experiments, we used three real data sets with different numbers of instances.
URI: http://hdl.handle.net/20.500.12128/23591
DOI: 10.3390/e24070917
ISSN: 1099-4300
Appears in collection: Artykuły (WNŚiT)
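
The workflow summarized in the abstract (score points with LOF, remove the most outlying ones, then compare the Dunn and Davies–Bouldin indexes before and after) can be illustrated with a minimal sketch. This is not the authors' code. The toy points, the fixed cluster labels, the choice k=3, and all function names are assumptions made for illustration. The paper itself clusters real data sets with AHC and k-Means, whereas here the labels are simply given and only one point is removed. A higher Dunn index and a lower Davies–Bouldin index both indicate better-separated, more compact clusters.

```python
import math

def knn(points, i, k):
    """Return (k-distance, neighbour indices) of point i; ties at the k-distance are kept."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    k_dist = dists[k - 1]
    neigh = [j for j, p in enumerate(points)
             if j != i and math.dist(points[i], p) <= k_dist]
    return k_dist, neigh

def lof_scores(points, k=3):
    """Local Outlier Factor per point (assumes no duplicate points)."""
    info = [knn(points, i, k) for i in range(len(points))]

    def lrd(i):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(info[o][0], math.dist(points[i], points[o])) for o in info[i][1]]
        return len(reach) / sum(reach)

    dens = [lrd(i) for i in range(len(points))]
    return [sum(dens[o] for o in info[i][1]) / (len(info[i][1]) * dens[i])
            for i in range(len(points))]

def clusters_of(points, labels):
    """Group points by cluster label."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return list(groups.values())

def dunn(points, labels):
    """Smallest between-cluster point distance divided by the largest cluster diameter."""
    cl = clusters_of(points, labels)
    diam = max(math.dist(a, b) for c in cl for a in c for b in c)
    sep = min(math.dist(a, b)
              for i, c in enumerate(cl) for d in cl[i + 1:] for a in c for b in d)
    return sep / diam

def davies_bouldin(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid-distance ratio."""
    cl = clusters_of(points, labels)
    cents = [tuple(sum(x) / len(c) for x in zip(*c)) for c in cl]
    scat = [sum(math.dist(p, m) for p in c) / len(c) for c, m in zip(cl, cents)]
    n = len(cl)
    return sum(max((scat[i] + scat[j]) / math.dist(cents[i], cents[j])
                   for j in range(n) if j != i)
               for i in range(n)) / n

# Hypothetical toy data: two compact clusters plus one stray point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
          (5, 4)]
labels = [0] * 5 + [1] * 5 + [0]   # assumed labels; the stray point sits in cluster 0

scores = lof_scores(points, k=3)
worst = max(range(len(points)), key=scores.__getitem__)   # most outlying point
kept = [i for i in range(len(points)) if i != worst]
clean_points = [points[i] for i in kept]
clean_labels = [labels[i] for i in kept]

dunn_before, dunn_after = dunn(points, labels), dunn(clean_points, clean_labels)
db_before, db_after = (davies_bouldin(points, labels),
                       davies_bouldin(clean_points, clean_labels))
print(f"Dunn:           {dunn_before:.3f} -> {dunn_after:.3f}")
print(f"Davies-Bouldin: {db_before:.3f} -> {db_after:.3f}")
```

In a fuller experiment one would presumably re-run the clustering after removing the outliers rather than reuse the old labels, and remove a fixed fraction of the data (such as the 1%, 5%, and 10% mentioned in the abstract) rather than a single point.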

Files in this item:
File: Nowak-Brzezinska_How_the_Outliers_Influence_the_Quality_of.pdf
Size: 595.33 kB
Format: Adobe PDF


License: Creative Commons Attribution 3.0 Poland