
Use this identifier to link to or cite this item: http://hdl.handle.net/20.500.12128/20842
Title: Data irregularities in discretisation of test sets used for evaluation of classification systems: A case study on authorship attribution
Authors: Stańczyk, Urszula
Zielosko, Beata
Keywords: discretisation; data irregularities; evaluation and test sets; rough sets; authorship attribution; stylometry
Publication date: 2021
Source: "Bulletin of the Polish Academy of Sciences. Technical Sciences" (2021), Vol. 69, art. no. e137629
Abstract: When patterns to be recognised are described by features of continuous type, discretisation becomes either an optional or a necessary step in the initial data pre-processing stage. Characteristics of data, in particular the distribution of data points in the input space, can significantly influence the process of transformation from real-valued into nominal attributes, and the resulting performance of classification systems employing them. If data include several separate sets, their discretisation becomes more complex, as varying numbers of intervals and different ranges can be constructed for the same variables. The paper presents research on irregularities in data distribution, observed in the context of discretisation processes. Selected discretisation methods were used and their effect on the performance of decision algorithms, induced in the classical rough set approach, was investigated. The studied input space was defined by measurable style-markers, which, exploited as characteristic features, allow the task of stylometric authorship attribution to be treated as classification.
URI: http://hdl.handle.net/20.500.12128/20842
DOI: 10.24425/bpasts.2021.137629
ISSN: 2300-1917
Appears in collection: Artykuły (WNŚiT)
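
The abstract's remark that separate sets can yield different intervals for the same variable can be illustrated with a minimal sketch. It assumes plain equal-width binning, which is not necessarily one of the methods studied in the paper; the function names `equal_width_cuts` and `discretise` and the sample values are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def equal_width_cuts(values, n_bins):
    """Return the n_bins - 1 interior cut points of equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretise(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical values of one continuous feature in two separate sets.
train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
test = [2.0, 3.0, 4.0, 8.0]

train_cuts = equal_width_cuts(train, 3)  # cut points built from the training set
test_cuts = equal_width_cuts(test, 3)    # cut points built from the test set alone

# The same test values receive different nominal codes depending on
# which set defined the intervals.
independent = discretise(test, test_cuts)
transferred = discretise(test, train_cuts)

print("train cuts:", train_cuts)
print("test cuts: ", test_cuts)
print("test set, own intervals:     ", independent)
print("test set, training intervals:", transferred)
```

In this sketch the two sets span different ranges, so the value 2.0 lands in the first interval when the test set is discretised on its own and in the second when the training cut points are reused. A classifier induced from the training set would see the two encodings as different attribute values.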

Files in this item:
File: Stanczyk_Zielosko_data_irregularities_in_discretisation.pdf
Size: 7.07 MB
Format: Adobe PDF


Licence: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Poland