Instance reduction approach to machine learning and multi-database mining
Abstract
The paper proposes a heuristic instance reduction algorithm as an approach to machine learning and knowledge discovery in centralized and distributed databases. The proposed algorithm is based on an original method for a selection of reference instances and creates a reduced training dataset. The reduced training set consisting of selected instances can be used as an input for the machine learning algorithms used for data mining tasks. The algorithm calculates for each instance in the data set the value of its similarity coefficient. Values of the coefficient are used to group instances into clusters. The number of clusters depends on the value of the so called representation level set by the user. Out of each cluster only a limited number of instances is selected to form a reduced training set. The proposed algorithm uses population learning algorithm for selection of instances. The paper includes a description of the proposed approach and results of the validating experiment.
Full Text:
PDFDOI: http://dx.doi.org/10.17951/ai.2006.4.1.60-71
Date of publication: 2006-01-01 00:00:00
Date of submission: 2016-04-27 10:15:02
Statistics
Total abstract view - 385
Downloads (from 2020-06-17) - PDF - 0
Indicators
Refbacks
- There are currently no refbacks.
Copyright (c) 2015 Annales UMCS Sectio AI Informatica
This work is licensed under a Creative Commons Attribution 4.0 International License.