Project Details - Machine Learning network Online Information Service

UKDD

Uncertainty handling in the data mining process


	Title (abbrev)	Title (full)		Last update
	UKDD	Uncertainty handling in the data mining process		b D, Y
	Category	WWW
	Knowledge Discovery
	Description
	The KDD process aims at searching for interesting instances of patterns in data sets. It is widely accepted that the patterns must be valid and ultimately comprehensible (i.e., they should be understood by the analysts). Another important aspect that is under-addressed in the KDD process is the handling of uncertainty in clustering, classification, and association rule extraction. In the vast majority of KDD systems and approaches, the data values are classified into one of a set of categories that have resulted from a clustering process. Thus there is "useful" knowledge that is only partially extracted, or not extracted at all, during the KDD process. In our work we propose a classification and relationship extraction framework for relational databases that supports uncertainty in terms of natural language queries and assessments. More specifically, we aim at:

(1) The definition of a scheme that classifies non-categorical attribute values into categories while maintaining the classification belief. The term classification refers to the procedure by which each of a set of values is assigned to one of a set of related categories. As is well known, classifying a data set requires a set of clusters produced by a preceding clustering process. In this research effort we assume that the values belonging to a cluster (category) should not all be treated equally; each should contribute according to its classification belief. Thus we also define a set of mapping functions assigned to the clusters as a result of an enhanced clustering process. Each database value is then mapped to a category, bearing a degree of belief for this classification that is obtained from the corresponding mapping function. We use fuzzy logic to represent and manipulate this belief. We also define some information measures that exploit these beliefs and support decision making related to one or multiple data sets.

(2) The definition of a scheme for association rule extraction between attributes, based on the above-mentioned classification scheme. The quality of the association rules is enhanced by using the results of the classification scheme.
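
The two aims above can be illustrated with a minimal Python sketch. It is not the project's implementation. The description does not state the shape of the mapping functions or how beliefs are combined, so the sketch assumes trapezoidal mapping functions and the minimum as the fuzzy AND. All names (Trapezoid, classify, rule_quality), the attributes (age, income), and the breakpoints are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Hypothetical mapping function for one cluster (category).

    Belief rises linearly on [a, b], equals 1 on [b, c], and falls on [c, d].
    The trapezoidal shape is an assumption, not taken from the project text.
    """
    a: float
    b: float
    c: float
    d: float

    def belief(self, x: float) -> float:
        if x < self.a or x > self.d:
            return 0.0
        if self.b <= x <= self.c:
            return 1.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.d - x) / (self.d - self.c)


def classify(value, categories):
    """Map a value to every category it belongs to, keeping the degree of belief."""
    beliefs = {name: fn.belief(value) for name, fn in categories.items()}
    return {name: mu for name, mu in beliefs.items() if mu > 0.0}


def rule_quality(records, antecedent, consequent):
    """Fuzzy support and confidence of the rule antecedent => consequent.

    `antecedent` and `consequent` are functions record -> belief in [0, 1].
    Beliefs are combined with min (an assumed choice of fuzzy AND).
    """
    both = sum(min(antecedent(r), consequent(r)) for r in records)
    ante = sum(antecedent(r) for r in records)
    support = both / len(records)
    confidence = both / ante if ante > 0 else 0.0
    return support, confidence


# Illustrative categories; the attributes and breakpoints are made up.
AGE = {
    "young": Trapezoid(0, 0, 25, 35),
    "middle": Trapezoid(25, 35, 50, 60),
    "old": Trapezoid(50, 60, 120, 120),
}
INCOME = {
    "low": Trapezoid(0, 0, 1000, 2000),
    "high": Trapezoid(1000, 2000, 10000, 10000),
}

records = [(30, 1500), (20, 500), (50, 3000)]  # (age, income) rows

print(classify(30, AGE))
support, confidence = rule_quality(
    records,
    antecedent=lambda r: AGE["young"].belief(r[0]),
    consequent=lambda r: INCOME["high"].belief(r[1]),
)
print(support, confidence)
```

With these made-up breakpoints, an age of 30 falls in the overlap of two categories and is classified as both "young" and "middle" with belief 0.5 each, instead of being forced into a single category. The rule "young => high income" is then scored from these partial beliefs, so a record that only partly matches both sides contributes proportionally to the support and confidence.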
	Coordinating Group
	UKDD