A framework to Deal with Missing Data in Data Sets

Luai A. Shalabi; Mohannad Najjar; Ahmad A. Kayed

doi:10.3844/jcssp.2006.740.745

Research Article Open Access

Abstract

Most information systems contain some missing values because data are unavailable. Missing values reduce the quality of the classification rules generated by a data mining system. They also affect the number of classification rules the system produces, and they can influence the coverage percentage and the number of reducts generated. As a result, missing values make it difficult to extract useful information from the data set. Solving the problem of missing data is therefore a high priority in the field of data mining and knowledge discovery. Replacing missing values with a specific value should not affect the quality of the data. Four different models for dealing with missing data were studied. A framework was established that removes inconsistencies both before and after the missing attribute values are filled with the expected value generated by one of the four models. Comparative results are discussed and recommendations are given.
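The pipeline described in the abstract (remove inconsistencies, fill missing values, remove inconsistencies again) can be illustrated with a minimal Python sketch. The abstract does not name the four models, so the fill step here uses most-frequent-value replacement purely as an assumed stand-in. The definition of an inconsistency (rows with identical condition attributes but different decisions), the function names, and the toy data are likewise assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict

MISSING = None  # assumed marker for a missing attribute value


def remove_inconsistencies(rows, conditions, decision):
    """Drop every row whose condition values also occur with a different decision."""
    decisions_seen = defaultdict(set)
    for row in rows:
        decisions_seen[tuple(row[a] for a in conditions)].add(row[decision])
    return [
        row for row in rows
        if len(decisions_seen[tuple(row[a] for a in conditions)]) == 1
    ]


def fill_with_mode(rows, conditions):
    """Assumed stand-in model: replace each missing value with the attribute's mode."""
    filled = [dict(row) for row in rows]  # copy so the input is not modified
    for attr in conditions:
        known = [row[attr] for row in rows if row[attr] is not MISSING]
        if not known:
            continue  # nothing to estimate from; leave the gaps as they are
        mode = Counter(known).most_common(1)[0][0]
        for row in filled:
            if row[attr] is MISSING:
                row[attr] = mode
    return filled


def clean(rows, conditions, decision):
    """Remove inconsistencies, fill missing values, then remove inconsistencies again."""
    rows = remove_inconsistencies(rows, conditions, decision)
    rows = fill_with_mode(rows, conditions)
    return remove_inconsistencies(rows, conditions, decision)


# Hypothetical toy data set; "play" is the decision attribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "no"},   # conflicts with the row above
    {"outlook": "rainy", "windy": MISSING, "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
result = clean(rows, ["outlook", "windy"], "play")
```

In this toy run the second pass matters: filling the missing `windy` value creates a new conflict between the two "rainy" rows, which is why the framework checks for inconsistencies after filling as well as before.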

Journal of Computer Science

Volume 2 No. 9, 2006, 740-745

DOI: https://doi.org/10.3844/jcssp.2006.740.745

Submitted On: 31 May 2006 Published On: 30 September 2006

How to Cite: Shalabi, L. A., Najjar, M. & Kayed, A. A. (2006). A framework to Deal with Missing Data in Data Sets. Journal of Computer Science, 2(9), 740-745. https://doi.org/10.3844/jcssp.2006.740.745

Copyright: © 2006 Luai A. Shalabi, Mohannad Najjar and Ahmad A. Kayed. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Data mining
missing data
rules
reducts
coverage