Experimental Evaluation of Automated and Manual Data Cleaning Systems: A Case Study Using Organizational Data

Authors

N. Elmobark, A. Y. Abdzaid, F. Alaa, A. Saad

DOI:

https://doi.org/10.65278/IJTACI.2026.42

Keywords:

Data cleaning, automation, missing values, data quality, comparative analysis

Abstract

This paper presents a comprehensive comparison of automated and manual data cleaning methods across three types of data anomalies: missing values, incorrect values, and inconsistent formats. Covering all three gives a more holistic view of how efficiently the two approaches perform along the three most important dimensions of data quality. Using a common Kaggle dataset of 50,000 employee records, we build a systematic evaluation framework comprising diagnostic measures, fine-grained diagnostics, and resource utilization profiles. The experiments quantitatively evaluate cleaning on a fixed distribution of anomaly types: 30% missing values (15,000 records), 25% incorrect values (12,500 records), and 45% inconsistent formats (22,500 records). Automated cleaning performed strongly at normalizing inconsistent formats (92% success) and at handling invalid values (88%). However, manual cleaning outperformed automated methods on complex, context-dependent cases, reaching 95% accuracy in cases that require domain knowledge. The paper shows clear benefits for both approaches: (i) automated cleaning excels in speed and cost on large datasets, while (ii) manual cleaning is valuable for difficult cases that require domain knowledge. These findings contribute to the literature by providing empirical evidence of the effectiveness of both approaches and by giving organizations a structured, knowledge-based basis for choosing suitable cleaning solutions in line with their current state of knowledge, needs, and organizational structure.
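The record counts quoted in the abstract follow directly from the stated percentages of the 50,000-record dataset. The short sketch below reproduces that arithmetic only; the names `TOTAL_RECORDS`, `ANOMALY_SHARES`, and `records_per_anomaly` are illustrative assumptions and do not come from the paper, and the snippet does not represent the authors' evaluation code.

```python
# Anomaly shares (in percent) as reported in the abstract.
# All identifiers here are hypothetical, chosen for illustration.
TOTAL_RECORDS = 50_000
ANOMALY_SHARES = {
    "missing_values": 30,
    "incorrect_values": 25,
    "inconsistent_formats": 45,
}


def records_per_anomaly(total, shares_percent):
    """Return the number of records assigned to each anomaly type."""
    # Integer arithmetic avoids floating-point rounding on the percentages.
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


counts = records_per_anomaly(TOTAL_RECORDS, ANOMALY_SHARES)
for name, n in counts.items():
    print(f"{name}: {n}")
```

The three shares sum to 100%, so under this reading every record in the dataset carries exactly one anomaly type.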

Published

2026-03-09

How to Cite

Elmobark, N., Abdzaid, A. Y., Alaa, F., & Saad, A. (2026). Experimental Evaluation of Automated and Manual Data Cleaning Systems: A Case Study Using Organizational Data. International Journal of Theoretical & Applied Computational Intelligence, 2026, pp. 76–103. https://doi.org/10.65278/IJTACI.2026.42

Section

Articles
