Effect of Data Transformation in Outlier Detection

Unclean data containing outliers lead to distorted conclusions and, in turn, to faulty products. Data must therefore be analysed for misfits, which need to be either deleted or winsorised. The quality of data on material characteristics determines the accuracy with which the performance of products can be predicted. This paper considers the reported nonperformance of models on a research data set, in order to gauge the reason for the model inaccuracy. The presence of outliers in data containing the characteristics of different types of fly ash is assessed using Tukey's traditional boxplot and two medcouple-based adjusted boxplots. All three methods indicated the presence of outliers in the data under study. Histograms and density plots of the characteristics revealed skewness in the data, which needs to be accommodated in models by way of data transformation. Comparing outlier detection on the actual and the transformed data exposed the significance of data transformation in outlier detection.

Keywords - Boxplot, Data Transformation, Histogram, Interquartile Range, Medcouple, Fence Constants
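The interplay between skewness, transformation and boxplot fences can be illustrated with a minimal sketch. It assumes Tukey's conventional fence constant of 1.5, a natural-log transformation, and a small made-up right-skewed sample; none of these are taken from the fly ash data or from the specific transformations used in the study. The function name `tukey_outliers` is hypothetical, and the medcouple-based adjusted boxplots are not implemented here.

```python
import math
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences.

    Fences are Q1 - k*IQR and Q3 + k*IQR; k=1.5 is the conventional
    fence constant (an assumption, not a value from the paper).
    """
    # Quartiles by linear interpolation over the sorted sample.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical right-skewed sample (illustrative only, not fly ash data).
raw = [1.2, 1.5, 1.7, 2.0, 2.3, 2.6, 3.1, 3.8, 4.9, 6.5, 9.0, 14.0, 25.0]

# Log transformation, one common way to reduce right skew.
logged = [math.log(v) for v in raw]

raw_flags = tukey_outliers(raw)        # points flagged on the original scale
log_flags = tukey_outliers(logged)     # points flagged after transformation

print("flagged on raw data:", raw_flags)
print("flagged on log data:", log_flags)
```

For this sample, the two largest values fall outside the upper fence on the original scale, whereas no value is flagged after the log transformation. This shows how points in the long tail of a skewed distribution may be labelled as outliers only because the fences assume rough symmetry.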