Empirical Investigation of Type 1 Error Rate of Some Normality Test Statistics
The normality assumption is important in many parametric statistical tests. Either the variables or the error terms in the model have to be assumed to be normally distributed before statistical conclusions can be drawn. Various statistical tests have been developed to test the normality of a data set, including those of Pearson (1900, 1905), Kolmogorov-Smirnov (1933), Anderson-Darling (1954), Shapiro-Wilk (1965), Lilliefors (1967), D'Agostino and Pearson (1973), Jarque-Bera (1987), Shapiro-Francia (1992), Energy (Székely and Rizzo, 2005) and Cramér-von Mises (Thadewald and Büning, 2007). However, when applied in practice, they rarely lead to the same conclusion. This is a serious challenge to practitioners. Consequently, this research work investigates the Type 1 error rate of some of the normality test statistics so as to identify the best one and recommend it to users of statistics. Monte Carlo experiments were replicated five thousand (5000) times with six sample sizes (n = 20, 50, 100, 250 and 500) at three pre-selected levels of significance (α = 0.01, 0.05 and 0.1). A statistic was considered good if its estimated Type 1 error rate approximated the pre-selected level of significance, and it was considered best if the number of times it was good over the three (3) levels of significance and six (6) sample sizes was the highest. The results show that the Type 1 error rates of all the statistics are good except those of the Kolmogorov-Smirnov, unadjusted Pearson and Jarque-Bera statistics. The omnibus test statistic is good only at the 0.1 level of significance. In general, the Type 1 error rates of the Anderson-Darling, Shapiro-Wilk, Energy and Cramér-von Mises test statistics are the best. These are followed by those of the Shapiro-Francia and Lilliefors test statistics. Consequently, the Anderson-Darling, Shapiro-Wilk, Energy and Cramér-von Mises test statistics are recommended for testing the normality of a data set.
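The Monte Carlo procedure described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not name the software used, so the choice of Python with NumPy and SciPy, the function name estimate_type1_error, the seed, and the reduced number of replications in the demonstration are all assumptions made for illustration. Only three tests that SciPy provides with p-values are shown. The idea is to draw samples from a normal distribution, so that the null hypothesis is true, and record how often each test rejects at level α; that rejection fraction is the estimated Type 1 error rate.

```python
import numpy as np
from scipy import stats


def estimate_type1_error(pvalue_fn, n, alpha, reps=5000, seed=0):
    """Estimate the Type 1 error rate of a normality test by simulation.

    pvalue_fn : callable taking a 1-D sample and returning a p-value
    n         : sample size
    alpha     : pre-selected level of significance
    reps      : number of Monte Carlo replications (5000 in the abstract)
    seed      : seed for reproducibility (an assumption of this sketch)
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        # Data are generated under the null hypothesis (truly normal),
        # so every rejection is a Type 1 error.
        sample = rng.standard_normal(n)
        if pvalue_fn(sample) < alpha:
            rejections += 1
    return rejections / reps


# Illustrative subset of tests; each entry maps a sample to a p-value.
# Index [1] selects the p-value from the (statistic, pvalue) result.
TESTS = {
    "Shapiro-Wilk": lambda x: stats.shapiro(x)[1],
    "Jarque-Bera": lambda x: stats.jarque_bera(x)[1],
    "D'Agostino-Pearson omnibus": lambda x: stats.normaltest(x)[1],
}

if __name__ == "__main__":
    # Fewer replications than the 5000 in the abstract, to keep the demo quick.
    for name, fn in TESTS.items():
        for n in (20, 50):
            for alpha in (0.01, 0.05, 0.1):
                rate = estimate_type1_error(fn, n, alpha, reps=1000)
                print(f"{name:28s} n={n:3d} alpha={alpha:.2f} rate={rate:.3f}")
```

A statistic would then be judged by how close each printed rate is to its α. The abstract does not state the tolerance used to decide that a rate "approximated" the level of significance, so no cutoff is applied here.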