Abstract:
This paper presents an outlier detection technique for univariate normal datasets. Outliers are observations that
lie at an abnormal distance from the mean. Outlier detection is useful in areas such as fraud detection, financial
analysis, health monitoring and statistical modelling. Many recent approaches detect outliers according to reasonable, predefined concepts of an outlier. Methods such as the Gaussian method of outlier detection have been widely
used to detect outliers in univariate datasets; however, such methods rely on measures of central tendency and dispersion
that are themselves affected by outliers, which makes them less robust in detecting outliers. The study aimed at
providing an alternative method for outlier detection in univariate normal datasets by using the measures
of variation and central tendency that are least affected by outliers (the median and the geometric measure of variation). The
study formulated an outlier detection rule based on the median and the geometric measure of variation, applied it
to randomly simulated normal datasets containing outliers, and recorded the number of outliers the method detected in
comparison with two of the best existing methods of outlier detection. The study then compared the sensitivity of the three
methods. The simulation was run in two ways: the first varied the mean while holding the standard deviation
constant, and the second held the mean constant while varying the standard deviation. The formulated
outlier detection technique performed best when the mean was varied, eliminating more of the required outliers than the other two
Gaussian outlier detection techniques. The study also established that the formulated method
was stricter when the standard deviation was varied, but it still stands out as the best because an outlier is defined
relative to the mean and not the standard deviation. The formulated method was more sensitive than the
Gaussian method of outlier detection while performing as well as the best existing outlier detection technique. In conclusion, the
formulated method can be employed for outlier detection in univariate normal datasets, as it
performed almost the same as the best existing method of outlier detection for univariate datasets.
Keywords: Outlier, Anomaly, Outlier Detection, Gaussian
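
The robustness argument summarised in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulated rule or define the geometric measure of variation, so the sketch below does not reproduce the study's method. It contrasts the usual Gaussian rule (mean plus or minus k standard deviations) with a median-centred rule that uses the scaled median absolute deviation (MAD) purely as a stand-in robust spread. The function names, the cut-off k = 3, the MAD scaling constant and the sample values are all assumptions made for illustration.

```python
import statistics


def gaussian_outliers(data, k=3.0):
    """Flag points more than k sample standard deviations from the mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > k * sd]


def median_based_outliers(data, k=3.0):
    """Flag points more than k robust-scale units from the median.

    Assumption: the scaled MAD is used here as a stand-in robust spread.
    It is NOT the geometric measure of variation used in the study.
    """
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    scale = 1.4826 * mad  # makes the MAD comparable to sd under normality
    return [x for x in data if abs(x - med) > k * scale]


# Hypothetical sample: ten values near 10 plus two gross outliers.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 25.0, 30.0]

print("Gaussian rule flags:    ", gaussian_outliers(sample))
print("Median-based rule flags:", median_based_outliers(sample))
```

On this sample the two gross values inflate both the mean and the standard deviation, so the Gaussian rule flags nothing, whereas the median-centred rule flags both. This is the masking effect that motivates replacing the mean and standard deviation with measures less affected by outliers.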