Standard Practice for Application of Generalized Extreme Studentized Deviate (GESD) Technique to Simultaneously Identify Multiple Outliers in a Data Set
1.1
This practice provides a step-by-step procedure for applying the Generalized Extreme Studentized Deviate (GESD) Many-Outlier Procedure to simultaneously identify multiple outliers in a data set. (See Bibliography.)
1.2
This practice is applicable to a data set comprising observations that are represented on a continuous numerical scale.
1.3
This practice is applicable to a data set comprising a minimum of six observations.
1.4
This practice is applicable to a data set where the normal (Gaussian) model is reasonably adequate for the distributional representation of the observations in the data set.
1.5
The probability of false identification of outliers associated with the decision criteria set by this practice is 0.01.
1.6
It is recommended that the execution of this practice be conducted under the guidance of personnel familiar with the statistical principles and assumptions associated with the GESD technique.
1.7
This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.8
This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
====== Significance And Use ======
3.1
The GESD procedure can be used to simultaneously identify up to a pre-determined number of outliers (r) in a data set, without having to pre-examine the data set and make a priori decisions as to the location and number of potential outliers.
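For orientation, the quantities involved can be written out in the formulation usually attributed to Rosner. The notation here is an assumption and is not taken from the text above: n is the number of observations, x̄ and s are the mean and sample standard deviation of the observations still remaining at stage i, t_{p,ν} is the p-th quantile of Student's t distribution with ν degrees of freedom, and α is the significance level (presumably the 0.01 stated in 1.5). The practice's own procedure sections remain the authority for the exact decision criteria.

```latex
% Stage-i test statistic, computed after the i-1 most extreme
% observations have been removed (i = 1, ..., r):
R_i = \frac{\max_j \lvert x_j - \bar{x} \rvert}{s}

% Corresponding critical value:
\lambda_i = \frac{(n-i)\, t_{p,\,n-i-1}}
                 {\sqrt{\bigl(n-i-1+t_{p,\,n-i-1}^{2}\bigr)(n-i+1)}},
\qquad p = 1 - \frac{\alpha}{2\,(n-i+1)}

% Declared number of outliers:
\hat{k} = \max\{\, i : R_i > \lambda_i \,\} \quad (\hat{k} = 0 \text{ if no such } i)
```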
3.2
The GESD procedure is robust to masking. Masking describes the phenomenon where the existence of multiple outliers can prevent an outlier identification procedure from declaring any of the observations in a data set to be outliers.
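Masking can be illustrated with a minimal Python sketch. The data values and the helper name `max_studentized_deviate` are hypothetical and chosen only for illustration. With two large values present, both inflate the standard deviation, so the largest studentized deviate of the full data set is smaller than the one obtained after the most extreme value has been removed.

```python
from statistics import mean, stdev

def max_studentized_deviate(values):
    """Return (R, index), where R = max |x - mean| / s over the values."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    deviates = [abs(x - m) / s for x in values]
    idx = max(range(len(values)), key=deviates.__getitem__)
    return deviates[idx], idx

# Hypothetical data: eight typical readings plus two large values.
data = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 15.0, 15.2]

# Statistic with both large values present.
r_both, idx = max_studentized_deviate(data)

# Statistic after removing the single most extreme observation.
reduced = data[:idx] + data[idx + 1:]
r_one, _ = max_studentized_deviate(reduced)

print(f"both large values present:  R = {r_both:.3f}")
print(f"after removing one of them: R = {r_one:.3f}")
```

A test that stops as soon as the first statistic fails to exceed its critical value could therefore miss both values. The GESD procedure avoids this by computing all r statistics before any decision is made.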
3.3
The GESD procedure is well suited to automation and can readily be implemented as a computer algorithm.
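As a rough indication of how such an implementation might look, the following Python sketch uses only the standard library. It is not the practice's normative procedure: the function names are hypothetical, and the critical values are assumed to be supplied by the caller (for example, from the values the practice prescribes). The sketch removes the most extreme observation r times, recording the studentized deviate at each stage, and then declares as outliers the observations removed up to the last stage whose statistic exceeds its critical value.

```python
from statistics import mean, stdev

def gesd_statistics(values, r):
    """Return [(R_i, original_index)] for i = 1..r, in removal order."""
    if r < 1 or r > len(values) - 2:
        raise ValueError("r must satisfy 1 <= r <= n - 2")
    remaining = list(enumerate(values))
    results = []
    for _ in range(r):
        xs = [x for _, x in remaining]
        m, s = mean(xs), stdev(xs)
        if s == 0:
            break  # all remaining values identical; nothing left to test
        # Position of the observation farthest from the current mean.
        pos = max(range(len(xs)), key=lambda k: abs(xs[k] - m))
        orig_idx, x = remaining.pop(pos)
        results.append((abs(x - m) / s, orig_idx))
    return results

def gesd_outliers(values, critical_values):
    """Return original indices of declared outliers.

    critical_values[i - 1] is the caller-supplied threshold for stage i;
    its length sets r, the maximum number of outliers considered.
    """
    stats = gesd_statistics(values, len(critical_values))
    n_outliers = 0
    for i, ((r_i, _), lam) in enumerate(zip(stats, critical_values), start=1):
        if r_i > lam:
            n_outliers = i  # keep the largest stage that exceeds its threshold
    return [idx for _, idx in stats[:n_outliers]]

# Hypothetical data and PLACEHOLDER thresholds (not values from the practice).
data = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 15.0, 15.2]
flagged = gesd_outliers(data, critical_values=[2.0, 2.5])
print(flagged)
```

Because the decision is based on the largest stage that exceeds its threshold, rather than on the first stage that fails, an early stage weakened by masking does not stop later stages from flagging observations.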