Standard Deviation: Metric of Result Dispersion

Standard deviation (σ) is a fundamental statistical metric characterizing the degree of dispersion of random variable values around the mathematical expectation. Formally, σ is defined as the square root of the variance: σ = √(D[X]) = √(E[(X − μ)²]), where μ = E[X]. For the sample standard deviation, Bessel’s correction is applied: s = √(Σ(xᵢ − x̄)² / (n − 1)), ensuring the unbiasedness of the variance estimate. The dimension of σ matches the dimension of the original variable, which makes it more intuitively interpretable compared to variance.

In the context of a normal distribution N(μ, σ²), the standard deviation acquires a special interpretation through the empirical rule (three-sigma rule). According to this rule: P(μ − σ ≤ X ≤ μ + σ) ≈ 0.6827, P(μ − 2σ ≤ X ≤ μ + 2σ) ≈ 0.9545, P(μ − 3σ ≤ X ≤ μ + 3σ) ≈ 0.9973. This means that for normally distributed data, going beyond 3σ is an event with a probability of less than 0.27%, which may indicate a systematic error or anomaly. For distributions other than normal, similar probabilistic boundaries are established via Chebyshev’s inequality: P(|X − μ| ≥ kσ) ≤ 1/k².

Confidence intervals for the mathematical expectation are constructed based on the standard deviation and sample size. With a known σ and normal distribution: μ ∈ [x̄ − z_{α/2}·σ/√n, x̄ + z_{α/2}·σ/√n] with probability 1 − α. With an unknown σ, Student’s t-distribution with (n − 1) degrees of freedom is applied: μ ∈ [x̄ − t_{α/2,n−1}·s/√n, x̄ + t_{α/2,n−1}·s/√n]. The width of the confidence interval is inversely proportional to √n, which determines the minimum sample size to achieve a given precision. To estimate σ with a relative error of δ, at least n ≈ 2/δ² observations are required.

Practical application of the standard deviation in analytical systems includes estimating short-term volatility and constructing forecasting models. The standard deviation of the sum of n independent identically distributed variables is σ_sum = σ·√n, which allows for forecasting the range of possible results for a series of iterations. The Z-score of a specific result is calculated as z = (x − μ·n) / (σ·√n) and shows how much the result has deviated from the expected one. Critical values of the z-score (|z| > 3) signal statistically anomalous deviations that require additional analysis of the generation algorithm integrity.

Verify Mathematical Equations