What Is Standard Deviation?
Standard deviation is one of those concepts where the intuition matters more than the arithmetic. If your whole class scores between 78 and 82 on a test, the SD barely breaks 1; if scores spread from 40 to 100, the SD lands north of 18. That gap tells you more about how a dataset behaves than the mean ever could on its own. The formula itself is straightforward. What trips people up is not the math but building a feel for what a "big" or "small" SD actually looks like in the context of their own data.
The best way to build that intuition is to look at real distributions side by side — two histograms with the same mean but wildly different shapes make the concept click faster than any formula derivation.
The Formulas
Population Standard Deviation (σ)
σ = √( Σ(xᵢ - μ)² / N )
Use when you've got the full dataset – every test score from a class or all player stats on a roster.
Sample Standard Deviation (s)
s = √( Σ(xᵢ - x̄)² / (n - 1) )
Use when you're dealing with a subset of a larger group. Dividing by (n - 1) instead of n is Bessel's correction: it removes the downward bias that comes from estimating the mean from the same data.
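Translated into code, the two formulas differ only in the denominator. A minimal Python sketch (the function names pop_sd and sample_sd are just illustrative):

```python
import math

def pop_sd(values):
    """Population SD: divide the squared deviations by N."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_sd(values):
    """Sample SD: divide by (n - 1), Bessel's correction."""
    xbar = sum(values) / len(values)
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (len(values) - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(pop_sd(scores))     # 2.0
print(sample_sd(scores))  # ≈ 2.138
```

Note that the sample SD always comes out slightly larger than the population SD on the same data, since it divides the same sum of squares by a smaller number.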
How to Calculate Standard Deviation by Hand
- Find the mean (average) of the data: add all values and divide by the count.
- Subtract the mean from each value to get the deviation from the mean.
- Square each deviation (this removes negative signs and gives more weight to larger deviations).
- Find the average of the squared deviations. Divide by N for population, or by (n - 1) for a sample. This gives you the variance.
- Take the square root of the variance. The result is the standard deviation.
Our calculator above does all of this automatically and shows you each step. Click "Show step-by-step solution" after entering your data.
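The five steps translate line for line into Python. Here is a sketch with a small invented dataset, cross-checked against the standard library's statistics module at the end:

```python
import statistics

data = [4, 8, 6, 2]

# Step 1: the mean.
mean = sum(data) / len(data)                      # 5.0

# Step 2: deviation of each value from the mean.
deviations = [x - mean for x in data]             # [-1.0, 3.0, 1.0, -3.0]

# Step 3: square each deviation.
squared = [d ** 2 for d in deviations]            # [1.0, 9.0, 1.0, 9.0]

# Step 4: average the squared deviations -> variance.
pop_variance = sum(squared) / len(data)           # 5.0   (divide by N)
sample_variance = sum(squared) / (len(data) - 1)  # ≈ 6.667 (divide by n - 1)

# Step 5: square root -> standard deviation.
pop_sd = pop_variance ** 0.5
sample_sd = sample_variance ** 0.5

# Cross-check against the standard library.
assert abs(pop_sd - statistics.pstdev(data)) < 1e-12
assert abs(sample_sd - statistics.stdev(data)) < 1e-12
```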
Population vs. Sample: Which One Do I Use?
Population standard deviation (σ) divides by N and covers situations where you have the entire group: every test score from a class, every player's height on a full roster. The moment you are working with a subset, that formula misleads you; dividing by N understates the true spread and makes your results look more precise than they deserve to be.
Sample standard deviation (s) kicks in whenever you are dealing with a subset of a larger group — survey 500 households out of 120,000 in a county and there is no version of that math where you can pretend you have the full population. Bessel's correction — dividing by (n-1) instead of n — exists because without it the estimate systematically underestimates the true spread, and that bias gets worse the smaller your sample is. If you are not measuring the entire population, you are working with a sample, and the sample formula is the one you want.
Stuck? Homework and research usually rely on sample standard deviation. When in doubt, go with (s).
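The bias is easiest to believe when you watch it happen. The simulation sketch below uses made-up numbers (a normal population of 10,000 values, repeated samples of five): averaging the divide-by-n estimates lands consistently below the true variance, while the Bessel-corrected estimates do not.

```python
import random
import statistics

random.seed(42)
population = [random.gauss(100, 15) for _ in range(10_000)]
true_var = statistics.pvariance(population)

naive, corrected = [], []
for _ in range(2_000):
    sample = random.sample(population, 5)
    m = statistics.fmean(sample)
    ssd = sum((x - m) ** 2 for x in sample)
    naive.append(ssd / 5)      # divide by n: biased low
    corrected.append(ssd / 4)  # divide by n - 1: Bessel's correction

print(statistics.fmean(naive))      # lands well below true_var
print(statistics.fmean(corrected))  # lands close to true_var
```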
Common Mistakes
- Bessel's correction mix-up: Dividing by n when you should use (n-1) is one of the most common errors in intro stats courses — every confidence interval downstream of that mistake comes out too narrow, and the smaller your sample the worse the distortion gets.
- Mixing up SD and variance: Variance is SD squared, so the units change — heights measured in centimeters produce a variance in cm², and reporting that as standard deviation makes no physical sense. The APA Publication Manual (7th ed.) flags this specifically because mislabeled units undermine the entire results section.
- Comparing SDs across different scales: An SD of 4.2 on a 1-to-50 scale and an SD of 4.1 on a 1-to-10 scale look equal until you realize the second one represents roughly five times the variability relative to its scale. The coefficient of variation (CV = SD/mean × 100%) handles this: it normalizes spread relative to the center so you can compare across completely different measurement systems.
- Assuming normality: The 68-95-99.7 rule only holds for data that is approximately normally distributed, and applying it to a right-skewed distribution produces prediction intervals that miss badly on one side. Chebyshev's theorem gives you the distribution-free fallback — at least 75 percent of values within two SDs regardless of shape, which is a weaker guarantee but one that never lies to you.
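The normality point is easy to check empirically. Below is a sketch with made-up right-skewed data (roughly exponential, standing in for something like billing amounts): the 68-percent figure from the empirical rule misses badly, while Chebyshev's 75-percent floor at two SDs still holds.

```python
import random
import statistics

random.seed(7)
# Made-up right-skewed data: roughly exponential with mean 500.
data = [random.expovariate(1 / 500) for _ in range(5_000)]

mean = statistics.fmean(data)
sd = statistics.pstdev(data)

within_1sd = sum(abs(x - mean) <= sd for x in data) / len(data)
within_2sd = sum(abs(x - mean) <= 2 * sd for x in data) / len(data)

print(f"within 1 SD: {within_1sd:.1%}")   # far above the 68% the empirical rule predicts
print(f"within 2 SDs: {within_2sd:.1%}")  # Chebyshev's >= 75% floor still holds
```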
Frequently Asked Questions
What is standard deviation?
The simplest way to think about it: standard deviation measures roughly how far a typical data point sits from the center of the set. Low SD means everything clusters tight around the mean; high SD means the values scatter across a wide range. That single number tells you more about the shape of your data than the average alone ever could.
What is the difference between population and sample standard deviation?
Population SD (σ) divides by N and only works when you have every single value in the group — the entire class, the full roster. Sample SD (s) divides by (n-1) instead, which is Bessel's correction for the bias that creeps in when you estimate spread from a subset. The safe default is always sample SD unless you are genuinely certain you have the complete population.
When should I use standard deviation vs variance?
SD keeps the same units as your original data — heights in centimeters stay in centimeters — which makes it far more intuitive to read in a report. Variance squares everything, so it ends up in cm², which is why formulas love it but humans find it confusing. The APA Publication Manual (7th ed.) recommends SD over variance when communicating results to anyone who is not a statistician.
What is a "good" standard deviation?
There is no single magic number — context decides everything. The coefficient of variation (CV = SD/mean × 100%) is the real tool here because it normalizes the spread relative to the center. Under 15-20 percent CV generally signals low variability, though clinical labs tighten that to below 5 percent because lives depend on the precision.
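As a sketch with invented numbers: two datasets in completely different units whose raw SDs are not comparable, but whose CVs are.

```python
import statistics

def cv(values):
    """Coefficient of variation: SD as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.fmean(values) * 100

reaction_ms = [210, 198, 225, 240, 205]      # made-up reaction times (ms)
daily_sales = [1200, 950, 1430, 1100, 1320]  # made-up sales figures ($)

print(f"{cv(reaction_ms):.1f}%")  # ≈ 7.0%: low relative variability
print(f"{cv(daily_sales):.1f}%")  # ≈ 13.9%: about twice as spread out, relatively
```

The raw SDs here (about 15 ms versus about 167 dollars) say nothing on their own; the CVs put both on the same footing.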
Can I calculate standard deviation in Excel?
Use =STDEV.S() for sample SD and =STDEV.P() for population SD — for example, =STDEV.S(A1:A10) gives you the sample standard deviation of cells A1 through A10. The legacy STDEV() function defaults to sample SD, which catches people off guard when they expected population.
What does the 68-95-99.7 rule mean?
For normally distributed data, roughly 68 percent of values fall within one SD of the mean, 95 percent within two, and 99.7 percent within three — but apply that rule to right-skewed data like hospital billing and your prediction intervals miss badly on one side. Chebyshev's theorem gives you the distribution-free fallback: at least 75 percent within two SDs regardless of shape.
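Those three percentages fall straight out of the normal CDF, which Python's standard library exposes via statistics.NormalDist. A quick sketch:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
for k in (1, 2, 3):
    coverage = z.cdf(k) - z.cdf(-k)
    print(f"within {k} SD(s): {coverage:.1%}")
# prints 68.3%, 95.4%, 99.7%
```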