How IQ Scoring Works: The Bell Curve, Z-Scores, and Percentiles
Here is a puzzle that trips up almost everyone who takes an IQ test: why is the average always exactly 100? Not 97, not 103, not whatever the raw score happened to be — always, stubbornly, 100. And if your score comes back as 115, what does that actually mean? Is it 15 points above average? 15% above average? A little better than most people? Way better than most people? The number itself doesn't tell you anything useful until you understand the machinery underneath it.
That machinery is built on a single beautiful idea from statistics: the normal distribution, better known as the bell curve. Once you see how IQ scoring rides on top of that curve — how raw test answers get converted into z-scores, how z-scores become IQ numbers, and how IQ numbers translate into percentiles — the whole system clicks into place. This guide walks you through the math step by step, with no prerequisites beyond middle-school algebra, so the next time you see an IQ number you'll know exactly what story it's telling.
The Bell Curve: IQ's Foundation
The normal distribution is one of those patterns that shows up again and again in nature. Measure the heights of every adult in a country, plot them on a graph, and you'll see a smooth hill: most people clustered around the middle, fewer people at the extremes, and a symmetrical shape on both sides. Do the same with shoe sizes, blood pressure readings, or the length of time it takes a random person to react to a flashing light, and you get roughly the same bell-shaped curve. Mathematicians call it "normal" because it's the default shape messy biological and psychological data tends to settle into.
IQ scoring uses this shape on purpose. When test designers develop an intelligence test, they don't expect everyone to score the same — they expect scores to spread out in a predictable way, with a dense peak near the average and thin tails at the extremes. The bell curve is tall and fat in the middle because most people's raw performance lands within a fairly narrow band. It tapers sharply at the edges because very high and very low scores are, by design, rare. This isn't a coincidence or a bias in the test. It's the whole point of the scoring system: IQ is built so that extreme scores are statistically uncommon.
Visually, imagine a hill. The exact center of the hill, its peak, is labeled 100. To the right, the ground falls away as you pass 115, 130, 145. To the left, it mirrors the same slope down through 85, 70, and 55. The farther you move from the center in either direction, the less ground there is under your feet, meaning fewer people score there. Every IQ number you've ever heard is just a point somewhere on this hill.
Standardization: Setting 100 as the Mean
So where does the number 100 come from? It's not pulled from a formula or some universal constant of intelligence. It's a deliberate convention set during a process called standardization. When a psychologist develops a new IQ test, the first thing they do is recruit a large, representative sample of people — thousands of test-takers chosen to mirror the population the test is meant to serve. Age, gender, education, and geography are all balanced so the sample looks like a miniature version of the real world.
That sample takes the test under controlled conditions, and the researchers record every raw score — the actual number of questions each person answered correctly, weighted however the test demands. Then they look at the distribution of those raw scores and draw the bell curve that fits them. Whatever the average raw score happens to be — maybe 73 correct answers, maybe 148, it doesn't matter — that average gets renamed "100." The spread of scores around that average gets rescaled so that one standard deviation of the raw spread equals 15 points on the new IQ scale (for Wechsler-style tests, anyway — more on different SD choices below).
From that point forward, every person who takes the test is compared not to an absolute standard of intelligence but to that original reference group. An IQ of 100 literally means "you scored at the exact middle of the standardization sample." An IQ of 115 means "you scored one standard deviation above the standardization sample's average." The score is always relative. This is why IQ scores are sometimes said to be normed against a population — because without that reference group, the number 100 means nothing at all.
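To make that rescaling concrete, here is a minimal Python sketch of the norming step, using a tiny made-up sample of raw scores in place of a real standardization sample:

```python
from statistics import mean, stdev

# Toy illustration of norming: a tiny, made-up "standardization sample" of
# raw scores is rescaled so that its own average becomes 100 and its own
# standard deviation becomes 15 IQ points.
raw_sample = [58, 61, 65, 70, 72, 73, 75, 78, 81, 87]   # hypothetical raw scores
mu, sigma = mean(raw_sample), stdev(raw_sample)

def to_iq(raw: float) -> float:
    """Convert a raw score to a Wechsler-style IQ relative to the sample."""
    return 100 + 15 * (raw - mu) / sigma

print(round(to_iq(mu)))            # 100 -- the sample average, by construction
print(round(to_iq(mu + sigma)))    # 115 -- one standard deviation above it
```

A real norming study uses thousands of test-takers and more careful weighting, but the arithmetic is the same: the reference group's own average defines 100, and its own spread defines the 15-point unit.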
Z-Scores: The Missing Middle Step
Between your raw test score and your final IQ number there's an intermediate value almost nobody talks about: the z-score. Z-scores are the universal language of statistics for comparing any measurement to any normal distribution. A z-score tells you, simply, how many standard deviations a value sits above or below the mean. A z of 0 means exactly average. A z of +1 means one standard deviation above average, whatever the original units were. A z of -2 means two standard deviations below.
The beauty of z-scores is that they strip away the arbitrary units of the original measurement. If you know someone's height z-score is +1.5, you know they're pretty tall regardless of whether the original data was in centimeters, inches, or cubits. The same move works for IQ: your raw test score can be converted into a z-score first, and from there into any IQ scale you like.
Z-Score Formula
z = (x − μ) / σ

where x = your raw score, μ = the population mean raw score, and σ = the population standard deviation of raw scores.
Here's a quick example. Suppose a test's standardization sample had an average raw score of 60 correct answers, with a standard deviation of 10. You take the test and get 75 correct. Plug in: z = (75 − 60) / 10 = +1.5. You scored one and a half standard deviations above the reference average. That single z-score is the bridge between the raw mess of your test performance and the familiar IQ number you're about to see.
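If you'd rather see that arithmetic as code, here is a minimal Python sketch using the same made-up numbers (mean raw score 60, standard deviation 10, your raw score 75):

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """How many standard deviations `raw` sits above or below `mean`."""
    return (raw - mean) / sd

print(z_score(75, 60, 10))   # 1.5 -> one and a half standard deviations above average
```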
Deviation IQ: From Z-Score to Familiar Number
Raw z-scores are great for statisticians, but they don't feel intuitive for most people. Saying "my z-score is +1" is technically meaningful but emotionally flat. That's where the deviation IQ formula comes in. It takes a z-score and rescales it onto a friendlier number line — one centered on 100, spreading out in convenient chunks of 15 points per standard deviation.
Deviation IQ Formula
IQ = 100 + (SD × z)

where z is the z-score from the standardization sample and SD is 15 for Wechsler, 16 for older Stanford-Binet, or 24 for the Cattell CFIT.
Let's walk through that z = +1.5 from the previous section, assuming we're using Wechsler conventions (SD = 15). Plug into the formula: IQ = 100 + (15 × 1.5) = 100 + 22.5 = 122.5. Round to 123, and that's the IQ number you'd see on your results page. Notice what happened: the math is completely transparent. Your "one and a half standard deviations above average" became a single tidy number that most people can interpret without thinking about statistics at all.
A z-score of exactly +1 becomes IQ 115. A z of +2 becomes IQ 130. A z of -1 becomes IQ 85. A z of 0 — exactly average — becomes the canonical IQ 100. The whole point of this conversion is to take a statistical concept most people don't instinctively grasp (standard deviations) and dress it up as a number most people can compare to a peer group (IQ).
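The conversion is simple enough to write in a couple of lines. This Python sketch uses the Wechsler convention (SD = 15) and reproduces the mappings above:

```python
def deviation_iq(z: float, sd: float = 15) -> float:
    """Rescale a z-score onto the familiar IQ number line (mean 100)."""
    return 100 + sd * z

print(deviation_iq(1.5))    # 122.5 -- reported as 123 once rounded
print(deviation_iq(1.0))    # 115.0
print(deviation_iq(-1.0))   # 85.0
print(deviation_iq(0.0))    # 100.0
```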
Why Different Tests Use Different SDs
You might have noticed that the deviation IQ formula has a choice baked into it — the value of SD. Different tests use different numbers, and it's worth understanding why. The Wechsler family of tests (WAIS for adults, WISC for children) and the modern Stanford-Binet 5 all use SD = 15. Older versions of the Stanford-Binet, before it was revised, used SD = 16. And the Cattell Culture Fair Intelligence Test (CFIT) famously uses SD = 24.
None of these choices is more "correct" than the others. They're historical conventions, each made by test designers who wanted slightly different things from their scoring scale. A smaller standard deviation (15 or 16) compresses the scale, so scores span a narrower numeric range and each point near the tails represents a bigger jump in rarity. A larger standard deviation (24, like Cattell) stretches the scale out, so the same underlying performance translates into a flashier-looking number at the high end.
This is why comparing IQ scores across tests is tricky. Someone who scores at the 98th percentile — a common bar for high-IQ societies like Mensa — will see different numbers depending on which test they took. On a Wechsler-scale test (SD 15), the 98th percentile lands around IQ 130. On an older Stanford-Binet (SD 16), it's about IQ 132. On a Cattell CFIT (SD 24), it jumps to around IQ 148. The underlying cognitive ability is the same; only the scaling changed. If you want to compare across tests, the percentile is the honest number. The IQ is a rescaling of that percentile through an arbitrary lens.
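A short Python sketch makes the rescaling point concrete: the same underlying performance (here z = 2, roughly the top-2% cutoff) turns into three different-looking numbers under the three SD conventions:

```python
def deviation_iq(z: float, sd: float) -> float:
    return 100 + sd * z

# Same z-score, three scoring conventions
for name, sd in [("Wechsler", 15), ("older Stanford-Binet", 16), ("Cattell CFIT", 24)]:
    print(f"{name} (SD {sd}): IQ {deviation_iq(2.0, sd):.0f}")
# Wechsler (SD 15): IQ 130
# older Stanford-Binet (SD 16): IQ 132
# Cattell CFIT (SD 24): IQ 148
```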
The 68-95-99.7 Rule in Practice
There's a rule every statistics student learns on day one about normal distributions, and it applies directly to IQ: the 68-95-99.7 rule, also called the empirical rule. It says that in any normal distribution, about 68% of the data falls within one standard deviation of the mean, about 95% falls within two standard deviations, and about 99.7% falls within three. This is a mathematical fact about the shape of the bell curve, not a feature of any particular test.
Translated into Wechsler-scale IQ (mean 100, SD 15), the rule paints a vivid picture of how common or rare different IQ levels really are:
| Range | Wechsler IQ | Approximate % of population |
|---|---|---|
| Within ±1 SD | 85 – 115 | ~68% |
| Within ±2 SD | 70 – 130 | ~95% |
| Within ±3 SD | 55 – 145 | ~99.7% |
That last row is worth staring at for a moment. Only about 0.3% of people score outside the range 55 to 145 on a Wechsler-style test. That's roughly 3 people in a thousand — on both tails combined. Scores at 160 or 170 are not only rare; they're statistically so unusual that most tests can't even measure them reliably (we'll come back to this under ceiling effects).
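You can check those percentages yourself from the shape of the curve alone. This Python sketch uses the standard library's NormalDist to compute the area within one, two, and three standard deviations of the mean:

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
for k in (1, 2, 3):
    share = std_normal.cdf(k) - std_normal.cdf(-k)   # area within ±k SD
    low, high = 100 - 15 * k, 100 + 15 * k
    print(f"Within ±{k} SD (Wechsler IQ {low}-{high}): {share:.1%}")
# Within ±1 SD (Wechsler IQ 85-115): 68.3%
# Within ±2 SD (Wechsler IQ 70-130): 95.4%
# Within ±3 SD (Wechsler IQ 55-145): 99.7%
```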
Percentiles: What 'Above Average' Actually Means
A percentile is the number that actually answers the question you probably care about: "How do I compare to everyone else?" Specifically, your percentile is the fraction of the reference population that scored at or below your score, expressed as a number from 0 to 100. An IQ at the 84th percentile means you scored higher than 84% of the standardization sample. Percentiles are the pure, scale-free way to talk about where you stand.
Mathematically, a percentile is the area under the bell curve to the left of your score. Since the total area under any probability curve adds up to 1 (or 100%), slicing off everything to the left of your score gives you your percentile directly. The center line at IQ 100 cuts the hill exactly in half, so IQ 100 = 50th percentile. As you move right, the leftward area grows, and your percentile climbs. Here are the landmark values on a Wechsler-scale test:
| IQ (Wechsler, SD 15) | Percentile | Plain English |
|---|---|---|
| 70 | ~2nd | Higher than about 2% of people |
| 85 | ~16th | Higher than about 16% of people |
| 100 | 50th | Exactly average |
| 115 | ~84th | Higher than about 84% of people |
| 130 | ~98th | Higher than about 98% of people |
| 145 | ~99.9th | Higher than about 99.9% of people |
This is why an IQ of 115 feels more impressive than "15 points above average" sounds: it means you scored higher than roughly five out of every six people in the reference group. It also explains why the jump from 130 to 145 is so much more dramatic than the jump from 100 to 115, even though both are 15 points. The bell curve is dense in the middle and thin at the edges, so every extra point at the tail represents a much larger leap in rarity. If you want a deeper look at how different tests set their own percentile ranks, our comparison of common IQ tests walks through the trade-offs side by side.
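The landmark table above can be reproduced directly from the normal curve. Here is a small Python sketch that computes the area to the left of each IQ value on a mean-100, SD-15 scale:

```python
from statistics import NormalDist

wechsler = NormalDist(mu=100, sigma=15)
for iq in (70, 85, 100, 115, 130, 145):
    # Percentile = area under the curve to the left of this score
    print(f"IQ {iq}: {wechsler.cdf(iq):.1%} of the reference group scores at or below")
# IQ 70: 2.3%   IQ 85: 15.9%   IQ 100: 50.0%
# IQ 115: 84.1%   IQ 130: 97.7%   IQ 145: 99.9%
```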
A Brief History: From Stern's Quotient to Wechsler's Deviation
The word "quotient" in "intelligence quotient" is a historical fossil. It dates to 1912, when German psychologist William Stern proposed a way to turn mental age — a concept borrowed from Alfred Binet's early intelligence tests — into a single number. Stern's idea was to divide a child's measured mental age by their chronological age and multiply by 100. A 10-year-old with the reasoning ability of an average 12-year-old had a mental quotient of 12/10 × 100 = 120. A 10-year-old with the reasoning of an average 8-year-old scored 8/10 × 100 = 80.
The 100 in Stern's formula is where the modern IQ's centered-on-100 convention comes from — an exact match between mental age and chronological age gives 100 by construction. Lewis Terman adapted Stern's quotient for his Stanford-Binet intelligence scale, which became the dominant IQ test in the English-speaking world through the 1920s and 1930s. For a while, every IQ score in the world really was a ratio of ages.
The ratio worked reasonably well for children, whose cognitive abilities grow predictably with age. But it fell apart for adults. Mental ability doesn't keep climbing linearly forever — it plateaus in early adulthood and then shifts in complicated ways. A 40-year-old with the reasoning of "a 60-year-old" isn't 50% smarter than average; the concept just doesn't make sense. David Wechsler spotted this problem and in 1939 published the Wechsler-Bellevue Intelligence Scale, which replaced the ratio quotient with something new: the deviation IQ. Instead of dividing by age, Wechsler compared each person to the standardization sample of their own age group and reported how far from that group's average they landed, measured in standard deviations. Every modern major IQ test uses some version of Wechsler's deviation-IQ approach. For the fuller history of Wechsler's scales and how they evolved into today's WAIS and WISC, our Wechsler IQ test guide goes into detail.
Scoring Limitations
The bell curve is an elegant model, but no real test is a perfect instrument. A few limitations are worth knowing about if you want to interpret IQ scores honestly.
- Ceiling effects. Every test has a maximum possible raw score — you can't answer more questions than exist. Once a taker nears that ceiling, the test can no longer distinguish them from other high scorers. This is why most online IQ tests (including ours) cap reported scores at 145 or so: above that point, the test simply doesn't have the measurement resolution to say anything meaningful.
- Floor effects. The mirror image. Tests also have a minimum possible score, and when someone scores at or near zero on the raw test, the conversion to IQ becomes a statistical shrug. A cautious test reports a minimum bound rather than pretending to know the exact IQ of someone who couldn't engage with the questions.
- Confidence intervals. Every measurement has error bars. Clinical IQ tests administered by a trained psychologist typically report a score as a point value plus or minus a margin representing a 95% confidence interval — acknowledging that if you took the same test on a different day you'd probably get a slightly different number. A reported IQ is a best estimate, not a fingerprint (see the sketch after this list for how such a margin is computed).
- Drifting reference populations. The standardization sample that anchors the number 100 isn't frozen in time. Measured IQ scores have risen across generations in many countries, a phenomenon known as the Flynn effect. That means older norms gradually over-score modern test-takers, and tests need periodic re-norming to stay honest. For the full story of why scores keep creeping upward and what it means, our explainer on the Flynn effect is the place to start.
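As a rough sketch of the confidence-interval arithmetic mentioned in the list above: one common convention uses the standard error of measurement, SEM = SD × √(1 − reliability), and reports the observed score plus or minus about 1.96 SEMs. The reliability figure of 0.90 below is purely illustrative, not a value from any particular test:

```python
import math

def iq_confidence_interval(observed_iq: float, reliability: float = 0.90,
                           sd: float = 15, z_crit: float = 1.96):
    """95% confidence interval around an observed IQ (illustrative reliability)."""
    sem = sd * math.sqrt(1 - reliability)   # standard error of measurement
    return observed_iq - z_crit * sem, observed_iq + z_crit * sem

low, high = iq_confidence_interval(115)
print(f"Observed IQ 115 -> 95% CI roughly {low:.0f} to {high:.0f}")   # roughly 106 to 124
```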
None of these limitations invalidates the underlying math. The bell curve, the z-score, and the deviation formula are all sound. They're just a reminder that the clean statistical model sits on top of a messy real-world measurement, and any good interpretation of an IQ score respects that gap.
How Our Tests Score You
Now that you know the full pipeline, here's exactly how our free online IQ test and its variants turn your clicks into a number. First, your raw accuracy on the test — how many questions you answered correctly, weighted by difficulty — gets compared to the expected distribution of raw scores for the test. That comparison produces a z-score: a measure of how many standard deviations your performance sits above or below the expected average. From there, the deviation IQ formula kicks in with SD = 15, the Wechsler convention, giving you a familiar-looking IQ number centered on 100.
Finally, the result is clamped between roughly 55 and 145 — the ±3 SD range that covers about 99.7% of the population. Scores outside that window exist in theory, but no short online test has the resolution to measure them reliably, so we refuse to invent precision we don't have. That honest ceiling is the reason you won't see a 170 come out of our test even if you answer every question correctly: the math behind this article tells us we shouldn't pretend we can measure that high.
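Put together, the pipeline looks roughly like the Python sketch below. The expected raw-score mean and standard deviation here (12 and 3) are placeholder values chosen for illustration, not the calibration the live test actually uses:

```python
from statistics import NormalDist

def score_pipeline(raw_correct: float, expected_mean: float = 12.0,
                   expected_sd: float = 3.0) -> dict:
    """Illustrative raw score -> z-score -> deviation IQ -> percentile pipeline."""
    z = (raw_correct - expected_mean) / expected_sd    # z-score step
    iq = 100 + 15 * z                                  # Wechsler-style deviation IQ
    iq = max(55.0, min(145.0, iq))                     # clamp to the ±3 SD window
    percentile = NormalDist(mu=100, sigma=15).cdf(iq)  # area to the left of the score
    return {"z": round(z, 2), "iq": round(iq), "percentile": round(percentile * 100, 1)}

print(score_pipeline(16))   # e.g. {'z': 1.33, 'iq': 120, 'percentile': 90.9}
```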
If you'd like to try it for yourself — or see how your score maps back onto the bell curve, z-score, and percentile concepts you just learned — our main IQ test is free, takes about 10–15 minutes, and returns a full breakdown with your percentile, classification, and where you sit on the distribution. Understanding the math first makes the result a lot more useful when it arrives.
Try Our IQ Test
Take a free online IQ test with 18 timed questions across pattern recognition, number sequences, verbal analogies, and logical reasoning. Get your estimated IQ score, percentile rank, bell curve visualization, and score comparison across Wechsler, Stanford-Binet, and Cattell scales.