Algebra, Functions, and Data Analysis

3.03 Probability and the normal curve

Normal distribution probability

When data follow a normal probability distribution, approximately 68\% of the observations lie within one standard deviation of the mean, 95\% lie within two standard deviations of the mean, and 99.7\% lie within three standard deviations of the mean.

This means most data points cluster around the average (mean), and data points become less common the farther they are from the average. The graph of a normal distribution has a bell shape.

A graph of a normal curve with standard deviations labeled from -3 to positive 3. Area between -2 and positive 2 is labeled 95% in this region. Area between -1 and positive 1 is labeled 68% in this region.

Think of a data set as a collection of measurements (like test scores, heights, etc.). To compare scores from different situations easily, we often standardize them. This uses the mean and standard deviation of the original data.

To convert a raw data value (x) from a population with a known mean (\mu) and standard deviation (\sigma) into a standardized score, or z-score, we use this formula:

\displaystyle z=\dfrac{x-\mu}{\sigma}
  • \bm{z}: The z-score, which tells us how many standard deviations a value is from the mean
  • \bm{x}: The individual data value (raw score)
  • \bm{\mu}: The population mean
  • \bm{\sigma}: The population standard deviation

A negative z-score indicates the data value is below the mean, while a positive z-score indicates it is above the mean. A z-score of 0 means the value is exactly the mean.
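The formula and the sign rule can be sketched in Python. The function name `z_score` and the sample numbers (a data set with mean 70 and standard deviation 8) are made up for illustration and do not come from the lesson.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma

# Hypothetical values, chosen only to show the sign of the result.
above = z_score(82, 70, 8)    # value above the mean: positive z-score
below = z_score(58, 70, 8)    # value below the mean: negative z-score
at_mean = z_score(70, 70, 8)  # value equal to the mean: z-score of 0

print(above, below, at_mean)
```

The two values that are the same distance from the mean get z-scores of equal size and opposite sign.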

The standard normal distribution is a special normal distribution with a mean (\mu) of 0 and a standard deviation (\sigma) of 1. When we calculate z-scores, we are converting our data to fit this standard scale.

An important rule is that the total area under any normal curve always equals 1 (or 100\%). This represents the total probability of all possibilities.
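One way to check the total-area rule numerically is to add up thin strips under the standard normal curve. The sketch below is only an illustration: the density formula is the usual one for the standard normal curve (it is not given in this lesson), the helper names are made up, and the range from z = -8 to z = 8 is an assumed stand-in for the whole number line.

```python
import math

def standard_normal_pdf(z):
    """Height of the standard normal curve at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def area_under_curve(lower, upper, steps=10_000):
    """Approximate the area between lower and upper with the trapezoid rule."""
    width = (upper - lower) / steps
    total = 0.5 * (standard_normal_pdf(lower) + standard_normal_pdf(upper))
    for i in range(1, steps):
        total += standard_normal_pdf(lower + i * width)
    return total * width

# Beyond 8 standard deviations the curve is so close to zero that the
# missing tails do not change the result at this precision.
total_area = area_under_curve(-8, 8)
print(round(total_area, 6))
```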

We can represent the probability of an event occurring within a certain range as the area under the standard normal curve corresponding to that range of z-scores. For example:

  • The probability that a z-score is less than a specific value a, written as P(Z \lt a), corresponds to the area under the curve to the left of a.
  • The probability that a z-score is greater than a specific value b, written as P(Z \gt b), corresponds to the area under the curve to the right of b.
  • The probability that a z-score falls between two values a and b, written as P(a \lt Z \lt b), corresponds to the area under the curve between a and b.

Visualizing these areas helps in understanding and calculating probabilities.
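The three kinds of area can also be computed with software. The sketch below assumes Python 3.8 or later, whose standard library includes `statistics.NormalDist`; the z-scores `a` and `b` are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # the standard normal distribution

a, b = -0.5, 1.2  # hypothetical z-scores, for illustration only

p_left = Z.cdf(a)                # P(Z < a): area to the left of a
p_right = 1 - Z.cdf(b)           # P(Z > b): area to the right of b
p_between = Z.cdf(b) - Z.cdf(a)  # P(a < Z < b): area between a and b

# The three regions do not overlap and together cover the whole curve,
# so their areas should add up to 1.
print(round(p_left + p_between + p_right, 10))
```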

Three standard normal curves. The first shows the area to the left of 'a' shaded, labeled P(Z < a). The second shows the area between 'a' and 'b' shaded, labeled P(a < Z < b). The third shows the area to the right of 'b' shaded, labeled P(Z > b).

The Empirical Rule (68-95-99.7 rule) seen earlier describes the approximate areas within 1, 2, and 3 standard deviations of the mean for any normal distribution, including the standard normal distribution (where the standard deviations correspond directly to z-scores of \pm 1, \pm 2, and \pm 3).
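As a rough check of how approximate the Empirical Rule is, the areas within 1, 2, and 3 standard deviations can be computed directly. This sketch assumes Python 3.8 or later (`statistics.NormalDist`); the variable name `within` is made up.

```python
from statistics import NormalDist

Z = NormalDist()  # mean 0, standard deviation 1

# Area between z = -k and z = k for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

To four decimal places the areas are 0.6827, 0.9545, and 0.9973, which the rule rounds to 68\%, 95\%, and 99.7\%.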

Tables for the standard normal distribution, often called z-tables, allow us to find the area under the curve associated with specific z-scores, which corresponds to probabilities. Different tables show areas differently. The table used in the examples in this section shows the area between the mean (z=0) and a positive z-score. To find other areas (probabilities), we need to use the properties of the normal curve:

  • The curve is symmetrical about the mean (z=0).
  • The total area under the curve is 1.
  • The area to the left of the mean is 0.5, and the area to the right is 0.5.

Using these properties and the table (which gives the area from 0 to z), we can find:

  • Area for P(Z \lt z) (a positive z): Add 0.5 to the table value. (Think: Area of the entire left half (0.5) plus the piece from 0 to z given by the table).
  • Area for P(Z \gt z) (a positive z): Subtract the table value from 0.5. (Think: Area of the entire right half (0.5) minus the piece from 0 to z given by the table).
  • Area for P(Z \lt -z) (a negative z): Use symmetry. This area is the same as P(Z \gt z). So, calculate: 0.5 - \text{TableValue}(z).
  • Area for P(Z \gt -z) (a negative z): Use symmetry. This area is the same as P(Z \lt z). So, calculate: 0.5 + \text{TableValue}(z).
  • Area between two z-scores: This depends on the signs of the z-scores. For P(a \lt Z \lt b) with 0 \lt a \lt b, both z-scores are on the same side of the mean, so subtract: \text{TableValue}(b) - \text{TableValue}(a). For P(-a \lt Z \lt b) with a and b both positive, the z-scores are on opposite sides of the mean, so add: \text{TableValue}(a) + \text{TableValue}(b).
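The rules in the list above can be sketched as small Python functions. This is an illustration under assumptions, not the lesson's method: `table_value` stands in for the printed table by computing the area from 0 to z with `statistics.NormalDist` (Python 3.8 or later), all function names are made up, and the case where both z-scores are negative is added here by the same symmetry argument.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_value(z):
    """Area between the mean (z = 0) and a non-negative z-score."""
    if z < 0:
        raise ValueError("the table only lists non-negative z-scores")
    return Z.cdf(z) - 0.5

def area_left(z):
    """P(Z < z): add to 0.5 for positive z, subtract for negative z."""
    t = table_value(abs(z))
    return 0.5 + t if z >= 0 else 0.5 - t

def area_right(z):
    """P(Z > z): subtract from 0.5 for positive z, add for negative z."""
    t = table_value(abs(z))
    return 0.5 - t if z >= 0 else 0.5 + t

def area_between(a, b):
    """P(a < Z < b) for a < b, adding or subtracting based on the signs."""
    ta, tb = table_value(abs(a)), table_value(abs(b))
    if a >= 0:      # both z-scores at or right of the mean
        return tb - ta
    if b <= 0:      # both z-scores at or left of the mean
        return ta - tb
    return ta + tb  # z-scores on opposite sides of the mean

print(round(area_left(-1.0), 4), round(area_between(-1.0, 2.0), 4))
```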

Examples

Example 1

The table shows the area under the standard normal curve between 0 and a given z-score. Use this table to find the probability that a variable has a z-score less than z=0.84. Give your answer to four decimal places.

A table showing the area under the standard normal curve between 0 and a given z-score. Rows are tenths of z-scores, and columns are hundredths.
Worked Solution
Create a strategy
A standard normal curve showing the area from z=0 to z=0.84 shaded (table value) and the area left of z=0 shaded (0.5). The total shaded area represents P(Z < 0.84).


The probability needed is P\left(Z<0.84\right).

This represents the area under the standard normal curve to the left of z=0.84. The table gives the area between z=0 and z=0.84. Since we want the total area to the left, and the area to the left of z=0 is 0.5, we need to add 0.5 to the value found in the table.

Apply the idea

Using the table, find the row 0.8 and the column 0.04.

A z-score table. The value 0.2996 is circled at the intersection of the 0.8 row and the 0.04 column.

The table value, 0.2996, represents the area between z=0 and z=0.84. As per the strategy, we add the area left of the mean (0.5) to this value:

P(Z \lt 0.84) = 0.5 + 0.2996 = 0.7996

The probability is 0.7996, or 79.96\%.

Example 2

A sprinter is training for a national competition. She runs 400\text{ m} in an average time of 75 seconds, with a standard deviation of 6 seconds. Assume her times are normally distributed.

Use the table showing the area under the standard normal curve between 0 and a given z-score to answer the questions.

A table showing the area under the standard normal curve between 0 and a given z-score. Rows are tenths of z-scores, and columns are hundredths.
a

Determine the z-score of a time of 67 seconds. Round your answer to two decimal places.

Worked Solution
Create a strategy

To find the z-score, we can use the formula z=\dfrac{x-\mu}{\sigma} with the given time in seconds (x=67), the mean (\mu=75), and the standard deviation (\sigma=6) of time in seconds.

Apply the idea

Substitute the known values into the z-score formula:

\begin{aligned}
z &= \dfrac{67-75}{6} && \text{Substitute } x,\ \mu,\ \text{and } \sigma \\
&= \dfrac{-8}{6} && \text{Simplify numerator} \\
&= -1.333\ldots && \text{Evaluate}
\end{aligned}

Rounding to two decimal places, the z-score of a time of 67 seconds is -1.33.

b

Find the probability that the sprinter runs 400\text{ m} in less than 67 seconds. That is, find P\left(x<67\right). Round your answer to four decimal places.

Worked Solution
Create a strategy
Standard normal curve showing symmetry. The area left of z=-1.33 is shaded, and the equal area right of z=+1.33 is also indicated. Calculation shown as 0.5 minus the area from 0 to 1.33.

From part (a), the z-score for x=67 is z \approx -1.33. So we need to find P\left(Z\lt-1.33\right).

This represents the area under the standard normal curve to the left of z=-1.33. Because the curve is symmetrical, this area is equal to the area to the right of z=1.33. The table gives the area between z=0 and z=1.33. The total area to the right of z=0 is 0.5. Therefore, the area to the right of z=1.33 can be found by subtracting the table value from 0.5.

Apply the idea

Using the table for the positive value z=1.33, find row 1.3 and column 0.03.

A positive z-score table. The value 0.4082 is circled at the intersection of the 1.3 row and the 0.03 column.

The table value for z=1.33 is 0.4082. This represents the area between z=0 and z=1.33. As outlined in the strategy, we subtract this from 0.5 to find the area left of z=-1.33:

P(Z\lt-1.33) = P(Z>1.33) = 0.5 - P(0 \lt Z \lt 1.33)

P(Z \lt -1.33) = 0.5-0.4082=0.0918

The probability that it takes the runner less than 67 seconds to run 400\text{ m} is 0.0918, or 9.18\%.
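The table method rounds the z-score to two decimal places before the lookup. The sketch below, which assumes Python 3.8 or later (`statistics.NormalDist`) and uses made-up variable names, compares that with working from the unrounded z-score.

```python
from statistics import NormalDist

times = NormalDist(mu=75, sigma=6)  # the sprinter's 400 m times, in seconds

# Uses z = -8/6 without rounding it first.
p_unrounded = times.cdf(67)

# Uses z rounded to -1.33, as in the table lookup.
p_rounded_z = NormalDist().cdf(-1.33)

print(round(p_unrounded, 4), round(p_rounded_z, 4))
```

The rounded z-score reproduces the table answer of 0.0918, while the unrounded z-score gives a slightly smaller value of about 0.091. Small differences like this are expected whenever a z-score is rounded to fit a table.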

c

Find the probability that the sprinter runs 400\text{ m} between 70 and 80 seconds. That is, find P(70 \lt x \lt 80). Round your answer to four decimal places.

Worked Solution
Create a strategy

To find the probability of the sprinter running between 70 and 80 seconds, we first need to convert these times into z-scores. We'll use the mean time of 75 seconds and a standard deviation of 6 seconds for this conversion.

The formula for calculating a z-score is z=\dfrac{x-\mu}{\sigma}, where x is the specific time, \mu is the mean, and \sigma is the standard deviation.

Once we have the z-scores for both 70 and 80 seconds, we can use a standard normal distribution table (z-table) to find the area under the curve between these two z-scores. This area represents the desired probability.

Since one z-score will be negative and the other positive, the probability will be the sum of the area between the negative z-score and the mean (z=0), and the area between the mean (z=0) and the positive z-score.

Due to the symmetry of the normal distribution, the area corresponding to a negative z-score is the same as the area for its positive counterpart in the z-table. Therefore, we will find the area associated with the absolute value of the negative z-score and the area associated with the positive z-score in the z-table and add these two areas together.

Apply the idea

First, let's calculate the z-score for a time of 70 seconds:

\begin{aligned}
z_1 &= \dfrac{70-75}{6} && \text{Substitute } x=70,\ \mu=75,\ \sigma=6 \\
&= \dfrac{-5}{6} \\
&\approx -0.83 && \text{Evaluate and round to two decimal places}
\end{aligned}

Next, let's calculate the z-score for a time of 80 seconds:

\begin{aligned}
z_2 &= \dfrac{80-75}{6} && \text{Substitute } x=80,\ \mu=75,\ \sigma=6 \\
&= \dfrac{5}{6} \\
&\approx 0.83 && \text{Evaluate and round to two decimal places}
\end{aligned}

We need to find the probability P(-0.83 \lt Z \lt 0.83). This probability is equal to the area under the standard normal curve between these two z-scores.

This area can be found by adding the area from Z = -0.83 to Z = 0 and the area from Z = 0 to Z = 0.83.

Using the z-table, the area between Z = 0 and Z = 0.83 is approximately 0.2967.

Due to the symmetry of the standard normal distribution, the area between Z = -0.83 and Z = 0 is the same as the area between Z = 0 and Z = 0.83, which is also 0.2967.

Therefore, the probability is:

P(-0.83 \lt Z \lt 0.83) = 0.2967 + 0.2967 = 0.5934

The probability that the sprinter runs between 70 and 80 seconds is approximately 0.5934, or 59.34\%.
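Converting to z-scores does not change the area; it only moves the problem onto the standard scale. The sketch below illustrates this, assuming Python 3.8 or later (`statistics.NormalDist`); the variable names are made up, and the four-decimal rounding of the table value imitates a printed table.

```python
from statistics import NormalDist

times = NormalDist(mu=75, sigma=6)  # raw times in seconds
Z = NormalDist()                    # standard normal distribution

# Area computed directly on the raw scale.
p_raw = times.cdf(80) - times.cdf(70)

# Same area computed from unrounded z-scores on the standard scale.
z1 = (70 - 75) / 6
z2 = (80 - 75) / 6
p_standard = Z.cdf(z2) - Z.cdf(z1)

# Table-style calculation: z rounded to 0.83, table value to four decimals.
table_083 = round(Z.cdf(0.83) - 0.5, 4)
p_table = table_083 + table_083

print(round(p_raw, 4), round(p_standard, 4), round(p_table, 4))
```

The first two results agree with each other, and the table-style result matches 0.5934 from the worked solution. The small gap between them (about 0.002) comes from rounding the z-scores to two decimal places.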

d

The value 0.0918 represents the probability that:

A
The sprinter will run 400 \text{ m} in less than 67 seconds.
B
The sprinter will run 400 \text{ m} in exactly 67 seconds.
C
The sprinter will run 400 \text{ m} in more than 67 seconds.
Worked Solution
Create a strategy

Refer back to the probability statement calculated in part (b).

Apply the idea

In part (b), we calculated the probability P\left(x \lt 67\right), which means the probability that the time (x) is less than 67 seconds. This value was found to be 0.0918.

This represents the likelihood that a randomly selected run time for 400 \text{ m} will be less than 67 seconds. Therefore, the correct interpretation is option A.

Note: In a continuous distribution such as the normal distribution, the probability of any single exact value (like running in exactly 67 seconds) is zero, so P(x \lt 67) and P(x \leq 67) are equal.

Example 3

The mean height of an adult male is 1.78\text{ m}, with a standard deviation of 9\text{ cm}. Assume heights are normally distributed.

a

Find the z-score of a height of 1.69\text{ m}.

Worked Solution
Create a strategy

To find the z-score, we can use the formula z=\dfrac{x-\mu}{\sigma} with the given height (x), the mean height (\mu), and the standard deviation of height (\sigma). Ensure all units are consistent before calculating.

Apply the idea

From the given information, we have x=1.69 \text{ m}, \mu=1.78 \text{ m}, and \sigma=9\text{ cm}.

Convert heights to centimeters (cm) so all units match the standard deviation. Recall 1 \text{ m} = 100 \text{ cm}.

\begin{aligned}
x &= 1.69 \text{ m} \cdot 100 \dfrac{\text{cm}}{\text{m}} = 169 \text{ cm} \\
\mu &= 1.78 \text{ m} \cdot 100 \dfrac{\text{cm}}{\text{m}} = 178 \text{ cm} \\
\sigma &= 9 \text{ cm}
\end{aligned}

Now calculate the z-score using values in cm:

\begin{aligned}
z &= \dfrac{169-178}{9} && \text{Substitute the values in cm} \\
&= \dfrac{-9}{9} && \text{Simplify numerator} \\
&= -1 && \text{Evaluate}
\end{aligned}

The z-score of a height of 1.69\text{ m} is -1.

b

If 700 adult males are chosen at random, find the approximate number of males who are taller than 1.69\text{ m}. Round your answer to the nearest whole number.

Worked Solution
Create a strategy

First, find the probability that a single randomly chosen male is taller than 1.69 \text{ m}. This corresponds to finding P(X > 1.69). From part (a), the z-score for 1.69 \text{ m} is z=-1. So, we need to calculate P(Z > -1) using a z-table.

Standard normal curve showing symmetry. The area right of z=-1 is shaded, and the equal area left of z=+1 is also indicated. Calculation shown as 0.5 plus the area from 0 to 1.


This represents the area under the standard normal curve to the right of z=-1. Because the curve is symmetrical, this area is equal to the area to the left of z=1. The table gives the area between z=0 and z=1. The total area to the left of z=0 is 0.5. Therefore, the area to the left of z=1 can be found by adding 0.5 to the table value for z=1.

Once we have this probability, multiply it by the total number of males (700) to find the expected number of males taller than 1.69\text{ m}. Round the final answer to the nearest whole number.

Apply the idea
A z-score table. The value .3413 is circled and it is on the 1.0 row, .00 column.

As explained in the strategy, we need to find P(Z \gt -1), which by symmetry is equal to P(Z \lt 1).

Using the table, find the area corresponding to z=1.00 (Row 1.0, Column 0.00). The table value is 0.3413.

This table value (0.3413) is the area between z=0 and z=1.00. To get the total area to the left of z=1.00 (which equals the area right of z=-1), we add the area left of the mean (0.5):

P(Z \gt -1) = P(Z \lt 1) = 0.5 + P(0 \lt Z \lt 1)

P(Z \gt -1) = 0.5 + 0.3413 = 0.8413

The probability that a randomly chosen male is taller than 1.69\text{ m} is 0.8413.

Now, multiply this probability by the number of males in the group (700):

\begin{aligned}
\text{Expected number} &\approx 0.8413 \cdot 700 && \text{Multiply probability by total number} \\
&= 588.91 && \text{Evaluate}
\end{aligned}

Rounding to the nearest whole number, we expect approximately 589 males to be taller than 1.69 \text{ m}.
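The same two steps (find the probability, then scale by the group size) can be sketched in Python. This assumes Python 3.8 or later (`statistics.NormalDist`); the variable names are made up, and the heights are entered in centimeters so the units match.

```python
from statistics import NormalDist

heights_cm = NormalDist(mu=178, sigma=9)  # adult male heights in centimeters

# P(X > 169 cm), which is the same area as P(Z > -1).
p_taller = 1 - heights_cm.cdf(169)

# Expected count in a random group of 700 males.
expected = 700 * p_taller

print(round(p_taller, 4), round(expected))
```

This agrees with the table-based result: a probability of 0.8413 and an expected count of 589.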

Example 4

IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. Using the Empirical Rule (68-95-99.7), estimate the percentage of people with an IQ score between 70 and 130.

Worked Solution
Create a strategy

The Empirical Rule states that approximately 68\% of data falls within 1 standard deviation, 95\% within 2 standard deviations, and 99.7\% within 3 standard deviations of the mean in a normal distribution.

First, determine how many standard deviations away from the mean the values 70 and 130 are. Then, apply the appropriate percentage from the Empirical Rule.

A graph of a normal curve with standard deviations labeled from -3 to positive 3. Area between -2 and positive 2 is labeled 95% in this region. Area between -1 and positive 1 is labeled 68% in this region.
Apply the idea

The mean (\mu) is 100 and the standard deviation (\sigma) is 15.

Consider the lower value, 70: The difference from the mean is 70 - 100 = -30. Divide this by the standard deviation: \dfrac{-30}{15} = -2. So, 70 is 2 standard deviations below the mean (z=-2).

Consider the upper value, 130: The difference from the mean is 130 - 100 = 30. Divide this by the standard deviation: \dfrac{30}{15}= 2. So, 130 is 2 standard deviations above the mean (z=2).

The range between 70 and 130 corresponds to the range within 2 standard deviations of the mean (\mu \pm 2\sigma, or -2 \leq z \leq 2).

According to the Empirical Rule, approximately 95\% of the data falls within 2 standard deviations of the mean.

Therefore, about 95\% of people have an IQ score between 70 and 130.
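The reasoning in this example (count the standard deviations on each side, then pick the matching percentage) can be sketched as a small lookup. The function name `empirical_estimate` and the dictionary are made up for illustration, and the function only handles intervals that are symmetric about the mean at exactly 1, 2, or 3 standard deviations.

```python
import math

# Percent of data within k standard deviations of the mean.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_estimate(low, high, mu, sigma):
    """Return the Empirical Rule percentage for mu ± k·sigma, or None."""
    k_low = (mu - low) / sigma    # standard deviations below the mean
    k_high = (high - mu) / sigma  # standard deviations above the mean
    k = round(k_high)
    symmetric = math.isclose(k_low, k_high)
    whole_number = math.isclose(k_high, k)
    if symmetric and whole_number and k in EMPIRICAL_RULE:
        return EMPIRICAL_RULE[k]
    return None  # the rule does not apply directly to this interval

print(empirical_estimate(70, 130, 100, 15))
```

For an interval such as 80 to 130, which is not symmetric about the mean, the function returns `None`, and a table or calculator would be needed instead.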

Example 5

Using the provided z-table (area from 0 to z), find the probability P(Z \gt 1.25). Give your answer to four decimal places.

A table showing the area under the standard normal curve between 0 and a given z-score. Rows are tenths of z-scores, and columns are hundredths.
Worked Solution
Create a strategy
Standard normal curve showing the area right of z=1.25 shaded. Calculation indicated as 0.5 minus the area from 0 to 1.25.


We need to find the area under the standard normal curve to the right of z = 1.25. The table gives the area between z=0 and z=1.25. The total area to the right of the mean (z=0) is 0.5. To find the area to the right of z=1.25, we subtract the table value (area from 0 to 1.25) from 0.5.

Apply the idea

First, find the area corresponding to z=1.25 in the table. Look for the row 1.2 and the column 0.05. The table value is 0.3944. This is the area between z=0 and z=1.25.

The positive z score table. The value 0.3944 is circled and it is on the 1.2 row, 0.05 column.

We want the area to the right of z=1.25, so we subtract the table value from 0.5:

P(Z > 1.25) = 0.5 - P(0 \lt Z \lt 1.25)

P(Z > 1.25) = 0.5 - 0.3944 = 0.1056

The probability is 0.1056.

Example 6

Using the provided z-table (area from 0 to z), find the probability P(0.50 \lt Z \lt 1.50). Give your answer to four decimal places.

A table showing the area under the standard normal curve between 0 and a given z-score. Rows are tenths of z-scores, and columns are hundredths.
Worked Solution
Create a strategy

We need to find the area under the standard normal curve between z=0.50 and z=1.50. The table gives the area from the mean (z=0) to a specific z-score.

We can find the area from z=0 to z=1.50 using the table. We can also find the area from z=0 to z=0.50 using the table.

The area between z=0.50 and z=1.50 is the difference between these two areas: (Area from 0 to 1.50) - (Area from 0 to 0.50).

A standard normal curve showing the area between z=0.50 and z=1.50 shaded. The calculation is shown as the area from 0 to 1.50 minus the area from 0 to 0.50.
Apply the idea
The positive z score table. Values 0.4332 (1.5 row, 0.00 column) and 0.1915 (0.5 row, 0.00 column) are circled.


Find the area for z=1.50: Row 1.5, Column 0.00. Table value = 0.4332. This is P(0 \lt Z \lt 1.50).

Find the area for z=0.50: Row 0.5, Column 0.00. Table value = 0.1915. This is P(0 \lt Z \lt 0.50).

Subtract the smaller area from the larger area:

P(0.50 \lt Z \lt 1.50) = P(0 \lt Z \lt 1.50) - P(0 \lt Z \lt 0.50)

P(0.50 \lt Z \lt 1.50) = 0.4332 - 0.1915 = 0.2417

The probability is 0.2417.
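The table lookups in Examples 5 and 6 can be imitated in code. In this sketch, `table_value` is a made-up helper that computes the area from 0 to z and rounds it to four decimal places the way a printed table would; it assumes Python 3.8 or later (`statistics.NormalDist`).

```python
from statistics import NormalDist

def table_value(z):
    """Imitate a printed table: area from 0 to z, rounded to four decimals."""
    return round(NormalDist().cdf(z) - 0.5, 4)

# Example 5: P(Z > 1.25) = 0.5 - TableValue(1.25)
p_example_5 = round(0.5 - table_value(1.25), 4)

# Example 6: P(0.50 < Z < 1.50) = TableValue(1.50) - TableValue(0.50)
p_example_6 = round(table_value(1.50) - table_value(0.50), 4)

print(p_example_5, p_example_6)
```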

Idea summary

Probabilities for a standard normal distribution are represented by areas under its curve. The total area is 1.

A z-table helps find these areas. The table used here gives the area between the mean (z=0) and a positive z-score.

To find probabilities (areas) using this type of table:

  • P(Z \lt z) (area left of positive z): 0.5 + \text{TableValue}(z)
  • P(Z \gt z) (area right of positive z): 0.5 - \text{TableValue}(z)
  • P(Z \lt -z) (area left of negative z): Use symmetry, same as P(Z \gt z), so 0.5 - \text{TableValue}(z).
  • P(Z \gt -z) (area right of negative z): Use symmetry, same as P(Z \lt z), so 0.5 + \text{TableValue}(z).
  • P(a \lt Z \lt b): Often involves adding or subtracting table values based on the signs of a and b.

We need to determine the z-score first if starting with raw data, using z=\dfrac{x-\mu}{\sigma}.

Determine normal distribution probability using a calculator

Graphing calculators and statistical software provide functions to calculate normal distribution probabilities directly, often more accurately and quickly than using tables. A common function is the Normal Cumulative Distribution Function (often abbreviated as normCdf, normalCdf, or similar).

This function calculates the area under the normal curve between a specified lower boundary and upper boundary for a given mean (\mu) and standard deviation (\sigma).

For the standard normal distribution, we always use:

  • Mean (\mu) = 0
  • Standard Deviation (\sigma) = 1

To calculate different types of probabilities using normCdf(lower bound, upper bound, mean, standard deviation):

  • Probability less than a value a (P(Z \lt a)):
    • Lower bound: A very large negative number (e.g., -10^{99}, -1E99, or -9999 depending on calculator input style). (This acts like negative infinity, covering everything far to the left).
    • Upper bound: a
    • Mean: 0
    • Standard deviation: 1
    • Example syntax: normCdf(-1E99, a, 0, 1)
  • Probability greater than a value b (P(Z \gt b)):
    • Lower bound: b
    • Upper bound: A very large positive number (e.g., 10^{99}, 1E99, or 9999). (This acts like positive infinity, covering everything far to the right).
    • Mean: 0
    • Standard deviation: 1
    • Example syntax: normCdf(b, 1E99, 0, 1)
  • Probability between two values a and b (P(a \lt Z \lt b)):
    • Lower bound: a
    • Upper bound: b
    • Mean: 0
    • Standard deviation: 1
    • Example syntax: normCdf(a, b, 0, 1)

Consult the calculator's manual or help resources to find the exact location and syntax for its normal distribution functions.
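For readers without a graphing calculator, the same four-input pattern can be imitated in Python. This is only a sketch: `norm_cdf` is a made-up name and is not the API of any particular calculator, and it relies on `statistics.NormalDist` from Python 3.8 or later.

```python
from statistics import NormalDist

def norm_cdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

a, b = -1.0, 2.0  # hypothetical z-scores, for illustration only

p_less = norm_cdf(-1e99, a)    # P(Z < a): lower bound stands in for negative infinity
p_greater = norm_cdf(b, 1e99)  # P(Z > b): upper bound stands in for positive infinity
p_between = norm_cdf(a, b)     # P(a < Z < b)

print(round(p_less, 4), round(p_greater, 4), round(p_between, 4))
```

The mean and standard deviation default to 0 and 1 here, so the last two arguments can be left out for the standard normal distribution.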

Examples

Example 7

Using a calculator, find the area under the standard normal curve between 1.30 and 1.70 standard deviations above the mean. Give your answer to four decimal places.

Worked Solution
Create a strategy

We want to find the area under the standard normal curve between z=1.30 and z=1.70. We will use the calculator's normal cumulative distribution function (normCdf).

The inputs required are the lower z-score bound, the upper z-score bound, the mean (\mu=0 for standard normal), and the standard deviation (\sigma=1 for standard normal).

Lower bound = 1.30

Upper bound = 1.70

Mean = 0

Standard Deviation = 1

Apply the idea

Using the calculator's normCdf function with the identified inputs:

normCdf(lower: 1.30, upper: 1.70, mean: 0, sd: 1)

The calculator returns a value of approximately 0.0522.

P(1.30 \lt Z \lt 1.70) \approx 0.0522

The area, representing the probability, is approximately 0.0522 when rounded to four decimal places.

Example 8

Using a calculator, find the probability P(Z \lt -0.78). Give your answer to four decimal places.

Worked Solution
Create a strategy

We want to find the area under the standard normal curve to the left of z = -0.78. We will use the calculator's normal cumulative distribution function (normCdf).

Inputs for normCdf(lower, upper, mean, sd):

Lower bound: A very large negative number representing negative infinity (e.g., -1E99).

Upper bound: -0.78

Mean = 0

Standard Deviation = 1

Apply the idea

Using the calculator's normCdf function:

normCdf(lower: -1E99, upper: -0.78, mean: 0, sd: 1)

(Note: A calculator might use -10^{99} or -9999 for the lower bound).

The calculator returns a value of approximately 0.2177.

P(Z \lt -0.78) \approx 0.2177

The probability is approximately 0.2177 when rounded to four decimal places.

Example 9

Using a calculator, find the probability P(Z \gt 1.05). Give your answer to four decimal places.

Worked Solution
Create a strategy

We want to find the area under the standard normal curve to the right of z = 1.05. We will use the calculator's normal cumulative distribution function (normCdf).

Inputs for normCdf(lower, upper, mean, sd):

Lower bound: 1.05

Upper bound: A very large number representing positive infinity (e.g., 1E99).

Mean = 0

Standard Deviation = 1

Apply the idea

Using the calculator's normCdf function:

normCdf(lower: 1.05, upper: 1E99, mean: 0, sd: 1)

(Note: A calculator might use 10^{99} or 9999 for the upper bound).

The calculator returns a value of approximately 0.1469.

P(Z \gt 1.05) \approx 0.1469

The probability is approximately 0.1469 when rounded to four decimal places.
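The calculator results in Examples 7 to 9 can be cross-checked with any statistical software. The sketch below uses Python 3.8 or later (`statistics.NormalDist`) as one such option; the variable names are made up.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

example_7 = Z.cdf(1.70) - Z.cdf(1.30)  # P(1.30 < Z < 1.70)
example_8 = Z.cdf(-0.78)               # P(Z < -0.78)
example_9 = 1 - Z.cdf(1.05)            # P(Z > 1.05)

print(round(example_7, 4), round(example_8, 4), round(example_9, 4))
```

Rounded to four decimal places, these match the calculator values of 0.0522, 0.2177, and 0.1469.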

Idea summary

Graphing calculators use functions like normCdf(lower bound, upper bound, mean, standard deviation) to find probabilities (areas) under the normal curve.

For the standard normal distribution, always use mean \mu=0 and standard deviation \sigma=1.

Set the lower and upper bounds appropriately for the desired probability:

  • P(Z \lt a): lower = -1E99, upper = a
  • P(Z \gt b): lower = b, upper = 1E99
  • P(a \lt Z \lt b): lower = a, upper = b

The calculator output directly gives the area under the curve, which represents the probability.

Outcomes

AFDA.DA.4f

Represent probability as the area under the curve of a standard normal distribution.

AFDA.DA.4g

Determine probabilities associated with areas under the standard normal curve, using technology or a table of Standard Normal Probabilities.
