In statistics, averages provide a measure of the center of the data. They help summarize large sets of numbers and can offer insight into the general trend of the data. There are several types of averages, the most common of which are the mean, the median, and the mode.
The mean is often referred to as the average. To calculate the mean, you add all the values in a dataset and then divide by the number of values. It gives you a general idea of the “center” of the data.
For example, if you have the following dataset of ages: 40, 21, 55, 21, 48, 13, 72, you calculate the mean as:
(40 + 21 + 55 + 21 + 48 + 13 + 72) / 7 = 270 / 7 ≈ 38.57
The mean gives a rough idea of the data’s central value, but it can be influenced heavily by extreme values (outliers). For example, if you change the highest value, 72, to a much larger number (say 356), the mean will change significantly:
(40 + 21 + 55 + 21 + 48 + 13 + 356) / 7 = 554 / 7 ≈ 79.14
The mean has more than doubled, from about 38.57 to about 79.14, even though only one of the seven values changed. This demonstrates its sensitivity to outliers.
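The calculation above can be sketched in a few lines of Python. This is only an illustration: the helper name `mean` and the variable names are assumptions chosen for this example, not part of the original article.

```python
# Ages from the example above.
ages = [40, 21, 55, 21, 48, 13, 72]


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


original_mean = mean(ages)

# Swap the highest value (72) for the outlier 356, leaving the rest untouched.
highest = max(ages)
with_outlier = [356 if v == highest else v for v in ages]
outlier_mean = mean(with_outlier)

print(round(original_mean, 2))  # 38.57
print(round(outlier_mean, 2))   # 79.14
```

A single changed value moves the result from roughly 38.57 to roughly 79.14, which is the sensitivity described above.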
The median is the middle value of a dataset when the values are ordered from least to greatest. If there is an odd number of values, the median is the one in the middle. If there is an even number, the median is the average of the two middle values.
For example, the ordered dataset 13, 21, 21, 40, 48, 55, 72 has seven values, so the median is the fourth one: 40. Three values lie below it and three lie above it.
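As a minimal sketch of the rule for odd and even counts, the function below sorts the values and picks the middle one, or averages the two middle ones. The function name `median` and the four-value sample list are assumptions made for illustration, and the sketch assumes a non-empty list.

```python
def median(values):
    """Middle value of the sorted data; assumes `values` is non-empty."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


ages = [40, 21, 55, 21, 48, 13, 72]
print(median(ages))              # 40
print(median([13, 21, 40, 48]))  # 30.5
```

The second call shows the even case: the two middle values are 21 and 40, so the median is their average.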
Even if we change the highest value in the dataset to 356, the median doesn’t change because the middle value remains the same:
13, 21, 21, 40, 48, 55, 356
So, the median remains 40 even with the large outlier. This makes the median more resistant to extreme values than the mean.
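To see the contrast between the two measures side by side, the following sketch uses Python's standard `statistics` module on both versions of the dataset. The variable names are illustrative assumptions.

```python
import statistics

ages = [13, 21, 21, 40, 48, 55, 72]
with_outlier = [13, 21, 21, 40, 48, 55, 356]

# The median is the same for both lists.
median_before = statistics.median(ages)
median_after = statistics.median(with_outlier)

# The mean shifts noticeably once the outlier is present.
mean_before = statistics.mean(ages)
mean_after = statistics.mean(with_outlier)

print(median_before, median_after)                  # 40 40
print(round(mean_before, 2), round(mean_after, 2))  # 38.57 79.14
```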
The mode is the value that appears most frequently in the dataset. If several values tie for the highest frequency, the dataset has multiple modes.
Consider the dataset 40, 21, 55, 21, 48, 13, 72. The mode is 21 because it appears twice, whereas all other numbers appear only once.
If you change the dataset to 40, 21, 55, 21, 40, 13, 72, the modes are now 21 and 40, because each of them appears twice while the rest appear only once. A dataset with two modes like this one is called bimodal.
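One way to find every mode, sketched here under the assumption of a non-empty list, is to count how often each value occurs and keep the values that reach the highest count. The helper name `modes` is hypothetical and chosen for this example.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([40, 21, 55, 21, 48, 13, 72]))  # [21]
print(modes([40, 21, 55, 21, 40, 13, 72]))  # [21, 40]
```

Python 3.8 and later also provide `statistics.multimode`, which returns the same set of values in the order they first appear in the data rather than sorted.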
