A normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetric around its mean, with values near the mean occurring more frequently than values far from it.
The shape of the normal distribution is described by its mean (μ) and standard deviation (σ). The mean determines the center of the distribution, while the standard deviation controls the spread of the data.
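As a rough illustration of how μ and σ shape the curve, the sketch below evaluates the standard formula for the normal density. The helper name normal_pdf is hypothetical (it is not a NumPy function), and the parameter values are chosen only for illustration.

```python
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma.

    Hypothetical helper written for this illustration; not part of NumPy.
    """
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))

# The peak sits at the mean; doubling sigma halves the peak height
# and spreads the curve out.
peak_narrow = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)
print(peak_narrow, peak_wide)

# The curve is symmetric: points equally far from the mean have equal density.
left = normal_pdf(1.0, mu=3.0, sigma=2.0)
right = normal_pdf(5.0, mu=3.0, sigma=2.0)
print(left, right)
```

Changing mu only slides the curve along the x-axis, while changing sigma trades peak height for width so that the total area stays 1.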
NumPy provides the numpy.random.normal() function to generate samples from a normal distribution. It takes the mean (loc), the standard deviation (scale), and the number or shape of samples to return (size).
In this example, we generate 10 random samples from a normal distribution with a mean of 0 and a standard deviation of 1 −
import numpy as np
# Generate 10 random samples from a normal distribution with mean 0 and standard deviation 1
samples = np.random.normal(0, 1, 10)
print("Random samples from normal distribution:", samples)
Output:
Following is the output obtained −
Random samples from normal distribution: [ 1.45958315 -1.47376803 0.86885907 0.28076705 -2.16173553 -0.43457503 0.47706858 0.65894456 0.56166159 -0.71025105]
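The same call also accepts keyword arguments and a tuple for size. The sketch below uses an arbitrarily chosen mean of 5 and standard deviation of 2, and a fixed seed so the run is repeatable. It then checks that the sample statistics land near the requested values.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

# loc = mean, scale = standard deviation, size = shape of the returned array
samples = np.random.normal(loc=5.0, scale=2.0, size=(1000, 3))

print(samples.shape)   # (1000, 3)
print(samples.mean())  # close to 5, not exactly 5
print(samples.std())   # close to 2, not exactly 2
```

The sample mean and standard deviation only approximate loc and scale, and the approximation tightens as the number of samples grows.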
Visualizing normal distributions helps to understand their properties better. We can use libraries such as Matplotlib to create histograms that display the distribution of generated samples.
In the following example, we generate 1000 random samples from a normal distribution with mean 0 and standard deviation 1, plot them as a histogram, and overlay the theoretical probability density function −
import numpy as np
import matplotlib.pyplot as plt
# Generate 1000 random samples from a normal distribution with mean 0 and standard deviation 1
samples = np.random.normal(0, 1, 1000)
# Create a histogram to visualize the distribution
plt.hist(samples, bins=30, edgecolor='black', density=True)
# Plot the probability density function (PDF)
x = np.linspace(-4, 4, 1000)
pdf = 1/(np.sqrt(2 * np.pi)) * np.exp(-x**2 / 2)
plt.plot(x, pdf, 'r', linewidth=2)
plt.title('Normal Distribution')
plt.xlabel('Value')
# With density=True the bar heights are probability densities, not raw counts
plt.ylabel('Probability density')
plt.show()
Output:
The histogram shows that the samples follow a bell-shaped curve, which is characteristic of a normal distribution. The red line represents the theoretical probability density function (PDF) of the normal distribution −
[Figure: histogram of the generated samples with the theoretical PDF overlaid, titled "Normal Distribution"]
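To see why the histogram and the PDF curve share a scale, it can help to check the normalization numerically. The sketch below uses numpy.histogram with density=True as a stand-in for the plotting call, on the assumption that plt.hist normalizes the same way. The variable names are illustrative.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable
samples = np.random.normal(0, 1, 1000)

# density=True rescales bar heights so that sum(height * bin_width) equals 1
heights, edges = np.histogram(samples, bins=30, density=True)
area = np.sum(heights * np.diff(edges))

# Compare the bar heights with the standard normal PDF at the bin centers
centers = (edges[:-1] + edges[1:]) / 2
pdf = np.exp(-centers**2 / 2) / np.sqrt(2 * np.pi)
max_gap = np.max(np.abs(heights - pdf))

print(area)     # total area under the histogram
print(max_gap)  # largest difference between a bar and the curve
```

Without density=True the bars would be raw counts, and the PDF curve would look almost flat next to them.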
Normal distributions are used in various fields, including statistics, finance, engineering, and the natural and social sciences.
NumPy can also generate samples from a multivariate normal distribution with the numpy.random.multivariate_normal() function, which takes a mean vector and a covariance matrix.
In this example, we generate 1000 random samples from a two-dimensional normal distribution with zero means, unit variances, and a covariance of 0.5 between the two variables, then print the first five samples −
import numpy as np
# Define the mean vector and covariance matrix
mean = [0, 0]
cov = [[1, 0.5], [0.5, 1]]
# Generate 1000 random samples from a multivariate normal distribution
samples = np.random.multivariate_normal(mean, cov, 1000)
print("Random samples from multivariate normal distribution:", samples[:5])
Output:
The output obtained is as shown below −
Random samples from multivariate normal distribution:
[[-0.13543463 1.3100422 ]
[-1.46447528 -0.42485422]
[ 0.31941286 -0.33503219]
[ 0.86726151 1.43161159]
[ 0.12539345 -1.72856329]]
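One way to confirm that the samples reflect the requested covariance matrix is to estimate it back from the data. The sketch below is illustrative only: the seed and the sample count of 5000 are arbitrary choices, not values from the example above.

```python
import numpy as np

np.random.seed(1)  # fixed seed so the run is repeatable

mean = [0, 0]
cov = [[1, 0.5], [0.5, 1]]
samples = np.random.multivariate_normal(mean, cov, 5000)

# rowvar=False: each column is a variable and each row is one observation
sample_cov = np.cov(samples, rowvar=False)

print(samples.shape)  # (5000, 2)
print(sample_cov)     # close to cov, not exactly equal
```

The off-diagonal entries of sample_cov should sit near 0.5, reflecting the positive correlation between the two variables.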
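Normal distributions have several key properties: the curve is symmetric about the mean; the mean, median, and mode coincide; and about 68%, 95%, and 99.7% of values lie within one, two, and three standard deviations of the mean, respectively.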
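The 68-95-99.7 (empirical) rule, a standard property of the normal distribution, can be checked numerically. The following is a minimal sketch with an arbitrary seed and sample size. The dictionary name within is illustrative.

```python
import numpy as np

np.random.seed(2)  # fixed seed so the run is repeatable
samples = np.random.normal(0, 1, 100_000)

# Fraction of samples within k standard deviations of the mean (mean 0, std 1 here)
within = {}
for k in (1, 2, 3):
    within[k] = np.mean(np.abs(samples) <= k)
    print(k, within[k])
```

The printed fractions should come out near 0.68, 0.95, and 0.997, with small deviations due to sampling noise.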
The standard normal distribution is a special case of the normal distribution with a mean of 0 and a standard deviation of 1. It is often used as a reference distribution. You can generate samples from a standard normal distribution using the numpy.random.standard_normal() function.
In this example, we generate 10 random samples from a standard normal distribution −
import numpy as np
# Generate 10 random samples from a standard normal distribution
samples = np.random.standard_normal(10)
print("Random samples from standard normal distribution:", samples)
Output:
The result produced is as follows −
Random samples from standard normal distribution: [ 0.41271088 -0.06102183 -0.48159376 0.63379932 -0.41831826 -0.67104197 0.2019988 0.52954154 -0.39241029 -0.19626287]
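Standard normal samples can serve as a reference because any normal distribution is a shifted and scaled copy of it. The sketch below rescales standard normal draws to a target mean and standard deviation. The values 10 and 3 are arbitrary illustrative choices.

```python
import numpy as np

np.random.seed(3)  # fixed seed so the run is repeatable

z = np.random.standard_normal(10_000)  # mean 0, standard deviation 1

mu, sigma = 10.0, 3.0                  # illustrative target parameters
x = mu + sigma * z                     # samples with mean ~10 and std ~3

print(x.mean(), x.std())
```

Going the other way, (x - mu) / sigma maps the samples back to the standard normal scale, which is the usual z-score transformation.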
To make results reproducible, set a seed before generating samples. With a fixed seed, NumPy's random number generator produces the same sequence of values every time the code runs, as shown in the example below −
import numpy as np
# Set the seed for reproducibility
np.random.seed(42)
# Generate 10 random samples from a normal distribution with mean 0 and standard deviation 1
samples = np.random.normal(0, 1, 10)
print("Random samples with seed 42:", samples)
Output:
We get the output as shown below −
Random samples with seed 42: [ 0.49671415 -0.1382643 0.64768854 1.52302986 -0.23415337 -0.23413696 1.57921282 0.76743473 -0.46947439 0.54256004]
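np.random.seed() seeds NumPy's global legacy generator. Recent NumPy documentation suggests creating a separate Generator with numpy.random.default_rng() for new code instead. The sketch below assumes NumPy 1.17 or later and shows that two generators built from the same seed yield identical samples. The variable names are illustrative.

```python
import numpy as np

# Two independent generators created from the same seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

a = rng_a.normal(loc=0.0, scale=1.0, size=10)
b = rng_b.normal(loc=0.0, scale=1.0, size=10)

# Same seed, same algorithm, so the two arrays match element for element
print(np.array_equal(a, b))
```

The values drawn this way are not expected to match those from np.random.seed(42), since the two approaches use different underlying algorithms.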
