The Most Comprehensive NumPy Tutorial for Data Science Beginners

NumPy is a key Python library that every data scientist should be familiar with. This thorough NumPy tutorial walks you through the basics of NumPy, from basic mathematical operations to how NumPy works with image data.

Before we start, it’s worth noting that people who are new to NLP (Natural Language Processing) usually begin with standard Python lists.

They eventually switch to NumPy, though, because larger experiments with a lot of data are not always practical with standard Python lists. NumPy comes in handy when Python lists use too much memory.

Now you need to import the library:


import numpy as np

np is the de facto abbreviation for NumPy used by the data science community.

NumPy Array

A NumPy array (an ndarray) is a grid of values of the same data type and can have any number of dimensions. A two-dimensional array, often called a matrix, has rows and columns. A NumPy array with three rows and four columns is shown below.

[Figure: a NumPy array]

NumPy Arrays vs. Python Lists — What Is the Distinction?

If you know Python, you might be wondering why we need NumPy arrays when we already have Python lists. After all, a Python list works like an array that can store items of many different types. This is an excellent question, and the answer lies in the way Python keeps objects in memory.

A Python object is essentially a reference to a memory region that holds not only the value but also extra information about the object, such as its type and reference count. This extra information is what allows Python to be dynamically typed, but it comes at a cost that becomes evident when storing a large collection of objects, such as in an array.

Python lists are essentially arrays of pointers, each pointing to a location that holds one element’s information. This adds significant memory and computation overhead. When all of the items in the list are of the same type, most of this information is redundant!

To get around this, NumPy arrays hold only homogeneous elements, that is, items of the same data type. This makes storing and manipulating the array more efficient. When the array has a large number of elements, such as thousands or millions, the difference becomes obvious. You can also perform element-wise operations directly on NumPy arrays, which Python lists do not support without an explicit loop!
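
A small sketch of the difference described above. The variable names are illustrative, and the exact byte counts depend on the platform and Python version, so they are printed rather than stated.

```python
import sys
import numpy as np

values = list(range(1000))   # plain Python list of ints
arr = np.array(values)       # NumPy array with a single dtype

# Element-wise arithmetic: one expression on the array,
# an explicit loop (here a list comprehension) on the list.
doubled_arr = arr * 2
doubled_list = [v * 2 for v in values]

# Note: `values * 2` repeats the list instead of doubling each element.
repeated = values * 2

# Rough memory comparison: the list stores pointers plus separate int objects,
# while the array stores the raw values in one contiguous buffer.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = arr.nbytes

print(doubled_arr[:5])   # [0 2 4 6 8]
print(len(repeated))     # 2000
print(list_bytes, array_bytes)
```

On typical builds the array buffer is several times smaller than the list plus its integer objects.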

Creating a NumPy Array

Given the complexity of the problems they solve, NumPy arrays are remarkably simple to create. The np.array() function creates a basic ndarray. All you have to do is pass the array’s values as a list. You can also set the element type with the dtype argument; for example, passing the values 1 to 4 with dtype=np.float32 gives:

array([1., 2., 3., 4.], dtype=float32)
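
A minimal sketch of calls that create such arrays. The explicit dtype=np.float32 argument is an assumption inferred from the output above.

```python
import numpy as np

# Simplest case: pass the values as a Python list.
ints = np.array([1, 2, 3, 4])

# Assumed variant: request 32-bit floats explicitly via dtype.
floats = np.array([1, 2, 3, 4], dtype=np.float32)

print(repr(floats))  # array([1., 2., 3., 4.], dtype=float32)
```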

Since NumPy arrays can contain only homogeneous datatypes, values will be upcast if the types do not match:



array([1., 2., 3., 4.])

Here, NumPy has upcast integer values to float values.
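
As a sketch, mixing one float into a list of integers is enough to trigger the upcast (the specific values are illustrative):

```python
import numpy as np

mixed = np.array([1, 2.0, 3, 4])  # one float among integers

print(repr(mixed))   # array([1., 2., 3., 4.])
print(mixed.dtype)   # float64
```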

NumPy arrays can be multi-dimensional too.


array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

Here, we created a 2-dimensional array of values.
Note: A matrix is just a rectangular array of numbers with shape N x M where N is the number of rows and M is the number of columns in the matrix. The one you just saw above is a 2 x 4 matrix.
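
A nested list, with one inner list per row, is one way to build such an array. This sketch also checks its shape:

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8]])

print(matrix.shape)  # (2, 4): 2 rows, 4 columns
print(matrix.ndim)   # 2
```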

(i) To create a new array with the same shape and data type as an existing one, use np.empty_like():

# Creating an ndarray from a nested list
c = np.array([[1., 2.], [1., 2.]])

# Creating a new array with the shape and dtype of c; its values are
# uninitialized, not zeros (use np.zeros_like(c) if you need zeros)
d = np.empty_like(c)

(ii) You can convert a Python list to a NumPy array with np.asarray():

import numpy as np

# Avoid naming the variable `list`, which would shadow the built-in type
values = [1, 2, 3]
c = np.asarray(values)

(iii) You can also create an ndarray of a required size and fill it with random values, ones, or zeroes.

# Array items as ndarray
c = np.array([1, 2, 3])

# A 2×2 2d array; the shape is given in the format (rows, columns)
shape = (2, 2)

# Uninitialized values (arbitrary memory contents, not true random numbers)
c = np.empty(shape)

# Random values drawn uniformly from [0, 1)
r = np.random.random(shape)

d = np.ones(shape)
e = np.zeros(shape)

Array of zeros

The np.zeros() function in NumPy allows you to build an array of all zeros. All you have to do is supply the required array’s shape:

array([0., 0., 0., 0., 0.])

The one above is a 1-D array while the one below is a 2-D array:

array([[0., 0., 0.],
       [0., 0., 0.]])
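
Calls along these lines would give the two outputs above, assuming an integer length for the 1-D case and a (rows, columns) tuple for the 2-D case:

```python
import numpy as np

zeros_1d = np.zeros(5)        # shape given as a single integer
zeros_2d = np.zeros((2, 3))   # shape given as (rows, columns)

print(repr(zeros_1d))  # array([0., 0., 0., 0., 0.])
print(zeros_2d.shape)  # (2, 3)
```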

Array of ones
You could also use the np.ones() function to construct an array of all 1s:
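
A brief sketch. The dtype=np.int32 argument is an optional assumption added here to show that the element type can be chosen:

```python
import numpy as np

ones_float = np.ones(5)                 # default dtype is float64
ones_int = np.ones(5, dtype=np.int32)   # integer ones

print(repr(ones_float))  # array([1., 1., 1., 1., 1.])
print(ones_int.dtype)    # int32
```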

How to Merge Arrays with NumPy

Rather than merging arrays repeatedly, it is usually better to create an array of the final size and fill it, because every merge allocates a new, larger array and copies the contents into it.
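
To illustrate this advice, the following sketch fills a preallocated array instead of growing one through repeated merges. The chunk data and sizes are made up for the example.

```python
import numpy as np

# Hypothetical pieces of data that arrive separately
chunks = [np.arange(3), np.arange(3, 6), np.arange(6, 9)]

# Preallocate once, then copy each chunk into its slice.
result = np.empty(9, dtype=chunks[0].dtype)
start = 0
for chunk in chunks:
    result[start:start + len(chunk)] = chunk
    start += len(chunk)

print(result)  # [0 1 2 3 4 5 6 7 8]
```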

If you do need to combine arrays, use the routines below.


np.concatenate() with 1d arrays:

a = np.array([1, 2, 3])
b = np.array([5, 6])
print(np.concatenate([a, b, b]))
# >> [1 2 3 5 6 5 6]

np.concatenate() with 2d arrays:

a2 = np.array([[1, 2], [3, 4]])
# The second operand must also be 2d, so the row [5, 6] is written as a 1×2 array
b2 = np.array([[5, 6]])

# axis=0 – concatenate along rows
print(np.concatenate((a2, b2), axis=0))
# >> [[1 2]
#     [3 4]
#     [5 6]]

# axis=1 – concatenate along columns, but first b2 needs to be transposed
# into a column; b2.T is
# >> [[5]
#     [6]]
print(np.concatenate((a2, b2.T), axis=1))
# >> [[1 2 5]
#     [3 4 6]]

np.append() with 1d arrays:

# Without an axis argument, both inputs are flattened first
print(np.append(a, a2))
# >> [1 2 3 1 2 3 4]

print(np.append(a, a))
# >> [1 2 3 1 2 3]

np.append() with 2d arrays:

print(np.append(a2, b2, axis=0))
# >> [[1 2]
#     [3 4]
#     [5 6]]

print(np.append(a2, b2.T, axis=1))
# >> [[1 2 5]
#     [3 4 6]]

Python NumPy Operations


You can find out how many dimensions an array has, that is, whether it is one-dimensional or multi-dimensional. The ndim attribute in the code below reports the number of dimensions.

import numpy as np

a = np.array([(1,2,3),(4,5,6)])
print(a.ndim)

Output – 2

The output is 2, so this is a two-dimensional array.



You can also find the size in bytes of each element. The code below defines a one-dimensional array and reads the size of each element from the itemsize attribute.

import numpy as np

a = np.array([1,2,3])
print(a.itemsize)

Output – 4

In the NumPy array above, each element takes up 4 bytes, which is what you get when the integers are stored as 32-bit values. On platforms where NumPy defaults to 64-bit integers, the output is 8.


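hstack (stack horizontally) and vstack (stack vertically)

import numpy as np

# Arrays used in the examples below
a = np.array([1, 2, 3])
b = np.array([5, 6])
a2 = np.array([[1, 2], [3, 4]])

1d arrays:

print(np.hstack([a, b]))
# >> [1 2 3 5 6]

print(np.vstack([a, a]))
# >> [[1 2 3]
#     [1 2 3]]

2d arrays:

print(np.hstack([a2, a2]))  # arrays must have the same number of rows
# >> [[1 2 1 2]
#     [3 4 3 4]]

print(np.vstack([a2, b]))  # the 1d array b is treated as a single row
# >> [[1 2]
#     [3 4]
#     [5 6]]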


linspace returns numbers that are evenly spaced over a given interval. For example:

import numpy as np

# 10 evenly spaced values from 1 to 3, both endpoints included
a = np.linspace(1, 3, 10)
print(a)

Output – [ 1. 1.22222222 1.44444444 1.66666667 1.88888889 2.11111111 2.33333333 2.55555556 2.77777778 3. ]

Square Root and Standard Deviation in NumPy

The np.sqrt() function returns the square root of each element of the array. You can also compute the standard deviation of the elements with np.std(). Let’s see what happens.


import numpy as np
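
A minimal sketch of both functions. The input array is an assumption chosen for illustration.

```python
import numpy as np

a = np.array([1, 4, 9, 16])

roots = np.sqrt(a)    # element-wise square root
spread = np.std(a)    # population standard deviation of all elements

print(roots)   # [1. 2. 3. 4.]
print(spread)
```

Note that np.std() divides by the number of elements by default; pass ddof=1 for the sample standard deviation.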







Max/Min in NumPy

The min() and max() methods return the smallest and largest values of a NumPy array.

import numpy as np

a = np.array([1,2,3])
print(a.min())  # 1
print(a.max())  # 3








NumPy Special Functions

NumPy also provides mathematical functions such as sine, cosine, tangent, and logarithm. We can plot the sine, cosine, or tangent function by importing Matplotlib. Here’s what it looks like for the sine function:

import numpy as np
import matplotlib.pyplot as plt

# x values from 0 up to (but not including) 3π in steps of 0.1
x = np.arange(0, 3*np.pi, 0.1)
y = np.sin(x)
plt.plot(x, y)
plt.show()







Final thought

NumPy is an excellent tool for aspiring data scientists who want to perform more complex operations on larger volumes of data than basic Python lists can handle comfortably.

NumPy supports many more operations that are not covered in this tutorial. You can progress to more sophisticated operations once you’ve mastered the NumPy basics.


