In the Python Pandas library, a Series is one of the primary data structures and offers a convenient way to handle and manipulate one-dimensional data. It is similar to a single column in a spreadsheet or a database table. In this tutorial by Vista Academy, you will learn what a Pandas Series is and how to use it effectively for data manipulation and analysis.
A Series in Pandas is a one-dimensional labeled array capable of holding data of any type, including integers, floats, strings, and Python objects. It consists of two main components: the data values and the index labels that identify them.
A Series is similar to a one-dimensional ndarray (NumPy array) but with labels, which together form the index. These labels can be used to access the data within the Series. By default, the index values are integers running from 0 to the length of the Series minus one, but you can also set the index labels manually.
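As a minimal sketch of the difference between the default and a manually set index, the snippet below builds the same data twice. The values and the labels `"x"`, `"y"`, `"z"` are made up for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([10, 20, 30])

# Without an explicit index, labels default to 0, 1, ..., len - 1.
default_labels = pd.Series(values)

# The same data with hand-picked labels (illustrative names).
custom_labels = pd.Series(values, index=["x", "y", "z"])

print(list(default_labels.index))   # [0, 1, 2]
print(custom_labels["y"])           # 20, looked up by label
print(custom_labels.iloc[1])        # 20, looked up by position
print(custom_labels.to_numpy())     # the underlying values as an ndarray
```

Label lookup and positional lookup (`.iloc`) return the same element here because `"y"` is the second label.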
A pandas Series can be created using the following constructor:
class pandas.Series(data, index, dtype, name, copy)
The parameters of the constructor are as follows:
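| No. | Parameter | Description |
|---|---|---|
| 1 | data | The input data. It can take various forms, such as an ndarray, a list, a dictionary, or a scalar constant. |
| 2 | index | Index labels. Values must be hashable and have the same length as data; duplicate labels are allowed. Defaults to a RangeIndex (0, 1, ..., n-1) if no index is passed. |
| 3 | dtype | Data type of the Series. If None, the data type is inferred from the data. |
| 4 | name | The name to give to the Series. |
| 5 | copy | Whether to copy the input data. Default is False. |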
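The following sketch shows how the constructor parameters might be combined in practice. The array contents, the labels, and the name `"price"` are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

source = np.array([1, 2, 3])

# dtype converts the integers to floats; name labels the Series itself.
prices = pd.Series(source, index=["a", "b", "c"], dtype="float64", name="price")
print(prices.dtype)  # float64
print(prices.name)   # price

# copy=True gives the Series its own data, so later changes to the
# source array do not show up in the Series.
independent = pd.Series(source, copy=True)
source[0] = 99
print(independent.iloc[0])  # 1
```

Requesting a copy is a reasonable choice when the source array will keep changing after the Series has been built.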
A Series object can be created from various inputs, such as an ndarray, a dictionary, or a scalar value. It can also be created with no data at all.
If no data is provided to the Series constructor pandas.Series(), it will create a basic empty series object.
# Import the pandas library, aliased as pd
import pandas as pd
s = pd.Series()
# Display the result
print('Resultant Empty Series:\n', s)
Output:
Resultant Empty Series:
 Series([], dtype: object)
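An empty Series can also be given an explicit dtype, which documents the intended element type and avoids the warning that some older pandas releases emit when the dtype is omitted. The sketch below assumes `float64` is the type you eventually want to store.

```python
import pandas as pd

# Stating the dtype makes the intended element type explicit.
empty_floats = pd.Series(dtype="float64")

print(empty_floats.empty)   # True
print(len(empty_floats))    # 0
print(empty_floats.dtype)   # float64
```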
If an ndarray is provided as input data to the Series constructor, it creates a Series with that data. You may also specify a custom index of the same length as the data. If no index is specified, Pandas will automatically assign one.
# Import the pandas library, aliased as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
Output:
0    a
1    b
2    c
3    d
dtype: object
Custom Index Example:
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data, index=[100,101,102,103])
print("Output:\n", s)
Output:
Output:
 100    a
101    b
102    c
103    d
dtype: object
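With a custom integer index, it helps to be explicit about whether you are selecting by label or by position. The sketch below reuses the data from the example above and also shows what happens when the index length does not match the data; the variable `mismatch_error` is only there to capture the message.

```python
import numpy as np
import pandas as pd

data = np.array(["a", "b", "c", "d"])
s = pd.Series(data, index=[100, 101, 102, 103])

print(s.loc[101])   # b, selected by label
print(s.iloc[1])    # b, selected by position

# An index whose length differs from the data is rejected.
mismatch_error = None
try:
    pd.Series(data, index=[100, 101])
except ValueError as exc:
    mismatch_error = str(exc)
    print("ValueError:", mismatch_error)
```

Using `.loc` and `.iloc` rather than plain square brackets keeps the intent unambiguous when the labels are themselves integers.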
A dictionary can be passed to the pd.Series() constructor to create a Series. If no index is specified, the dictionary keys become the index, in the order in which they were inserted (older pandas releases sorted them). If an index is provided, the values are pulled from the dictionary in the order of that index, and any label without a matching key is filled with NaN.
Example 1: Without specifying index
import pandas as pd
data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data)
print(s)
Output:
a    0.0
b    1.0
c    2.0
dtype: float64
Example 2: With custom index
import pandas as pd
data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data, index=['b', 'c', 'x', 'a'])
print(s)
Output:
b    1.0
c    2.0
x    NaN
a    0.0
dtype: float64
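Missing labels have two practical consequences worth seeing in code: integer data is converted to float so that NaN can be stored, and the gaps usually need to be detected or filled. The sketch below uses a made-up dictionary `counts` to illustrate both.

```python
import pandas as pd

counts = {"a": 1, "b": 2}  # integer values

# "x" has no entry in the dictionary, so it becomes NaN.
s = pd.Series(counts, index=["a", "x", "b"])

print(s.dtype)                    # float64: NaN forces ints to float
print(s.isna().tolist())          # [False, True, False]
print(s.dropna().index.tolist())  # ['a', 'b']
print(s.fillna(0).tolist())       # [1.0, 0.0, 2.0]
```

Whether to drop or fill the missing entries depends on the analysis; both calls return a new Series and leave `s` unchanged.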
If you provide a single scalar value and a list of indices, Pandas will broadcast that scalar to all specified indices.
import pandas as pd
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)
Output:
0    5
1    5
2    5
3    5
dtype: int64
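The length of the resulting Series comes entirely from the index, which the following sketch makes visible. The weekday labels are illustrative, and the second part shows what happens when no index is passed at all.

```python
import pandas as pd

# The index decides how many times the scalar is repeated.
weekday_hours = pd.Series(0.0, index=["mon", "tue", "wed"])
print(len(weekday_hours))         # 3
print(weekday_hours.tolist())     # [0.0, 0.0, 0.0]

# Without an index, the scalar yields a single-element Series.
single = pd.Series(5)
print(len(single))                # 1
print(single.index.tolist())      # [0]
```

This pattern is a convenient way to initialise a Series with a default value before filling in real data.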
