Pandas DataFrame slicing is the process of extracting specific rows, columns, or subsets of data based on position or label. It is a common operation when working with large datasets, and it works much like slicing Python lists and NumPy ndarrays. Slicing uses the [] operator and the indexing attributes .iloc[] and .loc[] to retrieve data.
Pandas DataFrame slicing is performed using two main attributes: .iloc[] for position-based slicing and .loc[] for label-based slicing.
The Pandas DataFrame.iloc[] attribute slices a DataFrame by the integer position of its rows and columns (integer-based indexing). As with Python lists, the start position is included and the stop position is excluded.
import pandas as pd
# Create a Pandas DataFrame
df = pd.DataFrame([["a","b"], ["c","d"], ["e","f"], ["g","h"]], columns=['col1', 'col2'])
# Display the DataFrame
print("Input DataFrame:")
print(df)
# Slice rows based on position
result = df.iloc[1:3, :]
print("Output:")
print(result)
Input DataFrame:
  col1 col2
0    a    b
1    c    d
2    e    f
3    g    h
Output:
  col1 col2
1    c    d
2    e    f
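Position-based slices accept the same start, stop, and step forms as Python lists. The following is a minimal sketch of a few common patterns, reusing the DataFrame from the example above; the variable names are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame([["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]],
                  columns=["col1", "col2"])

# The stop position is excluded, so this returns positions 0 and 1
first_two = df.iloc[0:2]

# Negative positions count from the end, so this returns the last two rows
last_two = df.iloc[-2:]

# A step of 2 returns every other row, starting at position 0
every_other = df.iloc[::2]

print(first_two)
print(last_two)
print(every_other)
```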
The Pandas DataFrame.loc[] attribute slices a DataFrame by the labels of its rows and columns. Unlike .iloc[], a label slice includes both the start and the stop label.
import pandas as pd
# Create a DataFrame with labeled indices
df = pd.DataFrame([["a","b"], ["c","d"], ["e","f"], ["g","h"]], columns=['col1', 'col2'], index=['r1', 'r2', 'r3', 'r4'])
# Display the DataFrame
print("Original DataFrame:")
print(df)
# Slice rows and columns by label
result = df.loc['r1':'r3', 'col1']
print("Output:")
print(result)
Original DataFrame:
   col1 col2
r1    a    b
r2    c    d
r3    e    f
r4    g    h
Output:
r1    a
r2    c
r3    e
Name: col1, dtype: object
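A frequent source of off-by-one mistakes is that .loc[] includes the stop label while .iloc[] excludes the stop position. The sketch below compares the two on the same labeled DataFrame; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]],
                  columns=["col1", "col2"],
                  index=["r1", "r2", "r3", "r4"])

# Label slice: both 'r1' and 'r3' are included (three rows)
by_label = df.loc["r1":"r3"]

# Position slice: position 3 is excluded, so rows at 0, 1, 2 (three rows)
by_position = df.iloc[0:3]

# Using 0:2 would return only two rows, r1 and r2
shorter = df.iloc[0:2]

print(by_label.equals(by_position))
print(len(by_label), len(shorter))
```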
The .iloc[] attribute can also slice columns by position.
import pandas as pd
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)
# Slice a single column
col_A = df.iloc[:, 0]
print("Slicing a single column A using iloc[]:")
print(col_A)
# Slice multiple columns
cols_AB = df.iloc[:, 0:2]
print("Slicing multiple columns A and B using iloc[]:")
print(cols_AB)
Slicing a single column A using iloc[]:
0    1
1    2
2    3
Name: A, dtype: int64
Slicing multiple columns A and B using iloc[]:
   A  B
0  1  4
1  2  5
2  3  6
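Row and column slices can be combined in one .iloc[] call, and a list of positions can stand in for a slice when the columns are not adjacent. This is a small sketch using the same data as above, with illustrative variable names.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# First two rows, columns at positions 1 and 2 (B and C)
block = df.iloc[0:2, 1:3]

# All rows, non-adjacent columns at positions 0 and 2 (A and C)
cols_AC = df.iloc[:, [0, 2]]

print(block)
print(cols_AC)
```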
Columns can be sliced by label in the same way using the .loc[] attribute.
import pandas as pd
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)
# Slice a single column by label
col_A = df.loc[:, 'A']
print("Slicing a single column A using loc[]:")
print(col_A)
# Slice multiple columns by label
cols_AB = df.loc[:, 'A':'B']
print("Slicing Multiple columns A and B using loc[]:")
print(cols_AB)
Slicing a single column A using loc[]:
0    1
1    2
2    3
Name: A, dtype: int64
Slicing Multiple columns A and B using loc[]:
   A  B
0  1  4
1  2  5
2  3  6
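The plain [] operator mentioned in the introduction covers the simplest cases: a column label or list of labels selects columns, while a slice selects rows. The sketch below shows this alongside the equivalent .loc[] call, using illustrative variable names.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# A list of labels selects non-adjacent columns
cols_loc = df.loc[:, ['A', 'C']]

# The [] operator with a list of labels gives the same result
cols_brackets = df[['A', 'C']]

# The [] operator with a slice selects rows (here the first two)
rows_brackets = df[0:2]

print(cols_loc.equals(cols_brackets))
print(rows_brackets)
```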
You can modify part of a DataFrame by assigning new values to a slice selected with .iloc[] or .loc[]. The assignment changes the original DataFrame in place.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame([["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]],
                  columns=['col1', 'col2'])
# Display the Original DataFrame
print("Original DataFrame:", df, sep='\n')
# Modify a subset of the DataFrame using iloc
df.iloc[1:3, 0] = ['x', 'y']
# Display the modified DataFrame
print('Modified DataFrame:',df, sep='\n')
Original DataFrame:
  col1 col2
0    a    b
1    c    d
2    e    f
3    g    h
Modified DataFrame:
  col1 col2
0    a    b
1    x    d
2    y    f
3    g    h
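Assignment works the same way through .loc[]. When the goal is to change a slice without touching the original, one common approach is to take an explicit copy first. The following sketch assumes the same DataFrame as above; the names subset and the replacement values are illustrative.

```python
import pandas as pd

df = pd.DataFrame([["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]],
                  columns=['col1', 'col2'])

# Label-based assignment: with the default index, labels 1 and 2
# are both included in the slice
df.loc[1:2, 'col2'] = ['X', 'Y']

# Work on an independent copy so the original is left unchanged
subset = df.iloc[1:3].copy()
subset['col1'] = 'z'

print(df)
print(subset)
```

Here df keeps its original col1 values, because the change to col1 was made on the copy.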
