Table of Contents
ToggleThe Pandas library provides a powerful tool for data manipulation and analysis. Among its many features, it offers the ability to read and write data to Excel files easily. Excel files are widely used to store and organize data in tabular format, making them an excellent choice for analysis. Excel files can have multiple sheets, each containing rows and columns of data. Excel files usually come with extensions like .xls or .xlsx.
Pandas provides easy-to-use methods for working with Excel data directly in Python. One such method is the read_excel() method, which reads Excel files and loads the data into a Pandas DataFrame.
.xlsx) files can be read using the openpyxl Python module..xls) files can be read using the xlrd module..xlsb) files can be read using the pyxlsb module.Note: Make sure the required packages are installed:
pip install openpyxl
pip install xlrd
The pandas.read_excel() method reads Excel files and loads the data into a Pandas DataFrame.
Example:
import pandas as pd
# Read an Excel file
df = pd.read_excel('data.xlsx')
# Print the DataFrame
print('Output DataFrame:')
print(df)
Output:
Output DataFrame:
Sr.no Name Gender Age
0 1 Braund female 38
1 2 Cumings male 22
2 3 Heikkin female 35
3 4 Futrelle female 26
To read a specific sheet, use the sheet_name parameter.
Example:
import pandas as pd
# Read a specific sheet
df = pd.read_excel('data.xlsx', sheet_name="Sheet_2")
# Print the DataFrame
print('Output DataFrame:')
print(df)
Output:
Output DataFrame:
Name Value
0 string1 1
1 string2 3
2 Comment 5
You can read multiple sheets by passing a list of names or indices to sheet_name.
Example:
import pandas as pd
# Read multiple sheets
df = pd.read_excel('data.xlsx', sheet_name=[0, 1])
# Print the DataFrame
print('Output Dict of DataFrames:')
print(df)
Output:
Output Dict of DataFrames:
{0: Sr.no Name Gender Age
0 1 Braund female 38
1 2 Cumings male 22
2 3 Heikkin female 35
3 4 Futrelle female 26,
1: Name Value
0 string1 1
1 string2 3
2 Comment 5}
MultiIndex is useful for hierarchical data. Use index_col and header parameters to control this.
Example:
import pandas as pd
# Create a MultiIndex object
index = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')])
# Create a DataFrame
data = [[1, 2], [3, 4], [5, 6], [7, 8]]
df = pd.DataFrame(data, index=index, columns=['X', 'Y'])
df.to_excel("multiindex_data.xlsx")
# Read MultiIndex rows and columns
df = pd.read_excel("multiindex_data.xlsx", index_col=[0, 1])
print('Output DataFrame from Excel File:')
print(df)
Output:
Output DataFrame from Excel File:
X Y
A one 1 2
two 3 4
B one 5 6
two 7 8
Content powered by Vista Academy.
