Pandas is an open-source Python library designed for working quickly and intuitively with relational or labelled data. It offers a range of data structures and operations for working with numerical data and time series. Pandas is built on top of the NumPy library, which helps make it fast, and its high-level operations make its users productive.
The first step in using pandas is to check whether it is already installed in your Python environment. If it is not, install it on your machine with the pip command. On Windows, type cmd in the search box to open the Command Prompt, use the cd command to move to the folder where pip is installed, and then run pip install pandas.
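One way to check for pandas from within Python itself, rather than browsing folders, is to ask the import system whether it can find the package. The sketch below uses the standard-library function importlib.util.find_spec; the helper name is_installed is hypothetical and chosen only for illustration.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    # find_spec returns None when the package cannot be found
    return importlib.util.find_spec(package_name) is not None


if is_installed("pandas"):
    print("pandas is installed")
else:
    print("pandas is missing; run: pip install pandas")
```

The check does not import pandas, so it is quick and has no side effects.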
After installing pandas on your computer, you must import the library. Typically, this module is imported as:
import pandas as pd
Here, pd is used as a shorthand alias for pandas. Importing the library under an alias is not required, but it is helpful because you write less code each time a method or property is called.
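To illustrate that the alias is only a convenience, the short sketch below imports the library both ways and builds the same Series with each name. The variable names are illustrative only.

```python
import pandas          # full module name
import pandas as pd    # conventional short alias

# Both names refer to the same module object
same_module = pd is pandas

# The alias only shortens what you type; the result is identical
with_alias = pd.Series([1, 2, 3])
without_alias = pandas.Series([1, 2, 3])

print(same_module)
print(with_alias.equals(without_alias))
```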
In general, Pandas offers two data structures for data manipulation, namely:
Series
DataFrame
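As a rough illustration of how the two structures differ, the sketch below builds a labelled Series and a small DataFrame. The fruit names and numbers are made-up sample data.

```python
import pandas as pd

# A Series is one-dimensional: each value is paired with an index label
prices = pd.Series([1.5, 2.0, 0.75], index=["apple", "banana", "cherry"])
banana_price = prices["banana"]  # look up a value by its label

# A DataFrame is two-dimensional: rows share an index, columns have names
table = pd.DataFrame(
    {"price": [1.5, 2.0, 0.75], "stock": [10, 0, 25]},
    index=["apple", "banana", "cherry"],
)
cherry_stock = table.loc["cherry", "stock"]  # row label, then column name

print(banana_price)
print(cherry_stock)
print(table.shape)
```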
pip install pandas
import pandas as pd
Series: A one-dimensional labelled array, similar to a single column of a spreadsheet
# Creating a Series from a list
data = [1, 2, 3, 4, 5]
series = pd.Series(data)
print(series)
DataFrame: A two-dimensional table with rows and columns, similar to a spreadsheet or a SQL table
# Creating a DataFrame from a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35]}
df = pd.DataFrame(data)
print(df)
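As a small follow-up, the sketch below rebuilds the same table and inspects its rows and columns. It is only one possible way to look at the structure; the variable names are illustrative.

```python
import pandas as pd

data = {"Name": ["Alice", "Bob", "Charlie"],
        "Age": [25, 30, 35]}
df = pd.DataFrame(data)

n_rows, n_cols = df.shape          # (number of rows, number of columns)
column_names = list(df.columns)    # column labels come from the dict keys
ages = df["Age"]                   # selecting one column gives a Series
second_row = df.iloc[1]            # selecting one row by position

print(n_rows, n_cols)
print(column_names)
print(ages.mean())
print(second_row["Name"])
```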
Loading data from a CSV file:
# Reading a CSV file into a DataFrame
df = pd.read_csv('data.csv')
Loading data from an Excel file:
# Reading an Excel file into a DataFrame
df = pd.read_excel('data.xlsx')
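The read calls above assume that files named data.csv and data.xlsx exist in the working directory. To try pd.read_csv without any file on disk, a minimal sketch can read from an in-memory text buffer instead. The CSV contents here are made-up sample data.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the contents of a file such as data.csv
csv_text = """Name,Age
Alice,25
Bob,30
Charlie,35
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.dtypes)
```

Note that pd.read_excel usually needs an extra Excel engine such as openpyxl to be installed alongside pandas.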
Viewing the first rows of the DataFrame:
# Display the first 5 rows
print(df.head())
Getting information about the DataFrame:
# Display basic information about the DataFrame
# (info() prints its report itself, so no print() is needed)
df.info()
Descriptive statistics of the data:
# Generate summary statistics
print(df.describe())
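The three inspection calls can be tried together on a small table. The sketch below uses made-up sample data with seven rows, so that head() visibly returns fewer rows than the table holds.

```python
import pandas as pd

# Made-up sample data: seven rows so that head() visibly truncates
df = pd.DataFrame({
    "Name": ["Ann", "Ben", "Cy", "Dee", "Eli", "Fay", "Gus"],
    "Age": [20, 25, 30, 35, 40, 45, 50],
})

first_rows = df.head()     # first 5 rows by default
first_two = df.head(2)     # or pass the number of rows you want

df.info()                  # prints its report directly and returns None

summary = df.describe()    # statistics for the numeric columns only
print(summary)
print(summary.loc["mean", "Age"])
```

By default, describe() skips text columns such as Name and reports the count, mean, standard deviation, minimum, quartiles and maximum of each numeric column.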
