Python Pandas – DataFrame

Creating a DataFrame in Pandas

A DataFrame in Python’s pandas library is a two-dimensional labeled data structure used for data manipulation and analysis. Its columns can hold different data types such as integers, floats, and strings. Each column has a label, and each row has an index label, which makes it possible to access specific rows and columns by name. Labels are usually unique, although pandas does not enforce this.

DataFrames are widely used in data analysis and machine learning workflows, where they let users manipulate and analyze large data sets. They support operations such as filtering, sorting, merging, grouping, and transforming data.
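As a rough illustration of the operations listed above, the sketch below applies each one to a small data set. The names, subjects, marks, and teachers are invented for this example and are not part of the original tutorial.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
scores = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", "John"],
    "subject": ["Math", "Math", "Physics", "Physics"],
    "marks": [82, 67, 91, 74],
})
teachers = pd.DataFrame({
    "subject": ["Math", "Physics"],
    "teacher": ["Mr. Rao", "Ms. Iyer"],
})

# Filtering: keep rows where marks exceed 70
high = scores[scores["marks"] > 70]

# Sorting: order rows by marks, highest first
ranked = scores.sort_values("marks", ascending=False)

# Merging: attach the teacher for each subject
merged = scores.merge(teachers, on="subject")

# Grouping: average marks per subject
avg_by_subject = scores.groupby("subject")["marks"].mean()

# Transforming: add a column derived from an existing one
scores["percent"] = scores["marks"] / 100

print(high)
print(ranked)
print(merged)
print(avg_by_subject)
```

Each operation returns a new object, so the original scores DataFrame only changes in the last step, where a column is assigned directly.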

Features of DataFrame

Columns can be of different types.
Size is mutable.
Labeled axes (rows and columns).
Arithmetic operations can be performed on rows and columns.
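The features above can be seen in a short sketch. The item names, quantities, and prices below are made up for illustration.

```python
import pandas as pd

# Hypothetical data: columns hold different types (str, int, float)
df = pd.DataFrame(
    {"item": ["pen", "book"], "qty": [10, 3], "price": [1.5, 12.0]},
    index=["a", "b"],  # row labels
)
print(df.dtypes)

# Size is mutable: add a column, then add a row
df["total"] = df["qty"] * df["price"]   # arithmetic across two columns
df.loc["c"] = ["bag", 2, 20.0, 40.0]    # new row labeled "c"

# Labeled axes: select a value by row label and column label
print(df.loc["b", "price"])

# Arithmetic on a whole column
print(df["total"].sum())
```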

Python Pandas DataFrame Structure

You can think of a DataFrame as similar to an SQL table or a spreadsheet. Suppose we are creating a DataFrame that holds student data.
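A minimal sketch of such a student table might look like the following. The column names, roll numbers, and values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical student records, invented for illustration
students = pd.DataFrame(
    {
        "Name": ["Asha", "Ravi", "Meera"],
        "Age": [20, 21, 19],
        "Grade": ["A", "B", "A"],
    },
    index=[101, 102, 103],  # roll numbers used as row labels
)

print(students)
print(students["Name"])    # one column, like a spreadsheet column
print(students.loc[102])   # one row, looked up by its index label
```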

Creating a pandas DataFrame

A pandas DataFrame can be created using the following constructor −

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)

The parameters of the constructor are as follows −

Sr.No	Parameter	Description
1	data	The input data. It can take various forms, such as an ndarray, Series, list, dict (or other mapping), constant, or another DataFrame.
2	index	The row labels of the resulting frame. Optional; defaults to 0, 1, ..., n-1 (a RangeIndex) if no index is passed and the data carries none.
3	columns	The column labels of the resulting frame. Optional; defaults to 0, 1, ..., n-1 if no column labels are passed and the data carries none.
4	dtype	The data type to force on the columns. Only a single dtype is allowed; if None, the type is inferred from the data.
5	copy	Whether to copy the data from the input. In recent pandas versions the default is None, which copies dict data but does not copy DataFrame or 2-D ndarray input.
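To make the parameters concrete, here is a small sketch that passes each of them. The array contents and the labels r1, r2, r3, x, and y are arbitrary choices for this example.

```python
import numpy as np
import pandas as pd

raw = np.array([[1, 2], [3, 4], [5, 6]])

# No index or columns passed: both default to 0, 1, 2, ...
default_df = pd.DataFrame(raw)

# Explicit row labels, column labels, and a forced dtype
labeled_df = pd.DataFrame(
    data=raw,
    index=["r1", "r2", "r3"],
    columns=["x", "y"],
    dtype="float64",
)

# copy=True asks pandas to copy the input, so later edits to
# `raw` do not show up in the DataFrame
copied_df = pd.DataFrame(raw, copy=True)
raw[0, 0] = 99

print(default_df.columns.tolist(), default_df.index.tolist())
print(labeled_df)
print(copied_df.iloc[0, 0])
```

Whether a DataFrame built without copy=True reflects later changes to the source array depends on the pandas version and settings, so passing copy=True is the safer choice when the input will be modified afterwards.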

Creating a DataFrame from Different Inputs

A pandas DataFrame can be created using various inputs like −

Lists
Dictionary
Series
NumPy ndarrays
Another DataFrame
External input files like CSV, JSON, HTML, Excel sheet, and more.
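The sketch below builds a DataFrame from several of the inputs listed above. All values are invented, and io.StringIO stands in for a CSV file on disk so that the example runs without any external file.

```python
import io

import numpy as np
import pandas as pd

# From a dictionary of lists: keys become column labels
from_dict = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# From a Series: the Series name becomes the column label
from_series = pd.DataFrame(pd.Series([10, 20], name="value"))

# From a NumPy ndarray, with column labels supplied
from_array = pd.DataFrame(np.zeros((2, 2)), columns=["x", "y"])

# From another DataFrame
from_frame = pd.DataFrame(from_dict)

# From CSV text; io.StringIO is a stand-in for a real file path
csv_text = "a,b\n1,3\n2,4\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_dict)
print(from_series)
print(from_array)
print(from_csv)
```

Other file formats have their own readers, such as pd.read_json, pd.read_html, and pd.read_excel, some of which need optional dependencies installed.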

Create an Empty DataFrame

# Import the pandas library, aliased as pd
import pandas as pd
df = pd.DataFrame()
print(df)

Output:

Empty DataFrame
Columns: []
Index: []

Create a DataFrame from Lists

import pandas as pd

# A flat list becomes a single column labeled 0
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data)
print(df)

Output:

   0
0  1
1  2
2  3
3  4
4  5
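A list of lists can be used in the same way, with each inner list becoming one row. The following is a minimal sketch; the names, ages, and column labels are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical rows: each inner list is [name, age]
rows = [["Asha", 20], ["Ravi", 21], ["Meera", 19]]

# columns supplies a label for each position in the inner lists
df = pd.DataFrame(rows, columns=["Name", "Age"])
print(df)
```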
