Data Science – Python DataFrame

A DataFrame is a two-dimensional, labeled table of rows and columns. It is the core data structure of the Pandas library and is widely used in Data Science for data manipulation and analysis.

Creating a DataFrame with Pandas

Let’s create a DataFrame with 3 columns and 5 rows of fictional data using the Pandas library.

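import pandas as pd

# Keys become column names; each list holds that column's values (all the same length)
d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}

# Build the DataFrame from the dictionary
df = pd.DataFrame(data=d)

print(df)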

Example Explained

Import the Pandas library: Use import pandas as pd.
Define data: Use a dictionary d with column names as keys and lists of numbers as values.
Create a DataFrame: Use pd.DataFrame() to convert the dictionary into a structured DataFrame.
Print the DataFrame: Use the print() function to display the DataFrame.

Remember to write pd. in front of DataFrame() so that Python knows the name comes from the Pandas library. The name is case-sensitive: note the capital “D” and “F” in DataFrame.
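
As a small illustration of why the spelling matters, the sketch below checks which names the pandas module exposes. The hasattr() checks and the variable names are only for demonstration and are not part of the original example.

```python
import pandas as pd

# The class is available under the exact name "DataFrame"
has_correct_name = hasattr(pd, "DataFrame")

# A lowercase spelling is not a pandas attribute, so pd.dataframe() would raise an AttributeError
has_lowercase_name = hasattr(pd, "dataframe")

print(has_correct_name)
print(has_lowercase_name)
```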

Interpreting the Output

When printed, the DataFrame shows the column names (col1, col2, col3) across the top and one line per row. The numbers down the left side (0–4) are the index: Pandas assigns them automatically, starting from zero, to label the position of each row.
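
The column names and row labels described above can also be read directly from the DataFrame. The following is a minimal sketch that rebuilds the same fictional data; the variable names column_labels, row_labels, and first_row are chosen here for illustration.

```python
import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)

# Column names shown across the top of the printed output
column_labels = list(df.columns)

# Row labels shown down the left side of the printed output
row_labels = list(df.index)

print(column_labels)
print(row_labels)

# Select the first row by its position and read one of its values
first_row = df.iloc[0]
print(first_row['col1'])
```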

Counting Columns and Rows

Use the shape attribute of the DataFrame to find the number of columns and rows:

# Count the number of columns
count_column = df.shape[1]
print(count_column)

# Count the number of rows
count_row = df.shape[0]
print(count_row)

The df.shape attribute returns a tuple of the form (rows, columns). df.shape[0] gives the row count, and df.shape[1] gives the column count. For the DataFrame above, the result is 5 rows and 3 columns.
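
Because shape is a tuple, both counts can be unpacked in one step. The sketch below rebuilds the same fictional data and, as a cross-check, compares the result with len(); the variable names rows and columns are illustrative.

```python
import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)

# Unpack the (rows, columns) tuple in a single assignment
rows, columns = df.shape
print(rows, columns)

# len() of the DataFrame counts rows; len() of its columns counts columns
print(len(df) == rows)
print(len(df.columns) == columns)
```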

Understanding DataFrames is crucial for data analysis and manipulation in Python.
