A DataFrame is a structured representation of data, widely used in Data Science for data manipulation and analysis.
Let’s create a DataFrame with 3 columns and 5 rows of fictional data using the Pandas library.
import pandas as pd
d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)
print(df)
Import the Pandas library with import pandas as pd.
Create a dictionary d with column names as keys and lists of numbers as values.
Use pd.DataFrame() to convert the dictionary into a structured DataFrame.
Use the print() function to display the DataFrame.
Remember to use pd. in front of DataFrame() to specify that DataFrame belongs to the Pandas library. The name is case-sensitive: note the capital “D” and “F” in DataFrame.
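As a quick check of the naming rule, the short sketch below looks up both spellings on the pandas module. It assumes only that pandas is installed; the variable names are illustrative and not part of the original example.

```python
import pandas as pd

# The class name is spelled with a capital "D" and "F".
has_correct_name = hasattr(pd, "DataFrame")

# The all-lowercase spelling is not defined by pandas, so writing
# pd.dataframe(...) would fail with an AttributeError.
has_lowercase_name = hasattr(pd, "dataframe")

print(has_correct_name)    # True
print(has_lowercase_name)  # False
```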
When printed, the DataFrame will display columns (e.g., col1, col2, col3) and rows indexed starting from zero. The numbers down the left side (0–4) are the index labels, which by default match the position of each row.
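To see these labels without reading them off the printed table, one option is to inspect the columns and index attributes directly. This is a minimal sketch that rebuilds the same DataFrame; the variable names column_names, row_labels, and first_row are chosen here for illustration.

```python
import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)

# Column labels come from the dictionary keys.
column_names = list(df.columns)

# With no index supplied, pandas numbers the rows from zero.
row_labels = list(df.index)

# Select the row at position 0 and convert it to a plain list.
first_row = df.iloc[0].tolist()

print(column_names)  # ['col1', 'col2', 'col3']
print(row_labels)    # [0, 1, 2, 3, 4]
print(first_row)     # [1, 4, 7]
```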
Use the shape attribute of the DataFrame to find the number of columns and rows:
# Count the number of columns
count_column = df.shape[1]
print(count_column)
# Count the number of rows
count_row = df.shape[0]
print(count_row)
The df.shape attribute returns a tuple with the number of rows and columns. df.shape[0] gives the row count, and df.shape[1] gives the column count.
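Because df.shape is an ordinary tuple, both counts can also be read in one step by unpacking it. The sketch below rebuilds the same DataFrame and uses the illustrative names n_rows and n_cols; len(df) is shown as an alternative way to get the row count.

```python
import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)

# Unpack the (rows, columns) tuple into two variables.
n_rows, n_cols = df.shape
print(df.shape)  # (5, 3)
print(n_rows)    # 5
print(n_cols)    # 3

# len() on a DataFrame returns the number of rows.
print(len(df))   # 5
```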
Understanding DataFrames is crucial for data analysis and manipulation in Python.
