A structured array in NumPy is an array where each element is a compound data type. This compound data type can consist of multiple fields, each with its own data type, similar to a table or a record.
For example, you can have an array where each element holds both a name (as a string) and an age (as an integer). This helps you to work with complex data more flexibly, as you can access and manipulate each field separately.
The first step in creating a structured array is defining the data type (dtype) that specifies the structure of each element. The dtype is defined as a list of tuples or a dictionary, where each tuple or dictionary entry defines a field name and its data type.
Following are some of the type codes commonly used for fields in structured arrays −
'U10': Unicode string of up to 10 characters
'i4': 4-byte integer
'f8': 8-byte floating point number
'?': Boolean value (note that 'b' denotes a signed 1-byte integer, not a Boolean)
You can define the dtype and create the structured array using a list of tuples, where each tuple represents a field. Each tuple contains two elements: the first element is the name of the field, and the second element is the data type of that field.
In the following example, we are defining a structured array with fields for “name”, “age”, and “height” using a specified dtype. We then create this array with corresponding data −
import numpy as np
# Define the dtype
dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
# Define the data
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
# Create the structured array
structured_array = np.array(data, dtype=dtype)
print("Structured Array:\n", structured_array)
Output:
Following is the output obtained −
Structured Array:
[('Alice', 30, 5.6) ('Bob', 25, 5.8) ('Charlie', 35, 5.9)]
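Once the array exists, its dtype can be inspected to confirm the layout. The following is a minimal sketch using the same data as above; the variable names (people, field_names, record_size, age_type) are chosen here for illustration only:

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
people = np.array(data, dtype=dtype)

# Field names, in the order they were declared
field_names = people.dtype.names

# Bytes per record: 'U10' takes 40 bytes (4 per character), plus 4 + 4
record_size = people.dtype.itemsize

# dtype of a single field
age_type = people.dtype['age']

print(field_names)
print(record_size)
print(age_type)
```

Checking the dtype this way is a quick sanity test that each field received the type you intended.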
Alternatively, you can define the dtype using a dictionary to name the fields and their types explicitly. The dictionary uses the key 'names' for the list of field names and the key 'formats' for the corresponding list of data types.
In this example, we are defining the dtype for a structured array using the dictionary format to specify fields for “name”, “age”, and “height”. We then create and display this structured array with the corresponding data, organizing it into a format that supports multiple data types within each record −
import numpy as np
# Define the dtype using a dictionary
dtype = np.dtype({'names': ['name', 'age', 'height'], 'formats': ['U10', 'i4', 'f4']})
# Define the data
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
# Create the structured array
structured_array = np.array(data, dtype=dtype)
print("Structured Array from Dictionary:\n", structured_array)
Output:
This will produce the following result −
Structured Array from Dictionary:
[('Alice', 30, 5.6) ('Bob', 25, 5.8) ('Charlie', 35, 5.9)]
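The list-of-tuples form and the dictionary form describe the same record layout. As a small sketch, the two dtypes below can be compared directly; the names list_dtype, dict_dtype, and same_layout are illustrative:

```python
import numpy as np

# dtype written as a list of (field name, type) tuples
list_dtype = np.dtype([('name', 'U10'), ('age', 'i4'), ('height', 'f4')])

# The same dtype written as a dictionary with 'names' and 'formats' keys
dict_dtype = np.dtype({
    'names': ['name', 'age', 'height'],
    'formats': ['U10', 'i4', 'f4'],
})

# Both describe identical fields in the same order
same_layout = (list_dtype == dict_dtype)
print(same_layout)
```

Which form to use is mostly a matter of readability; the resulting arrays behave the same.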
You can access individual fields in a structured array using field names. This is done by indexing the array with the field name as a string.
In the example below, we are defining a structured array with fields for ‘name’, ‘age’, and ‘height’, and then accessing each of these fields separately −
import numpy as np
# Define a dtype and data for a structured array
dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
structured_array = np.array(data, dtype=dtype)
# Access the 'name' field
names = structured_array['name']
print("Names:", names)
# Access the 'age' field
ages = structured_array['age']
print("Ages:", ages)
# Access the 'height' field
heights = structured_array['height']
print("Heights:", heights)
Output:
Following is the output of the above code −
Names: ['Alice' 'Bob' 'Charlie']
Ages: [30 25 35]
Heights: [5.6 5.8 5.9]
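One detail worth knowing is that indexing with a field name returns a view rather than a copy, so changes made through it appear in the original array. The sketch below illustrates this and also selects several fields at once by passing a list of names; the variable names are illustrative:

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
people = np.array(data, dtype=dtype)

# A single field is a view onto the original array's memory
ages = people['age']
ages[0] = 31                      # also changes people[0]['age']
shares = np.shares_memory(ages, people)

# A list of field names selects several fields at once
name_age = people[['name', 'age']]

print(people['age'])
print(shares)
print(name_age.dtype.names)
```

If you need an independent array, call .copy() on the selected field.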
You can access specific rows of the structured array using indexing. This allows you to retrieve complete records. Here, we retrieve the first and second rows of the structured array −
import numpy as np
# Define a dtype and data for a structured array
dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
structured_array = np.array(data, dtype=dtype)
# Access the first row
first_row = structured_array[0]
print("First Row:", first_row)
# Access the second row
second_row = structured_array[1]
print("Second Row:", second_row)
Output:
Following is the output of the above code −
First Row: ('Alice', 30, 5.6)
Second Row: ('Bob', 25, 5.8)
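Row access also accepts slices, negative indices, and lists of positions, and a single record can itself be indexed by field name. A brief sketch, with illustrative variable names:

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
people = np.array(data, dtype=dtype)

# Slice: the first two records
first_two = people[:2]

# Negative index: the last record, then one field of that record
last = people[-1]
last_name = last['name']

# List of positions: the first and third records
picked = people[[0, 2]]

print(first_two)
print(last_name)
print(picked['name'])
```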
You can modify the values of individual fields in the structured array by indexing and assigning new values to them.
NumPy does not support adding fields directly to an existing structured array. To add a field, you create a new array whose dtype includes the additional field and then fill it with the existing data along with the new values.
In the example below, we are updating the ‘age’ field of the first record in a structured array by directly assigning a new value −
import numpy as np
# Define a dtype and data for a structured array
dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
structured_array = np.array(data, dtype=dtype)
# Update the 'age' of the first record
structured_array[0]['age'] = 31
print("Updated Structured Array:\n", structured_array)
Output:
The output obtained is as shown below −
Updated Structured Array:
[('Alice', 31, 5.6) ('Bob', 25, 5.8) ('Charlie', 35, 5.9)]
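Updates are not limited to one record at a time. As a sketch, a whole field can be changed in one operation, and a Boolean mask can restrict the update to matching records; the new height value of 6.0 is made up for illustration:

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
people = np.array(data, dtype=dtype)

# Increase every age by one in a single operation
people['age'] += 1

# Update the height only for records whose name is 'Bob'
people['height'][people['name'] == 'Bob'] = 6.0

print(people)
```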
Here, we are extending the record structure by defining a new dtype that adds a ‘weight’ field and creating a new array from data that includes this field −
import numpy as np
# Define a dtype and data for the original structured array
dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
structured_array = np.array(data, dtype=dtype)
# Define a new dtype with an additional field 'weight'
new_dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4'), ('weight', 'f4')]
# Define new data including the additional field
new_data = [('Alice', 30, 5.6, 55.0), ('Bob', 25, 5.8, 70.0), ('Charlie', 35, 5.9, 80.0)]
# Create a new structured array with the additional field
new_structured_array = np.array(new_data, dtype=new_dtype)
print("New Structured Array with Additional Field:\n", new_structured_array)
Output:
After executing the above code, we get the following output −
New Structured Array with Additional Field:
[('Alice', 30, 5.6, 55.) ('Bob', 25, 5.8, 70.) ('Charlie', 35, 5.9, 80.)]
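When the original array already holds the data, retyping every record by hand is unnecessary. One possible approach, sketched below, allocates an array with the extended dtype and copies each existing field across; the weight values are the same illustrative numbers used above:

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4'), ('height', 'f4')]
data = [('Alice', 30, 5.6), ('Bob', 25, 5.8), ('Charlie', 35, 5.9)]
people = np.array(data, dtype=dtype)

# Extended dtype: the original fields plus 'weight'
new_dtype = np.dtype(dtype + [('weight', 'f4')])

# Allocate the new array, then copy the existing fields one by one
extended = np.zeros(people.shape, dtype=new_dtype)
for field in people.dtype.names:
    extended[field] = people[field]

# Fill in the new field
extended['weight'] = [55.0, 70.0, 80.0]

print(extended)
```

NumPy also ships helper functions in numpy.lib.recfunctions, such as append_fields, that can be used for the same purpose.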
Sorting structured arrays in NumPy means arranging the elements of an array based on the values of one or more fields.
Since structured arrays have multiple fields, sorting can be based on the values in these fields. For example, you might sort an array of people by their age or height.
In the following example, we are sorting a structured array based on the ‘age’ field by first obtaining the indices that would arrange the ages in ascending order. We then use these indices to reorder the entire array −
import numpy as np
# Define a structured array
dtype = [('name', 'U10'), ('age', 'i4')]
data = [('Alice', 30), ('Bob', 25), ('Charlie', 35)]
structured_array = np.array(data, dtype=dtype)
# Sort the array by 'age'
sorted_indices = np.argsort(structured_array['age'])
sorted_array = structured_array[sorted_indices]
print("Sorted by Age:\n", sorted_array)
Output:
The result produced is as follows −
Sorted by Age:
[('Bob', 25) ('Alice', 30) ('Charlie', 35)]
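As an alternative to np.argsort(), np.sort() accepts an order argument that names the field or fields to sort by, which is convenient for breaking ties. A minimal sketch, using slightly different sample data so that two people share the same age:

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4')]
data = [('Charlie', 30), ('Bob', 25), ('Alice', 30)]
people = np.array(data, dtype=dtype)

# Sort by 'age' first, then by 'name' where ages are equal
by_age_then_name = np.sort(people, order=['age', 'name'])

# Reverse the sorted result to get descending order
oldest_first = by_age_then_name[::-1]

print(by_age_then_name)
print(oldest_first)
```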
Filtering structured arrays involves applying conditions to one or more fields and retrieving elements that satisfy these conditions.
This is useful when you want to retrieve records that meet certain criteria, such as extracting all entries where a specific field exceeds a threshold or matches a certain value.
In this example, we are filtering a structured array to include only the records where the ‘age’ field is greater than 30 −
import numpy as np
# Define a structured array
dtype = [('name', 'U10'), ('age', 'i4')]
data = [('Alice', 30), ('Bob', 25), ('Charlie', 35)]
structured_array = np.array(data, dtype=dtype)
# Filter array for ages greater than 30
filtered_array = structured_array[structured_array['age'] > 30]
print("Filtered Array (Age > 30):\n", filtered_array)
Output:
We get the output as shown below −
Filtered Array (Age > 30):
[('Charlie', 35)]
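Conditions on several fields, or several conditions on one field, can be combined with the element-wise operators & (and) and | (or); each condition needs its own parentheses. The sketch below also uses np.isin() to match a field against a list of values, with an extra illustrative record for 'Dave':

```python
import numpy as np

dtype = [('name', 'U10'), ('age', 'i4')]
data = [('Alice', 30), ('Bob', 25), ('Charlie', 35), ('Dave', 40)]
people = np.array(data, dtype=dtype)

# Records with 30 <= age < 40
in_range = (people['age'] >= 30) & (people['age'] < 40)
thirties = people[in_range]

# Records whose name appears in a given list
selected = people[np.isin(people['name'], ['Bob', 'Dave'])]

print(thirties)
print(selected)
```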
Combining structured arrays involves merging or concatenating arrays that have a defined dtype with named fields. In NumPy, this can be done using the np.concatenate() function.
In the example below, we are combining two structured arrays with the same dtype into a single array using np.concatenate() function −
import numpy as np
# Define two structured arrays
dtype = [('name', 'U10'), ('age', 'i4')]
data1 = [('Alice', 30), ('Bob', 25)]
data2 = [('Charlie', 35), ('Dave', 40)]
structured_array1 = np.array(data1, dtype=dtype)
structured_array2 = np.array(data2, dtype=dtype)
# Combine the arrays
combined_array = np.concatenate((structured_array1, structured_array2))
print("Combined Structured Array:\n", combined_array)
Output:
This results in a new structured array that includes all the records from both original arrays as shown below −
Combined Structured Array:
[('Alice', 30) ('Bob', 25) ('Charlie', 35) ('Dave', 40)]
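If the arrays to be combined have the same field names but slightly different field types, one option is to convert them to a common dtype with astype() before concatenating, so the result has a known layout. This is only a sketch; the second array and its 'U5' and 'i8' field types are made up for illustration:

```python
import numpy as np

# The dtype we want the combined array to have
target = np.dtype([('name', 'U10'), ('age', 'i4')])

first = np.array([('Alice', 30), ('Bob', 25)], dtype=target)

# Same field names, but a shorter string and a wider integer
second = np.array([('Eve', 28)], dtype=[('name', 'U5'), ('age', 'i8')])

# Convert to the target dtype when the layouts differ
if second.dtype != target:
    second = second.astype(target)

combined = np.concatenate((first, second))
print(combined)
```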
