Python’s built-in types are as follows:
Python can be used for scripting, but it is primarily a general-purpose programming language. To learn more about scripting, see the Python scripting tutorial.
Python is a high-level, interpreted programming language renowned for its readability and flexibility. Its popularity comes from its ease of use, rich library resources, and strong community support.
To write a single-line comment, use the ‘#’ sign. Python has no dedicated multiline comment syntax; a triple-quoted string (''' or """) that is not assigned to anything is commonly used for that purpose.
PEP 8 (Python Enhancement Proposal 8) is the style guide for Python code, providing conventions on how to format code for better readability.
Variables are declared without an explicit type. Example: x = 10; name = "John".
__init__ is a Python class method that is used to initialize object attributes when a new instance is created.
List comprehension is a concise method for creating lists. Example:
[x**2 for x in range(5)]
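As a small sketch of what the comprehension above does, the snippet below compares it with an explicit loop and adds an optional filter clause. The variable names are illustrative only.

```python
# Squares of 0..4, as in the example above.
squares = [x**2 for x in range(5)]

# Equivalent explicit loop, for comparison.
squares_loop = []
for x in range(5):
    squares_loop.append(x**2)

# A comprehension can also filter items with a trailing `if` clause.
even_squares = [x**2 for x in range(5) if x % 2 == 0]

print(squares)       # [0, 1, 4, 9, 16]
print(even_squares)  # [0, 4, 16]
```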
Use try, except blocks. Example:
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero!")
The condition __name__ == "__main__" indicates whether the Python script is being executed as the main program or imported as a module; it is true only when the file is run directly.
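A minimal sketch of the usual pattern follows. The function name main() is a common convention rather than a requirement, and its return value is made up for illustration.

```python
def main():
    # Hypothetical entry point; returns a message so it is easy to check.
    return "running as a script"


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    print(main())
```

When another module imports this file, __name__ is set to the module's name instead, so the guarded block is skipped.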
To open a file, use open(), followed by read(), readline(), or readlines().
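The sketch below shows the three reading methods side by side. It writes a throwaway file first so that it is self-contained; the file name and contents are arbitrary, and the with statement is used so the file is closed automatically.

```python
import os
import tempfile

# Create a small temporary file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

# read(): the entire file as one string.
with open(path, encoding="utf-8") as f:
    whole_text = f.read()

# readline(): one line at a time; readlines(): the remaining lines as a list.
with open(path, encoding="utf-8") as f:
    first = f.readline()
    rest = f.readlines()

print(repr(first))  # 'first line\n'
print(rest)         # ['second line\n']
```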
| List | Tuple |
|---|---|
| Lists are mutable, i.e., they can be edited. | Tuples are immutable, meaning they cannot be edited after creation. |
| Lists are generally slightly slower to create and iterate over than tuples. | Tuples are generally slightly faster than lists. |
| Syntax: list_1 = [10, 'Chelsea', 20] | Syntax: tup_1 = (10, 'Chelsea', 20) |
It is used for conditional branching. If the ‘if’ condition is false, it looks into the ‘elif’ conditions; if none of them are true, it performs the ‘else’ block.
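A short illustration of this branching order, using a hypothetical helper with arbitrary thresholds:

```python
def describe_temperature(celsius):
    # Conditions are checked top to bottom; the first true branch wins.
    if celsius < 0:
        return "freezing"
    elif celsius < 20:
        return "cool"
    else:
        # Reached only when none of the conditions above are true.
        return "warm"


print(describe_temperature(-5))  # freezing
print(describe_temperature(10))  # cool
print(describe_temperature(30))  # warm
```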
Python’s ‘for’ loop iterates over a sequence (such as a list, tuple, or string) and runs a block of code for each element. The loop runs until all elements have been processed. It uses a basic syntax:
for element in sequence:
    # code to execute for each element
Python’s ‘range()’ function creates an immutable sequence of integers. It is typically used with ‘for’ loops to repeat a block a fixed number of times or to iterate over indices. The basic syntax is:
range(start, stop, step)
start: Optional; the first value of the sequence (default: 0).
stop: Required; numbers are produced up to but not including this value.
step: Optional; the difference between consecutive numbers in the sequence (default: 1).
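The calls below sketch how the three arguments interact; list() is used only to make the generated values visible.

```python
print(list(range(5)))         # [0, 1, 2, 3, 4]  (stop only)
print(list(range(2, 6)))      # [2, 3, 4, 5]     (start and stop)
print(list(range(0, 10, 3)))  # [0, 3, 6, 9]     (start, stop, step)
print(list(range(5, 0, -1)))  # [5, 4, 3, 2, 1]  (negative step counts down)

# Typical use inside a for loop: sum the integers 1 through 3.
total = 0
for i in range(1, 4):
    total += i
print(total)  # 6
```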
‘print’: Displays information on the console for immediate reading.
‘return’: Returns a value to the caller, allowing the function to provide an output that may be saved or utilized in further computations.
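The difference can be seen in a small sketch with two hypothetical functions, one that prints its result and one that returns it:

```python
def show_sum(a, b):
    print(a + b)  # displays the value; the function itself returns None


def get_sum(a, b):
    return a + b  # hands the value back to the caller


shown = show_sum(2, 3)    # prints 5, but nothing useful is stored
returned = get_sum(2, 3)  # stores 5
doubled = returned * 2    # the returned value can be reused
```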
NumPy is a Python library for numerical operations that supports arrays and matrices, which are required for efficient numerical computations in data analytics and scientific computing.
Pandas is a Python library for data manipulation and analysis. It offers data structures like DataFrame for handling and cleaning structured data, making it a powerful tool in data analytics.
Matplotlib is a Python toolkit for producing static, interactive, and animated visualizations. It is frequently used to plot and show data in a variety of ways, which improves data analysis and communication.
Seaborn is a Python package for visualizing statistical data. It is built on Matplotlib and offers a high-level interface for producing visually appealing and informative statistical charts, making it especially helpful for data analysis and exploration.
‘loc’: Pandas label-based indexing, which allows you to choose data based on labels or criteria.
‘iloc’: Integer-location based indexing, which is used to pick data based on number indices, similar to regular Python indexing.
DataFrame: A two-dimensional, tabular data structure in Pandas. It consists of rows and columns, where each column can have a different data type. It is similar to a spreadsheet or SQL table and is a powerful tool for data manipulation and analysis in Python.
The ‘groupby’ function in Pandas is used to group rows of a DataFrame based on some criterion (for example, values in a certain column) and then apply a function to each group separately. It is useful for gathering, processing, and evaluating data from specified categories.
Handling missing values in Pandas DataFrame:
The ‘apply()’ method in Pandas is used to apply a function along an axis of a DataFrame or to a single column. It is used for transforming data by applying a custom or built-in function to each element, row, or column of the DataFrame.
The Global Interpreter Lock (GIL) in CPython is a mutex that ensures that only one thread executes Python bytecode at a time. It prevents CPU-bound threads from using multiple cores in parallel, but it simplifies memory management by keeping reference counting thread-safe. I/O-bound threads are less affected, because the GIL is released while a thread waits on I/O.
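As a rough sketch of the I/O-bound case, the snippet below runs a stand-in task in a thread pool. The function fake_download is hypothetical, and time.sleep is used in place of real network or disk I/O because it releases the GIL while waiting.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    # Stand-in for an I/O-bound task; sleeping releases the GIL,
    # so the other threads can run in the meantime.
    time.sleep(0.05)
    return item * 2


with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the same order as the inputs.
    results = list(pool.map(fake_download, [1, 2, 3, 4]))

print(results)  # [2, 4, 6, 8]
```

For CPU-bound work, threads would not give the same benefit under the GIL; separate processes (for example via the multiprocessing module) are the usual alternative.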
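Correlation is a statistical measure of the degree to which two variables change together. The correlation coefficient ranges from -1 to 1, where -1 indicates a perfect negative linear relationship, 0 indicates no linear relationship, and 1 indicates a perfect positive linear relationship.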
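As an illustration, the Pearson correlation coefficient can be computed directly from its definition. This is a minimal sketch with made-up data; the function name pearson is an assumption, and in practice a library routine would normally be used.

```python
from math import sqrt


def pearson(xs, ys):
    # Covariance of x and y divided by the product of their spreads.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


hours = [1, 2, 3, 4, 5]
r_pos = pearson(hours, [2, 4, 6, 8, 10])  # y rises with x: close to 1
r_neg = pearson(hours, [10, 8, 6, 4, 2])  # y falls as x rises: close to -1
```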
Probability Distribution: Describes how likely the various outcomes of a statistical experiment are. It assigns a probability to each possible outcome. Common distributions include uniform, normal (Gaussian), and binomial.
The Normal Distribution, often known as the Gaussian distribution, is a symmetric, bell-shaped probability distribution. It is defined by the mean (center) and standard deviation (spread). The Central Limit Theorem states that the sum or average of many independent random variables tends toward a normal distribution, which helps explain why many natural measurements, such as heights or test scores, are approximately normal.
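A brief sketch using the standard library's statistics.NormalDist (available from Python 3.8) shows how the mean and standard deviation determine probabilities. The mean of 170 and standard deviation of 10 are made-up values for illustration.

```python
from statistics import NormalDist

# Hypothetical height distribution: mean 170 cm, standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# Probability of falling within one and two standard deviations of the mean.
within_one_sd = heights.cdf(180) - heights.cdf(160)
within_two_sd = heights.cdf(190) - heights.cdf(150)

print(round(within_one_sd, 3))  # 0.683
print(round(within_two_sd, 3))  # 0.954
```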
In Pandas, use the pd.to_datetime() function.
Example: df["column_name"] = pd.to_datetime(df["column_name"])
This converts the given column to datetime format, which allows Pandas to perform various time-related operations on it.
Moving Average: A statistical procedure that analyzes data points by generating a series of averages from various subsets of the entire dataset. It reduces short-term swings and reveals trends or patterns in time series data.
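A simple moving average can be sketched in plain Python as follows. The function name, the window size, and the sample prices are illustrative assumptions.

```python
def moving_average(values, window):
    # Average each run of `window` consecutive values.
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


prices = [10, 12, 11, 15, 14]
smoothed = moving_average(prices, window=3)
# Three windows: (10, 12, 11), (12, 11, 15), (11, 15, 14)
```

A larger window smooths more aggressively but reacts more slowly to recent changes.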
TensorFlow is an open-source machine learning framework developed by Google. It is used to build and train machine learning models, particularly deep neural networks, which makes it a versatile platform for a wide range of machine learning applications.
In Python, you may reverse a list by calling the reverse() function or slicing. Here are the two methods:
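A minimal sketch of both approaches, with illustrative variable names:

```python
# Slicing builds a new reversed list and leaves the original untouched.
original = [1, 2, 3, 4]
reversed_copy = original[::-1]

# reverse() changes the list in place and returns None.
in_place = [1, 2, 3, 4]
result = in_place.reverse()

print(original)       # [1, 2, 3, 4]
print(reversed_copy)  # [4, 3, 2, 1]
print(in_place)       # [4, 3, 2, 1]
```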
The append() method adds a single item to the end of a list.
my_list = [1, 2, 3]
my_list.append(4)  # Result: [1, 2, 3, 4]
The extend() method adds each element of an iterable (such as a list or tuple) to the end of a list.
my_list = [1, 2, 3]
my_list.extend([4, 5])  # Result: [1, 2, 3, 4, 5]
In short, append() adds its argument as a single element, whereas extend() adds the items of an iterable one by one, merging them into the original list.
The read() method returns the whole contents of a file as a single string (or as a bytes object in binary mode).
The readline() method reads one line from a file and returns it as a string. Each subsequent call returns the next line.
Use the raise keyword, then specify the exception type and optional error message.
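raise CustomException("This is a custom exception.")  # CustomException is assumed to be a user-defined subclass of Exception
import copy
new_list = copy.copy(original_list)  # shallow copy: nested objects are shared with the original
import copy
new_list = copy.deepcopy(original_list)  # deep copy: nested objects are copied recursively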
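The practical difference between copy.copy() and copy.deepcopy() shows up with nested objects. The sketch below uses a made-up nested list to demonstrate it.

```python
import copy

original_list = [[1, 2], [3, 4]]
shallow = copy.copy(original_list)   # new outer list, same inner lists
deep = copy.deepcopy(original_list)  # new outer list and new inner lists

# Mutate an inner list of the original.
original_list[0].append(99)

print(shallow[0])  # [1, 2, 99]  (shares the inner list with the original)
print(deep[0])     # [1, 2]      (fully independent)
```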
For example, predicting housing and stock prices.
Examples include classifying emails as spam or not spam, and estimating the species of a flower.
‘reduce()’ applies a binary function cumulatively to the items of an iterable, reducing them to a single value. For example:
from functools import reduce
result = reduce(lambda x, y: x * y, [1, 2, 3, 4])  # Result: 24
def factorial(n):
    if n == 0 or n == 1:
        return 1
    else:
        return n * factorial(n-1)
my_list = [(1, 5), (2, 3), (3, 8)]
sorted_list = sorted(my_list, key=lambda x: x[1])
my_list = [1, 2, 3, 4]
doubled_list = list(map(lambda x: x * 2, my_list))
