SQL for data analysis | The SQL Tutorial for Data Analysis
Table of Contents
ToggleIntroduction to data Analytics
It is the process deriving useful insight from data in order to solve real world problems
Data has been the buzzword for ages now. Either the data being generated from large-scale enterprises or the data generated from an individual, each and every aspect of data needs to be analyzed to benefit yourself from it. But how do we do it? Well, that’s where the term ‘Data Analytics’ comes in. In this blog on ‘What is Data Analytics?’, you will get an insight of this term with a hands-on.
We are generating more than 2.5 quintillion bytes of data each year. this pace of data generation is the reason behind the popularity of high-end technologies such as data science artificial intelligence machine learning and so on. The data is the key in data science.
Data science involve extracting, processing and analyzing tons of data at present what we need are tool that can be used to store and manage this vast amount of data.
What is the role of Data Analytics?
- Gather Hidden Insights – Hidden insights from data are gathered and then analyzed with respect to business requirements.
- Generate Reports – Reports are generated from the data and are passed on to the respective teams and individuals to deal with further actions for a high rise in business.
- Perform Market Analysis – Market Analysis can be performed to understand the strengths and weaknesses of competitors.
- Improve Business Requirement – Analysis of Data allows improving Business to customer requirements and experience.
Why we need SQL for data analytics ?
SQL can be used to store access and extract massive amounts of data in order to carry out the whole data science process smoothly .
What is SQL ?
SQL stands for structured query language is a querying language amid to manage relational database.
With SQL you can modify database,add,update, or delete rows of data retrieve subsets of information from database and any more.
Queries and other SQL operation are written as statement example select ,insert ,add,update ,delete,create,alter,insert.
Types of database
- Relational Database
- Non-Relational database
What is relational database ?
Relational database a group of well-defined Table from which data can be accessed or it can be edited and updated and no need to alter database table.
RDBMS is data management system where data can be store into different tables (Columns and rows) and relations are established among the different data variable by using primary keys and so on.
SQL is standard API for Relational database.
what is database
A database is a collection of data stored in a format that can easily be accessed.
A database is an organized collection of structured data that is stored and managed on a computer system. It is designed to store, retrieve, and manage large amounts of data efficiently and securely.
In a database, data is stored in tables, which consist of rows and columns. Each row in a table represents a record, and each column represents a field or attribute of the record. The relationship between tables is established using keys that connect related data.
Databases can be used to store a variety of data, such as customer information, product data, financial transactions, and more. They are widely used in various industries, including finance, healthcare, retail, and e-commerce, to manage and analyze large amounts of data.
Database management systems (DBMS) are software programs that are used to manage and manipulate databases. Some of the most popular DBMSs include Oracle, MySQL, Microsoft SQL Server, PostgreSQL, and MongoDB. These systems provide tools for creating, modifying, and querying databases, as well as for securing and backing up data to ensure data integrity and availability.
Real life example of database
SQL programming can be used to perform multiple action on data such as :
- Querying
- Inserting
- Updating
- Deleting
- Extracting and so on.
It is manipulating and analyse data in such a way that you derive something useful from data.
Example of Relational data that use SQL are MySQL and oracle.
Why to use MYSQL ?
- Very easy to use.
- Need basic knowledge of SQL
- With few simple SQL statement.
- It is very secure RDBMS.
- It is open source.
- It is scalable.
- The default size limit 4 GB and can be increase up to 8 TB
- MySQL follow client server architecture.
- MYSQL is compatible on many operating system.(windows,linux,Unix,Mac)
- There are number of API and libraries for development of MySQL application.
- You can easily work with python and SQL.
- It is platform independents.
- MySQL is very fast RDBMS.
Some Data types under MYSQL
- Numeric
- Character Sting
- Bit-String
- Boolean
- Date and time
- Timestamp and interval
Start My SQL
COMMAND IS MYSQL
COMMAND IS SQL
The first command is SQL is to create Database:
Syntax:
CREATE DATABASE : name;
USE :name;
CREATE DATABASE : to create database
USE: activate the database:
Note : after end of each command there is a terminator which is semicolon ;
Command are written in capital letter so user can differentiate between command and table columns.
Use Database in MySQL
You can use the SQL command use to select a database.
CREATE TABLE IN MYSQL
This Function create a new table
Syntax:
CREATE TABLE name (variable1 datatype 1,variable2 datatype2);
- CREATE TABLE: this command create a table containing the data variable
- Each variable must be defined w.r.t data type.
Example if your variable is age than your datatype will be integer;
if your variable is name than your datatype will be character;
CREATE TABLE personal (
ID INT,
name VARCHAR(50),
birth_date DATE,
Phone VARCHAR (12),
gender VARCHAR (1)
);
after creating table
- Press execute button
- refresh button
- see the picture below
INSERT Multiple Rows IN MYSQL
Syntax;
INSERT INTO table_name (column 1,column 2,…..,)
VALUES
(Value1,Value 2,…),
(Value1,Value 2,…),
(Value1,Value 2,…),;
INSERT INTO personal (ID, name, birth_date, Phone, gender)
VALUES (1, “Parul Sharma”, “1991-10-05”, “941177878”, “F”),
(2, “Aman Singh”, “1990-05-09”, “942377828”, “M”),
(3, “Rajeev Kapoor”, “1990-05-16”, “989077878”, “M”),
(4, “Vivek Kapoor”, “1994-11-02”, “991557878” ,”M”);
Command to select the table data
SELECT * FROM students.personal;
List of Constraints in MYSQL
The constraint in MySQL is used to specify the rule that allows or restricts what values/data will be stored in the table. They provide a suitable method to ensure data accuracy and integrity inside the table. It also helps to limit the type of data that will be inserted inside the table. If any interruption occurs between the constraint and data action, the action is failed.
Types of MySQL
Constraints
Constraints in MySQL is classified into two types:
Column Level Constraints: These constraints are applied only to the single column that limits the type of particular column data.
Table Level Constraints: These constraints are applied to the entire table that limits the type of data for the whole table.
How to create constraints in MySQL
We can define the constraints during a table created by using the CREATE TABLE statement. MySQL also uses the ALTER TABLE statement to specify the constraints in the case of the existing table schema.
Syntax
The following are the syntax to create a constraints in table:
CREATE TABLE new_table_name (
col_name1 datatype constraint,
col_name2 datatype constraint,
col_name3 datatype constraint,
………
);
Constraints used in MySQL
The following are the most common constraints used in the MySQL:
- NOT NULL
- CHECK
- DEFAULT
- PRIMARY KEY
- AUTO_INCREMENT
- UNIQUE
- INDEX
- ENUM
- FOREIGN KEY
NOT NULL Constraint
This constraint specifies that the column cannot have NULL or empty values. The below statement creates a table with NOT NULL constraint.
CREATE TABLE Student(Id INTEGER, LastName TEXT NOT NULL, FirstName TEXT NOT NULL, City VARCHAR(35));
UNIQUE Constraint
UNIQUE Constraint
This constraint ensures that all values inserted into the column will be unique. It means a column cannot stores duplicate values. MySQL allows us to use more than one column with UNIQUE constraint in a table. The below statement creates a table with a UNIQUE constraint:
mysql> CREATE TABLE ShirtBrands(Id INTEGER, BrandName VARCHAR(40) UNIQUE, Size VARCHAR(30));
CHECK Constraint
It controls the value in a particular column. It ensures that the inserted value in a column must be satisfied with the given condition. In other words, it determines whether the value associated with the column is valid or not with the given condition.
mysql> CREATE TABLE Persons (
ID int NOT NULL,
Name varchar(45) NOT NULL,
Age int CHECK (Age>=18)
);
DEFAULT Constraint
This constraint is used to set the default value for the particular column where we have not specified any value. It means the column must contain a value, including NULL.
mysql> CREATE TABLE Persons (
ID int NOT NULL,
Name varchar(45) NOT NULL,
Age int,
City varchar(25) DEFAULT ‘New York’
);
PRIMARY KEY Constraint
This constraint is used to identify each record in a table uniquely. If the column contains primary key constraints, then it cannot be null or empty. A table may have duplicate columns, but it can contain only one primary key. It always contains unique value into a column.
REATE TABLE Persons (
ID int NOT NULL PRIMARY KEY,
Name varchar(45) NOT NULL,
Age int,
City varchar(25));
UTO_INCREMENT Constraint
This constraint automatically generates a unique number whenever we insert a new record into the table. Generally, we use this constraint for the primary key field in a table.
mysql> CREATE TABLE Animals(
id int NOT NULL AUTO_INCREMENT,
name CHAR(30) NOT NULL,
PRIMARY KEY (id));
ENUM Constraint
ENUM Constraint
The ENUM data type in MySQL is a string object. It allows us to limit the value chosen from a list of permitted values in the column specification at the time of table creation. It is short for enumeration, which means that each column may have one of the specified possible values. It uses numeric indexes (1, 2, 3…) to represent string values.
mysql> CREATE TABLE Shirts (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(35),
size ENUM(‘small’, ‘medium’, ‘large’, ‘x-large’)
);
INDEX Constraint
This constraint allows us to create and retrieve values from the table very quickly and easily. An index can be created using one or more than one column. It assigns a ROWID for each row in that way they were inserted into the table.
mysql> CREATE TABLE Shirts (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(35),
size ENUM(‘small’, ‘medium’, ‘large’, ‘x-large’)
);
Foreign Key Constraint
This constraint is used to link two tables together. It is also known as the referencing key. A foreign key column matches the primary key field of another table. It means a foreign key field in one table refers to the primary key field of another table.
CREATE TABLE Persons (
Person_ID int NOT NULL PRIMARY KEY,
Name varchar(45) NOT NULL,
Age int,
City varchar(25)
);
Constraints Example
Syntax
CREATE TABLE tabe_name (ID INT NOT NULL UNIQUE,
name VARCHAR(40) NOT NULL,
age INT NOT NULL CHECK(age>=18),
gender VARCHAR(1) NOT NULL,
phone VARCHAR(10) UNIQUE,
city VARCHAR(12) DEFAULT “Dehradun”
Insert table data
MySQL AND, OR, NOT
The MySQL AND, OR and NOT Operators
The WHERE clause can be combined with AND, OR, and NOT operators.
The AND and OR operators are used to filter records based on more than one condition:
The AND operator displays a record if all the conditions separated by AND are TRUE.
The OR operator displays a record if any of the conditions separated by OR is TRUE.
The NOT operator displays a record if the condition(s) is NOT TRUE.
AND Syntax
SELECT column1, column2, …
FROM table_name
WHERE condition1 AND condition2 AND condition3 …;
AND OPERATOR IN MYSQL
SELECT * FROM personal WHERE age >= 18 AND age <= 20;
It shows all the rows with age between 18 to 20
OR OPERATOR IN MYSQL
SELECT * FROM personal WHERE age <= 19 OR gender = “F”;
If any one of critera is true then the answer will be shown.
MySQL IN & NOT IN OPERATOR
The WHERE in MySQL clause, when used together with the IN keyword only affects the rows whose values matches the list of values provided in the IN
SELECT * FROM personal WHERE age IN(18,21);
Its very similar to OR operator where it will show data of 18 or 21.
MYSQL NOT IN OPERATOR
The WHERE clause when used together with the NOT IN keyword DOES NOT affects the rows whose values matches the list of values provided in the NOT IN keyword
MySQL BETWEEN AND OPERATOR
The BETWEEN operator selects values within a given range. The values can be numbers, text, or dates.
SELECT * FROM personal WHERE age BETWEEN 18 AND 20;
MySQL NOT BETWEEN AND operator
MySQL NOT BETWEEN AND operator checks whether a value is not present between a starting and a closing expression.
SELECT * FROM personal WHERE age NOT BETWEEN 18 AND 20;
SQL LIKE Operator
The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.
There are two wildcards often used in conjunction with the LIKE operator:
The percent sign (%) represents zero, one, or multiple characters
The underscore sign (_) represents one, single character
SELECT * FROM personal WHERE name LIKE “s%”;
It start with S only.
MySQL ORDER BY COMMAND
- SELECT * FROM personal ORDER BY name;
This will ordered by name from table.
- SELECT * FROM personal WHERE city = “Dehradun” ORDER BY name DESC;
This will ordered by name in decending order where city is Dehrdun
- SELECT * FROM personal ORDER BY age;
This will ordered list in table according to age.
SELECT DISTINCT COMMAND IN MYSQL
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.
SELECT DISTINCT Syntax
SELECT DISTINCT column1, column2, …
FROM table_name;
MySQL IS NULL & IS NOT NULL
NULL VALUE
A field with a NULL value is a field with no value.
If a field in a table is optional, it is possible to insert a new record or update a record without adding a value to this field. Then, the field will be saved with a NULL value.
Syntax.
SELECT column_names
FROM table_name
WHERE column_name IS NULL;
IS NOT NULL IS WHERE IS NOT NULL
NOT NULL
IS NOT NULL Syntax
SELECT column_names
FROM table_name
WHERE column_name IS NOT NULL;
The SQL UPDATE Statement
The UPDATE statement is used to modify the existing records in a table.
UPDATE Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, …
WHERE condition;
The SQL DELETE Statement
The SQL DELETE Statement
The DELETE statement is used to delete existing records in a table.
DELETE Syntax
DELETE FROM table_name WHERE condition;
Example :
DELETE FROM personal WHERE id = 4;
LIMIT Clause in SQL
The LIMIT clause is a SQL command that is used to limit the number of results returned by a SELECT statement. It is especially useful when you are working with large tables that contain many rows, and you only need to retrieve a specific number of results.
The syntax of the LIMIT clause is as follows:
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(50),
email VARCHAR(50)
);
INSERT INTO customers (id, name, email)
VALUES
(1, ‘John Smith’, ‘john@example.com’),
(2, ‘Jane Doe’, ‘jane@example.com’),
(3, ‘Bob Johnson’, ‘bob@example.com’),
(4, ‘Alice Brown’, ‘alice@example.com’),
(5, ‘David Lee’, ‘david@example.com’),
(6, ‘Karen Davis’, ‘karen@example.com’),
(7, ‘Tom Williams’, ‘tom@example.com’),
(8, ‘Samantha White’, ‘samantha@example.com’),
(9, ‘Peter Green’, ‘peter@example.com’),
(10, ‘Linda Brown’, ‘linda@example.com’),
(11, ‘Mark Wilson’, ‘mark@example.com’),
(12, ‘Emily Davis’, ’emily@example.com’),
(13, ‘Daniel Lee’, ‘daniel@example.com’),
(14, ‘Karen Thomas’, ‘karen.thomas@example.com’),
(15, ‘Michael Jones’, ‘michael@example.com’);
SELECT *
FROM customers
LIMIT 5;
Joins in SQL
Joins in SQL are used to combine data from two or more tables based on a related column or set of columns. There are four types of joins in SQL
Inner Join in SQl
An inner join returns only the rows that have matching values in both tables. It returns a new table that contains only the rows that have matching values in the specified columns of both tables.
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(50),
email VARCHAR(50),
city VARCHAR(50)
);
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT,
product_name VARCHAR(50),
quantity INT,
price DECIMAL(10, 2),
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
INSERT INTO customers (id, name, email, city)
VALUES
(1, ‘John Smith’, ‘john@example.com’, ‘New York’),
(2, ‘Jane Doe’, ‘jane@example.com’, ‘Los Angeles’),
(3, ‘Bob Johnson’, ‘bob@example.com’, ‘Chicago’),
(4, ‘Alice Brown’, ‘alice@example.com’, ‘Houston’);
INSERT INTO orders (id, customer_id, product_name, quantity, price)
VALUES
(1, 1, ‘iPhone’, 2, 999.99),
(2, 3, ‘Samsung Galaxy’, 1, 799.99),
(3, 2, ‘iPad’, 3, 699.99),
(4, 1, ‘Macbook Pro’, 1, 1799.99),
(5, 4, ‘Microsoft Surface’, 2, 899.99);
SELECT customers.name, customers.city, orders.product_name, orders.quantity, orders.price
FROM customers
INNER JOIN orders
ON customers.id = orders.customer_id;
In this example, we first create a table called “customers” with four columns: “id”, “name”, “email”, and “city”. We set the “id” column as the primary key. We also create another table called “orders” with five columns: “id”, “customer_id”, “product_name”, “quantity”, and “price”. The “customer_id” column in the “orders” table is a foreign key that references the “id” column in the “customers” table.
Next, we insert data into both tables using the INSERT INTO statement.
Finally, we make an INNER JOIN between the “customers” and “orders” tables using the SELECT statement. We select the “name” and “city” columns from the “customers” table and the “product_name”, “quantity”, and “price” columns from the “orders” table. We use the ON keyword to specify the join condition, which is that the “id” column in the “customers” table must match the “customer_id” column in the “orders” table. This will return a result set that combines the data from both tables based on the join condition.
Left Join
LEFT JOIN between the “customers” and “orders” tables in the example I provided earlier. Here’s how to modify the SELECT statement to use a LEFT JOIN instead of an INNER JOIN:
SELECT customers.name, customers.city, orders.product_name, orders.quantity, orders.price
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id;
In this modified query, we use the LEFT JOIN keyword instead of INNER JOIN. This will return all rows from the “customers” table, regardless of whether they have matching rows in the “orders” table. If a row in the “customers” table doesn’t have a matching row in the “orders” table, the values for the “product_name”, “quantity”, and “price” columns in the result set will be NULL.
Note that in this example, we’re joining the “customers” table on the left and the “orders” table on the right, because we want to include all rows from the “customers” table in the result set. If we had swapped the order of the tables in the FROM clause and used a RIGHT JOIN instead of LEFT JOIN, we would include all rows from the “orders” table in the result set instead.
Right Join in SQL
This is a SQL query that performs a RIGHT JOIN between two tables customers and orders. The result set includes the name and city columns from the customers table, and the product_name, quantity, and price columns from the orders table.
Here is the explanation of the query:
SELECT customers.name, customers.city, orders.product_name, orders.quantity, orders.price
columns from them. The query uses a RIGHT JOIN to ensure that all rows from the “orders” table are included in the result, even if there is no corresponding row in the “customers” table.
The columns selected from the “customers” table are “name” and “city”, while the columns selected from the “orders” table are “product_name”, “quantity”, and “price”.
The query can be explained using the following steps:
FROM customers RIGHT JOIN orders: This specifies the two tables to be joined and the type of join to be used. In this case, a RIGHT JOIN is used, meaning that all rows from the “orders” table will be included in the result, and only matching rows from the “customers” table will be included.
ON customers.id = orders.customer_id: This specifies the join condition, which determines how the rows from the two tables are matched. In this case, the join is based on the “id” column in the “customers” table and the “customer_id” column in the “orders” table.
SELECT customers.name, customers.city, orders.product_name, orders.quantity, orders.price: This selects the specific columns to be included in the result. The columns from the “customers” table are “name” and “city”, while the columns from the “orders” table are “product_name”, “quantity”, and “price”.
Overall, this query will return a result set that includes all orders, along with the name and city of the customer who placed each order (if available). If there is no corresponding customer for an order, the name and city columns will be NULL
Full Join in SQl
As mentioned earlier, SQL does not have a FULL JOIN syntax. You can achieve the same result using a combination of LEFT JOIN, RIGHT JOIN and UNION. Here is an example of a FULL JOIN between the customers and orders tables using a UNION clause:
SELECT c.id, c.name, c.email, c.city, o.product_name, o.quantity, o.price
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
UNION
SELECT c.id, c.name, c.email, c.city, o.product_name, o.quantity, o.price
FROM customers c
RIGHT JOIN orders o ON c.id = o.customer_id
WHERE c.id IS NULL;
This query uses a LEFT JOIN to select all customers and their orders. It then uses a UNION clause to combine this result set with the result set of a RIGHT JOIN that selects all orders and their associated customers. Finally, a WHERE clause filters out any rows where the customer ID is null, which indicates that the customer does not exist.
Note that in this example, the id column of the customers table is used as the primary key, while the customer_id column of the orders table is used as a foreign key to reference the id column of the customers
Joining more than two tables
JOINing more than two tables is a common task in SQL when dealing with complex database schemas. Here’s an example of how to JOIN three tables using the JOIN keyword:
Suppose we have three tables: “orders”, “customers”, and “products”. The “orders” table contains information about each order, including the customer ID and the product ID. The “customers” table contains information about each customer, including the customer ID and name. The “products” table contains information about each product, including the product ID and name.
here’s an example of how to create the “orders”, “customers”, and “products” tables, insert data into them, and JOIN them together:
First, let’s create the “customers” table:
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(50) NOT NULL,
customer_email VARCHAR(50) NOT NULL
);
Next, let’s insert some data into the “customers” table:
INSERT INTO customers (customer_id, customer_name, customer_email)
VALUES
(1, ‘Alice’, ‘alice@example.com’),
(2, ‘Bob’, ‘bob@example.com’),
(3, ‘Charlie’, ‘charlie@example.com’);
Now let’s create the “products” table
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(50) NOT NULL,
product_price DECIMAL(10,2) NOT NULL
);
And let’s insert some data into the “products” table:
INSERT INTO products (product_id, product_name, product_price)
VALUES
(1, ‘Widget’, 19.99),
(2, ‘Gadget’, 29.99),
(3, ‘Thingamajig’, 39.99);
Finally, let’s create the “orders” table:
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT NOT NULL,
product_id INT NOT NULL,
order_date DATE NOT NULL,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
INSERT INTO orders (order_id, customer_id, product_id, order_date)
VALUES
(1, 1, 1, ‘2022-01-01’),
(2, 2, 2, ‘2022-01-02’),
(3, 3, 3, ‘2022-01-03’);
SELECT o.order_id, c.customer_name, p.product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id;
In this query, we use the JOIN keyword to JOIN the “orders” table with the “customers” table and then JOIN the result with the “products” table. We specify the ON clause for each JOIN to specify the join condition, which specifies how the rows from the two tables being joined are matched. In this case, we match the customer ID in the “orders” table with the customer ID in the “customers” table, and we match the product ID in the “orders” table with the product ID in the “products” table.
Note that the table aliases “o”, “c”, and “p” are used to qualify the column names in the SELECT clause and the join conditions in the ON clauses. These aliases are optional, but can make the query easier to read and understand.
In general, you can JOIN any number of tables in SQL by simply adding more JOIN clauses to your query. However, it’s important to keep in mind that JOINing many tables can be computationally expensive and can slow down your queries, so it’s a good practice to optimize your queries and database schema to minimize the number of JOINs needed.
Group By statement in SQL
Suppose we want to create a table to store sales data for a retail store. We can create a sales table with the following schema.
CREATE TABLE sales (
id INT PRIMARY KEY,
date DATE,
customer_name VARCHAR(50),
product_name VARCHAR(50),
price DECIMAL(10, 2),
quantity INT
);
Insert Data into Table
INSERT INTO sales (id, date, customer_name, product_name, price, quantity)
VALUES
(1, ‘2022-01-01’, ‘Alice’, ‘Shoes’, 50.00, 2),
(2, ‘2022-01-01’, ‘Bob’, ‘Shoes’, 50.00, 1),
(3, ‘2022-01-02’, ‘Alice’, ‘Shirt’, 20.00, 3),
(4, ‘2022-01-02’, ‘Bob’, ‘Shirt’, 20.00, 2),
(5, ‘2022-01-02’, ‘Charlie’, ‘Pants’, 30.00, 1),
(6, ‘2022-01-03’, ‘David’, ‘Pants’, 30.00, 2),
(7, ‘2022-01-03’, ‘David’, ‘Shoes’, 50.00, 1);
let’s use the GROUP BY statement to calculate the total revenue for each product:
SELECT product_name, SUM(price * quantity) AS total_revenue
FROM sales
GROUP BY product_name;
Aggregate in SQL
In SQL, the aggregate functions are used to perform calculations on a set of values and return a single value as a result. The most commonly used aggregate functions are:
COUNT: Returns the number of rows in a table that satisfy the specified condition.
SUM: Returns the sum of the values in a specified column.
AVG: Returns the average of the values in a specified column.
MAX: Returns the maximum value in a specified column.
MIN: Returns the minimum value in a specified column.
To use an aggregate function in SQL, you typically need to specify the column(s) you want to aggregate and use the appropriate function. Here is an example of using the COUNT function to count the number of records in a table:
CREATE TABLE student (
id INT PRIMARY KEY,
name VARCHAR(50),
age INT,
score INT
);
INSERT INTO student (id, name, age, score) VALUES
(1, ‘Alice’, 20, 80),
(2, ‘Bob’, 21, 75),
(3, ‘Charlie’, 19, 90),
(4, ‘David’, 22, 85),
(5, ‘Emily’, 20, 95);
SELECT COUNT(*) AS total_students FROM student;
This will return the total number of students in the “student” table.
SELECT AVG(age) AS avg_age FROM student;
This will return the average age of all the students in the “student” table.
SELECT MAX(score) AS highest_score FROM student;
This will return the highest score achieved by any student in the “student” table
SELECT MIN(score) AS lowest_score FROM student WHERE age > 20;
This will return the lowest score achieved by any student who is over the age of 20.
SELECT SUM(score) AS total_score FROM student;
This will return the total score achieved by all students in the “student” table.
SELECT COUNT(*) AS num_students_above_avg_age
FROM student
WHERE age > (SELECT AVG(age) FROM student);
This will return the number of students in the “student” table who are older than the average age of all the students.
SELECT name, MAX(score) AS highest_score
FROM student
GROUP BY name;
SELECT AVG(score) AS top_5_avg_score
FROM (
SELECT score
FROM student
ORDER BY score DESC
LIMIT 5
) AS top_5_students;
This query first selects the top 5 students based on their score, by ordering the rows in descending order of the “score” column and limiting the result to the first 5 rows. It then calculates the average score of these 5 students using the AVG function.