TOP 10 SQL COMMAND FOR MASTERING DATA ANALYTICS

26May, 2023

TOP 10 SQL COMMAND FOR MASTERING DATA ANALYTICS

This blog shows the importance of data analytics and the prospect of improving analytical abilities by using the provided SQL commands.

The core of data analytics is SQL, which enables experts to mine huge amounts of data for insightful information. The top 10 SQL commands for data analytics provide a potent toolkit that gives analysts the ability to precisely manipulate, filter, sort, aggregate, and compute data. While WHERE filters data based on specific criteria, SELECT permits the retrieval of specific columns or full datasets. Data can be sorted in ascending or descending order with the help of ORDER BY. Data can be grouped for aggregations using GROUP BY, and data from several tables can be combined using JOIN.

DISTINCT finds singular values, COUNT calculates the number of rows or non-null values, and HAVING filters grouped data. The highest and lowest values are extracted, respectively, by MAX and MIN. Data analysts may maximize the value of their datasets, find significant trends, make wise decisions, and promote company success by understanding these top 10 SQL methods.

Table of Contents

SELECT:

Data from a database table is retrieved using the SELECT query. You can choose which columns to retrieve or pick all columns by using the asterisk (*). SELECT * FROM sales in the example fetches every column and row from the “sales” table.

WHERE

Rows can be filtered using the WHERE clause based to certain criteria. You can define a requirement that must be satisfied in order for the rows to be included in the result set. In the given example, the statement SELECT * FROM sales WHERE category = ‘Electronics’ only returns rows with the word ‘Electronics’ in the “category” column.

ORDER BY:

The result set is sorted using the ORDER BY clause according to one or more columns, either ascending or descending. It always sorts in ascending order by default. In the above example, the statement SELECT * FROM sales ORDER BY price DESC arranges the result set according to the “price” column in decreasing order.

GROUP BY

Rows can be grouped based on one or more columns using the GROUP BY clause. Calculations on each group usually involve using it in combination with aggregate functions. In the given example, grouping the rows by the “category” column and computing the total quantity for each category is done by using the statements “SELECT category, SUM(quantity) as total_quantity FROM sales GROUP BY category.”

JOIN

In order to merge rows from two or more tables based on a shared column, use the JOIN clause. You can use it to get information from numerous tables with only one query. The statement JOIN categories c ON s.category = c.category_id joins the tables “sales” and “categories” based on the “category” and “category_id” columns in “sales” and “categories,” respectively.

HAVING

Groups produced by the GROUP BY clause are filtered using the HAVING clause. You can define a prerequisite that a group must meet in order to be included in the result set. According to the example, the groups are filtered such that only those with an average price higher than 100 are included.

DISTINCT:

To get distinct values from a column, use the DISTINCT keyword. Duplicate values are removed from the result set. In the given example, the statement SELECT DISTINCT category FROM sales finds unique values from the column labeled “category”.

COUNT

Count the number of rows or non-null values in a column using the COUNT function. To determine counts per group, it is frequently used in conjunction with the GROUP BY clause. In the given example, the number of rows for each category is counted using the SELECT category, COUNT(*) as count FROM sales GROUP BY category.

MAX

The MAX function is used to get a column’s highest value. It is frequently used with date-and-time or numerical columns. In the given example, selecting the maximum price from the “price” column using the command SELECT MAX(price) as max_price FROM sales.

MIN:

The MIN function is used to get a column’s lowest value. It is frequently used with date-and-time or numerical columns. The example finds the lowest price from the “price” column by using the statement SELECT MIN(price) as min_price FROM sales.

Create a table and insert data

-- Create the table
CREATE TABLE Books (
    BookID INT PRIMARY KEY,
    Title VARCHAR(255),
    Author VARCHAR(255),
    PublicationYear INT,
    Genre VARCHAR(50)
);

-- Insert data into the table
INSERT INTO Books (BookID, Title, Author, PublicationYear, Genre)
VALUES
    (1, 'Example Book 1', 'Author A', 2020, 'Fiction'),
    (2, 'Example Book 2', 'Author B', 2018, 'Mystery'),
    (3, 'Example Book 3', 'Author A', 2022, 'Science Fiction'),
    -- ... (more data)
    (10, 'Example Book 10', 'Author C', 2015, 'Thriller');

ORDER BY

SELECT * FROM Books ORDER BY PublicationYear DESC LIMIT 10;

COUNT:

SELECT Genre, COUNT(*) AS GenreCount FROM Books GROUP BY Genre;

SUM

SELECT Author, SUM(PublicationYear) AS TotalPublicationYears FROM Books GROUP BY Author;

AVG:

SELECT AVG(PublicationYear) AS AvgPublicationYear FROM Books;

Min and Max

SELECT MAX(PublicationYear) AS MaxPublicationYear, MIN(PublicationYear) AS MinPublicationYear FROM Books;

Group by

SELECT Genre, AVG(PublicationYear) AS AvgPublicationYear FROM Books GROUP BY Genre;

DISTINCT

SELECT DISTINCT Author FROM Books;

LIKE

SELECT * FROM Books WHERE Title LIKE '%Example%';

Having

SELECT Genre, AVG(PublicationYear) AS AvgPublicationYear FROM Books GROUP BY Genre HAVING AVG(PublicationYear) > 2000;