Categories of SQL command to know for Data Analysis
Table of Contents
ToggleSQL Command
Relational databases can be managed and manipulated using the programming language SQL (Structured Query Language). Here are some crucial SQL command categories to be aware of for data analysis:
Data Definition Language (DDL)
what are the DDL commands?
DDL modifies the structure of the table by adding, removing, or changing tables, among other things.
All DDL commands are automatically committed, which permanently saves all database modifications.
The following commands are included in DDL:
Create
This command builds a new table and has a predefined syntax. The CREATE statement syntax is:
CREATE TABLE [table name] ([column definitions]) [table parameters];
For example:
CREATE TABLE employees ( id INT PRIMARY KEY, name VARCHAR(100), salary DECIMAL(10, 2) );
CREATE TABLE employees:
This part of the statement indicates that you are creating a new table named “employees.”
(id INT PRIMARY KEY,):
This line defines the first column of the table, named “id.”
INT specifies that it is an integer data type, meaning it will store whole numbers.
PRIMARY KEY indicates that the “id” column will be the primary key for the table. A primary key is a unique identifier for each record in the table.
(name VARCHAR(100),):
This line defines the second column of the table, named “name.”
VARCHAR(100) specifies that it is a variable character field, capable of storing up to 100 characters. This column is typically used for storing names or textual information.
(salary DECIMAL(10, 2)):
This line defines the third column of the table, named “salary.”
DECIMAL(10, 2) specifies a decimal data type with a total of 10 digits, where 2 digits are reserved for the decimal part. This is commonly used for storing monetary values.
In summary, the SQL statement is creating a table named “employees” with three columns: “id” for unique identification, “name” for storing names, and “salary” for storing decimal values such as salaries. The “id” column is designated as the primary key, ensuring each record in the table has a unique identifier.
Alter
An alter command modifies an existing database table. This command can add up additional column, drop existing columns and even change the data type of columns involved in a database table.
An alter command syntax is:
ALTER object type object name parameters;
For example:
ALTER TABLE employees ADD department VARCHAR(50);
In this example, we added a unique primary key to the table to add a constraint and enforce a unique value. The constraint “employee_pk” is a primary key and is on the Employee table.
Drop
A drop command is used to delete objects such as a table, index or view. A DROP statement cannot be rolled back, so once an object is destroyed, there’s no way to recover it.
Drop statement syntax is:
DROP object type object name;
For example:
DROP TABLE employees;
In this example, we’re deleting the Employee table.
Truncate
Similar to DROP, the TRUNCATE statement is used to quickly remove all records from a table. However, unlike DROP that completely destroys a table, TRUNCATE preserves its full structure to be reused later.Truncate statement syntax is:
TRUNCATE TABLE table_name;
For example:
TRUNCATE TABLE employees;
COMMENT:
This statement allows you to add comments to a table or column, which can be useful for documentation purposes.
COMMENT ON COLUMN employees.salary IS 'The salary of the employee';
DQL Command
CREATE TABLE employees ( id INT PRIMARY KEY, name VARCHAR(100), salary DECIMAL(10, 2) );
Insert Data
INSERT INTO employees (id, name, salary) VALUES (1, 'John Doe', 50000.00), (2, 'Jane Smith', 60000.00), (3, 'Michael Johnson', 55000.00), (4, 'Emily Williams', 62000.00), (5, 'Robert Brown', 48000.00), (6, 'Lisa Davis', 53000.00), (7, 'William Wilson', 58000.00), (8, 'Linda Taylor', 51000.00), (9, 'James Anderson', 54000.00), (10, 'Mary Martinez', 57000.00);This command will inser data
Step 3: Retrieve All Data
SELECT * FROM employees;This query retrieves all rows and columns from the “employees” table.
Step 4: Retrieve Specific Columns
SELECT name, salary FROM employees;This query retrieves only the “name” and “salary” columns from the “employees” table.
Step 5: Filter Data
SELECT * FROM employees WHERE salary > 55000;
DML COMMAND
- INSERT Command
- UPDATE Command
- DELETE Command
UPDATE: Modify existing data.
-- Update John Doe's salary to $55000 UPDATE employees SET salary = 55000.00 WHERE name = 'John Doe';
DELETE: Remove rows from the table.
-- Delete the employee with ID 5 DELETE FROM employees WHERE id = 5;
INSERT INTO: Add new rows to the table.
-- Insert a new employee INSERT INTO employees (id, name, salary) VALUES (11, 'Alex Johnson', 59000.00);
UPDATE Multiple Rows: You can update multiple rows at once.
-- Increase salary by 10% for employees with salary less than 55000 UPDATE employees SET salary = salary * 1.10 WHERE salary < 55000;
DELETE with Condition: Delete rows based on a condition.
-- Delete employees with a salary less than 50000 DELETE FROM employees WHERE salary < 50000;
INSERT Multiple Rows: Insert multiple rows at once.
-- Insert multiple employees INSERT INTO employees (id, name, salary) VALUES (12, 'Sarah Brown', 58000.00), (13, 'Thomas Miller', 53000.00);
dml and ddl commands
Feature | DML | DDL |
Purpose | Manipulates data within existing tables | Defines the structure of the database |
Syntax | Primarily uses INSERT, UPDATE, DELETE, and SELECT | Primarily uses CREATE, ALTER, and DROP |
Impact | Affects data within tables | Affects the overall database structure |
Examples | INSERT INTO customers (customer_id, name) VALUES (1, ‘John Doe’); | CREATE TABLE customers (customer_id INT, name VARCHAR(50)); |
Data Control Language (DCL)
GRANT
-- Grant SELECT privilege on the employees table to a user or role GRANT SELECT ON employees TO username_or_role;
REVOKE: Remove privileges from users or roles.
-- Revoke SELECT privilege on the employees table from a user or role REVOKE SELECT ON employees FROM username_or_role;DCL commands are used to manage permissions, access control, and security within the database. Keep in mind that the specific syntax and implementation might vary depending on the database system you are using.
Transaction Control Language (TCL)
Commit
This command is used to permanently save changes made in the current transaction to the database. Once committed, the changes become permanent and visible to other transactions.COMMIT;
ROLLBACK:
This command is used to undo changes made in the current transaction. It reverts the database to the state it was in before the transaction began.ROLLBACK;
SAVEPOINT:
A savepoint marks a specific point within a transaction to which you can later roll back. It allows you to selectively undo parts of a transaction while leaving other parts intact.SAVEPOINT sp1;
ROLLBACK TO SAVEPOINT:
This command undoes changes made after a specific savepoint within a transaction.RELEASE SAVEPOINT sp1;TCL commands are crucial for maintaining data integrity, managing concurrent transactions, and ensuring that changes to the database are controlled and consistent. By using these commands effectively, you can handle transactions in a way that aligns with your application’s requirements and maintains the reliability of your data.
Difference Between Category of command SQL
Category | Purpose | Examples |
Data Definition Language (DDL) | Defines the database structure. | CREATE TABLE, ALTER TABLE, DROP TABLE, TRUNCATE TABLE |
Data Manipulation Language (DML) | Manipulates data within tables. | INSERT, UPDATE, DELETE, SELECT |
Data Control Language (DCL) | Controls access to the database. | GRANT, REVOKE |
Transaction Control Language (TCL) | Manages transactions. | COMMIT, ROLLBACK, SAVEPOINT |
Answer: The SQL SELECT statement is used to retrieve data from one or more tables. It allows you to specify which columns to retrieve and apply filtering conditions using the WHERE clause.
Answer: You can filter rows from a table using the WHERE clause in conjunction with the SELECT statement. The WHERE clause allows you to specify conditions that the rows must meet to be included in the result set.
Answer: The GROUP BY clause is used to group rows with similar values in specified columns. It’s often used in combination with aggregate functions to perform calculations on grouped data.
Answer: The INNER JOIN returns only the matching rows from both tables based on the specified condition. On the other hand, the LEFT JOIN returns all the rows from the left table and the matching rows from the right table. If there’s no match, NULL values are returned for the right table’s columns.
Answer: You can calculate the average of a specific column using the AVG aggregate function in combination with the SELECT statement and optionally the GROUP BY clause.
Answer: The SQL INSERT statement is used to add new rows into a table. It allows you to specify the values for each column in the new row.
Answer: A primary key is a column or a set of columns that uniquely identify each row in a table. It ensures data integrity by preventing duplicate or null values in the key columns.
Answer: The ORDER BY clause is used to sort the result set of a query based on one or more columns. You can specify the sorting order (ascending or descending) for each column.
The SQL ALTER TABLE statement is used to modify the structure of an existing table, such as adding or dropping columns, changing data types, or adding constraints.
A subquery is a nested query that can be used within another query. It can be employed to retrieve data that will be used for filtering, calculations, or comparisons in the outer query. Subqueries are enclosed in parentheses and can appear in various parts of a SQL statement, such as the WHERE clause or the FROM clause.