SQL handbook
Content
- Common SQL query pattern or structure
- List of some common SQL functions
- Some advanced topics
Common SQL query pattern or structure
SELECT [column1, column2, ...] -- List of columns to be returned in the result set
FROM [table1] -- Name of the table(s) to retrieve data from
[JOIN [table2] ON condition] -- Join condition(s) to join the table(s)
WHERE [condition] -- Filter condition(s) to filter the rows
GROUP BY [column1, column2, ...] -- Grouping condition(s) to group rows
HAVING [condition] -- Filtering condition(s) for grouped rows
ORDER BY [column1, column2, ...] -- Sorting condition(s) for the result set
LIMIT [number] -- Limiting the number of rows returned in the result set
Not all of these clauses are necessary for every query, and the specific syntax may vary depending on the SQL dialect you are using.
Here’s an example query that follows this pattern:
SELECT customer_name, SUM(order_total) as total_spent
FROM customers
JOIN orders ON customers.customer_id = orders.customer_id
WHERE order_date >= '2022-01-01'
GROUP BY customer_name
HAVING SUM(order_total) > 1000
ORDER BY total_spent DESC
LIMIT 10;
This query retrieves the name of customers who have spent more than $1000 since January 1, 2022, joins the customers and orders tables based on their IDs, groups the results by customer name, filters the results by the sum of order total, sorts the results by total spent in descending order, and limits the results to the top 10.
Explanation for this simple query
SELECT customer_name, SUM(order_total) as total_spent
: This clause specifies the columns to be returned in the result set. We want to retrieve thecustomer_name
and the sum of theorder_total
for each customer.FROM customers
: This clause specifies the table to retrieve data from, which is thecustomers
table in this case.JOIN orders ON customers.customer_id = orders.customer_id
: This clause specifies the join condition to join thecustomers
andorders
tables. The join is based on thecustomer_id
column that exists in both tables.WHERE order_date >= '2022-01-01'
: This clause specifies the filter condition to filter the rows from theorders
table. We only want to retrieve the rows that have anorder_date
on or after January 1, 2022.GROUP BY customer_name
: This clause specifies the grouping condition to group the rows bycustomer_name
. This allows us to apply aggregate functions to theorder_total
column for each group.HAVING SUM(order_total) > 1000
: This clause specifies the filter condition for grouped rows. We only want to include the groups that have a sum oforder_total
greater than $1000.ORDER BY total_spent DESC
: This clause specifies the sorting condition for the result set. We want to sort the result set by thetotal_spent
column in descending order.LIMIT 10
: This clause specifies the limit condition to limit the number of rows returned in the result set. We only want to retrieve the top 10 rows.
explanatory structure for SQL
2. List of some common SQL functions
- Numeric functions: ABS, CEIL, FLOOR, ROUND, TRUNC, EXP, LN, LOG, MOD, POWER, SIGN, SQRT
- String functions: CONCAT, SUBSTRING, LENGTH, CHARINDEX, LOWER, UPPER, LTRIM, RTRIM, REPLACE, LEFT, RIGHT, TRIM, REPLICATE, SPACE.
- Date and time functions: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, DATEADD, DATEDIFF, DATEPART, DAY, MONTH, YEAR, HOUR, MINUTE, SECOND, GETDATE.
- Aggregate functions: AVG, COUNT, MAX, MIN, SUM.
- Conditional functions: CASE, COALESCE, IFNULL, NULLIF.
- Conversion functions: CAST, CONVERT, TO_CHAR, TO_DATE, TO_NUMBER.
- Mathematical functions: ACOS, ASIN, ATAN, COS, SIN, TAN.
- Ranking functions: RANK, DENSE_RANK, ROW_NUMBER.
- Analytical functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTILE, PERCENT_RANK.
A sort explanation for those function —
- Numeric functions:
ABS: returns the absolute value of a number.
CEIL: returns the smallest integer that is greater than or equal to a number.
FLOOR: returns the largest integer that is less than or equal to a number.
ROUND: rounds a number to the nearest integer.
TRUNCATE: truncates a number to a specified number of decimal places.
EXP: returns the exponential value of a number.
LN: returns the natural logarithm of a number.
LOG: returns the logarithm of a number to a specified base.
MOD: returns the remainder of a division operation.
POWER: returns the value of a number raised to a specified power.
SIGN: returns the sign of a number (-1 for negative, 0 for zero, 1 for positive).
SQRT: returns the square root of a number.
2. String functions:
CONCAT: concatenates two or more strings.
SUBSTRING: returns a substring of a string.
LENGTH: returns the length of a string.
CHARINDEX: returns the position of a substring in a string.
LOWER: returns a string with all uppercase characters converted to lowercase.
UPPER: returns a string with all lowercase characters converted to uppercase.
TRIM: removes leading and trailing spaces from a string.
REPLACE: replaces all occurrences of a substring in a string with another substring.
LEFT: returns the leftmost characters of a string.
RIGHT: returns the rightmost characters of a string.
REPEAT: repeats a string a specified number of times.
LPAD: pads a string with a specified character on the left.
RPAD: pads a string with a specified character on the right.
ASCII: returns the ASCII value of the first character of a string.
CONCAT_WS: concatenates strings with a specified separator.
FIELD: returns the index of a value in a list of values.
FIND_IN_SET: returns the position of a value in a comma-separated list of values.
FORMAT: formats a number with a specified number of decimals and thousands separator.
HEX: converts a number to its hexadecimal representation.
REVERSE: reverses a string.
SOUNDEX: returns a phonetic representation of a string.
SPACE: returns a string with a specified number of spaces.
STRCMP: compares two strings.
SUBSTR: returns a substring of a string.
UCASE: returns a string with all lowercase characters converted to uppercase.
3. Date and time functions:
CURDATE: returns the current date.
CURTIME: returns the current time.
CURRENT_DATE: returns the current date.
CURRENT_TIME: returns the current time.
CURRENT_TIMESTAMP: returns the current date and time.
DATE_ADD: adds a specified interval to a date.
DATE_SUB: subtracts a specified interval from a date.
DATE_FORMAT: formats a date as a string.
DATEDIFF: returns the number of days between two dates.
DAY: returns the day of the month for a date.
MONTH: returns the month for a date.
YEAR: returns the year for a date.
HOUR: returns the hour for a time or datetime value.
MINUTE: returns the minute for a time or datetime value.
SECOND: returns the second for a time or datetime value.
NOW: returns the current date and time.
TIME
4. Aggregate functions: These functions perform calculations on a set of values and return a single result. Some commonly used aggregate functions in MySQL include:
AVG: returns the average value of a set of values.
COUNT: returns the number of values in a set of values.
MAX: returns the maximum value in a set of values.
MIN: returns the minimum value in a set of values.
SUM: returns the sum of values in a set of values.
GROUP_CONCAT: concatenates values in a set of values into a single string.
5. Conditional functions: These functions perform different calculations based on specified conditions. Some commonly used conditional functions in MySQL include:
CASE: performs conditional operations.
IF: returns one value if a condition is true and another value if the condition is false.
NULLIF: returns NULL if two values are equal.
6. Conversion functions: These functions convert data from one type to another. Some commonly used conversion functions in MySQL include:
CAST: converts a value from one data type to another.
CONVERT: converts a value from one character set to another.
7. Mathematical functions: These functions perform mathematical calculations. Some commonly used mathematical functions in MySQL include:
ABS: returns the absolute value of a number.
CEILING: returns the smallest integer that is greater than or equal to a number.
FLOOR: returns the largest integer that is less than or equal to a number.
POWER: returns the value of a number raised to a specified power.
ROUND: rounds a number to the nearest integer.
TRUNCATE: truncates a number to a specified number of decimal places.
8. Ranking functions: These functions assign a rank to each row based on specified criteria. Some commonly used ranking functions in MySQL include:
RANK: assigns a rank to each row based on the value of a specified expression.
DENSE_RANK: assigns a rank to each row based on the value of a specified expression, with no gaps in the ranking.
9. Analytical functions: These functions perform calculations on a set of rows that are related to the current row. Some commonly used analytical functions in MySQL include:
LEAD: returns the value of a specified expression for the next row in the result set.
LAG: returns the value of a specified expression for the previous row in the result set.
OVER: specifies a window of rows for use with analytical functions.
3. Some advanced topics
Advanced SQL topics build on the basic concepts of SQL and provide more powerful tools for working with data. Here are some common advanced SQL topics:
- Subqueries: A subquery is a query within another query. Subqueries can be used to retrieve data that will be used in the main query or to filter data based on results of a subquery. Subqueries can be written in various ways, such as with the IN, EXISTS, or ANY/ALL operators. Example: Find all employees whose salary is greater than the average salary of all employees in their department.
SELECT *
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id = employees.department_id)
- Window Functions: Window functions are a set of SQL functions that allow you to perform calculations across a set of rows related to the current row. Window functions operate on a “window” of rows, which can be defined using the OVER clause. Examples of window functions include RANK, DENSE_RANK, ROW_NUMBER, and NTILE. Example: Rank employees by salary within each department.
SELECT employee_name, department_name, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees
- Common Table Expressions (CTEs): CTEs are a way to create a temporary result set that can be used within a SQL query. A CTE is defined using a WITH clause, and can be used to simplify complex queries by breaking them down into smaller, more manageable parts. Example: Find the top 5 departments with the highest average salary.
WITH department_salaries AS (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
)
SELECT department_name, avg_salary
FROM department_salaries
ORDER BY avg_salary DESC
LIMIT 5
- Transactions: Transactions allow you to group a series of SQL statements together into a single unit of work. Transactions can be used to ensure that a series of changes to a database are all completed successfully, or to roll back changes if any of them fail. Transactions are typically used in multi-user environments or when modifying data in critical systems. Example: Transfer funds from one account to another within a transaction.
BEGIN;
UPDATE accounts SET balance = balance - 500 WHERE account_number = 123;
UPDATE accounts SET balance = balance + 500 WHERE account_number = 456;
COMMIT;
- Indexing: Indexing is a way to optimize SQL queries by creating special data structures that allow the database to retrieve data more quickly. Indexes can be created on one or more columns of a table, and can improve query performance by reducing the amount of data that needs to be searched. Example: Create an index on the “last_name” column of the “employees” table.
CREATE INDEX idx_last_name ON employees (last_name);
- Stored Procedures and Triggers: Stored procedures are sets of SQL statements that can be executed with a single command. Stored procedures can be used to automate common tasks or to enforce business rules within a database. Triggers are SQL code that is executed automatically in response to specific database events, such as a row being inserted, updated, or deleted. Example: Create a stored procedure to update an employee’s salary.
CREATE PROCEDURE update_employee_salary (IN employee_id INT, IN new_salary DECIMAL(10,2))
BEGIN
UPDATE employees SET salary = new_salary WHERE id = employee_id;
END;
Example: Create a trigger to update a department’s budget when an employee is added.
CREATE TRIGGER update_department_budget
AFTER INSERT ON employees
FOR EACH ROW
BEGIN
UPDATE departments SET budget = budget + 5000 WHERE id = NEW.department_id;
END;
- Materialized Views: Materialized views are similar to SQL views, but they store the data as a physical table in the database. Materialized views can be used to speed up queries on large datasets or for running reports that require complex calculations. Materialized views can be refreshed on a schedule or on demand, depending on the needs of the application. Example: Create a materialized view of the total sales by region and product.
CREATE MATERIALIZED VIEW sales_summary AS
SELECT region, product, SUM(sales) AS total_sales
FROM sales
GROUP BY region, product;
Example: Refresh the materialized view every night at midnight.
REFRESH MATERIALIZED VIEW sales_summary WITH DATA;