intro
Let’s dig into PostgreSQL common table expressions, learning what they are, their pros and cons, and how they can help you write more readable queries.
To extract the data of interest from a PostgreSQL database, you may need to run several queries. This means executing them one at a time at the application level, resulting in high overhead. SQL subqueries can help, but they have some major limitations in terms of reusability. This is where a WITH
query with the right CTEs can save you!
In this article, you will look at the definition of a PostgreSQL CTE, how to use WITH
queries, and some examples of CTEs.
What Is a PostgreSQL CTE?
A PostgreSQL CTE (Common Table Expression) is a temporary result set that can be referenced within another SQL query. It allows users to create named subqueries that can be used as tables within SELECT
, INSERT
, UPDATE
, or DELETE
queries. In other words, it is a mechanism to provide more flexibility and readability to complex queries.
How to Write a Common Table Expression in PostgreSQL
In PostgreSQL, a CTE is defined through the WITH clause. Each auxiliary CTE statement consists of two elements:
Here is an example of the syntax to define a PostgreSQL common table expression:
1
WITH cte_name AS (
2
SELECT column1, column2, ..., columnN
3
FROM table_name
4
WHERE cte_condition
5
)
6
SELECT *
7
FROM cte_name
8
WHERE query_condition;
cte_name
is the name given to the CTE, which is defined by the query inside the parentheses. The subsequent SELECT
statement uses the CTE as if it was a table, allowing further operations to be executed on that data. This is possible because the result set produced by the CTE can be referenced in the main query by its name as any other table or view.
The same WITH
clause can involve several CTEs, each with its name and specification query:
1
WITH cte1 AS (
2
SELECT cte1_column1, cte1_column2, ..., cte1_columnN
3
FROM table1
4
WHERE condition1
5
),
6
...
7
WITH cteN AS (
8
SELECT cteN_column1, cteN_column2, ..., cte1_columnM
9
FROM tableN
10
WHERE conditionN
11
)
12
SELECT *
13
FROM tableX
14
WHERE condition_about_ct1, ..., condition_about_cteN
The external SELECT
can use data from any CTE result set in its clauses. For example, it can employ such data in WHERE
conditions to apply special filters.
Note that the queries above are just examples. Each CTE can be a SELECT
, INSERT
, UPDATE
, or DELETE
. Similarly, the main query itself can be a SELECT
, INSERT
, UPDATE
, or DELETE
.
Pros and Cons of WITH Queries
Let’s take a look at the most important benefits and drawbacks of WITH queries in PostgreSQL.
Pros
Cons
PostgreSQL CTE: Examples
Let's explore three SQL examples of common table expressions in PostgreSQL to understand how they can come in handy in real-world scenarios.
Example 1: Getting an Organization’s Hierarchy
Assume your employees
table has the following columns: id
, complete_name
, and manager_id
. You want to retrieve the organizational hierarchy under a specific employee.
Here is how you can get that data using a recursive CTE:
1
WITH RECURSIVE employee_hierarchy AS (
2
SELECT id, complete_name, manager_id
3
FROM employees
4
WHERE id = 1
5
UNION ALL
6
SELECT E.id, E.complete_name, E.manager_id
7
FROM employees E
8
JOIN employee_hierarchy EH ON EH.id = E.manager_id
9
)
10
SELECT *
11
FROM employee_hierarchy;
In this example, the CTE employee_hierarchy
selects the first employee based on their ID. Then it recursively joins the partial result set with the employees table to retrieve the employees
managed by the current manager, continuing traversing up the hierarchy until the entire sub-hierarchy is fetched. This is a common query pattern to traverse a tree in PostgreSQL.
Note that you could not achieve this result with a simple SELECT
. This is because RECURSIVE
makes the query repeatedly execute and combine the result of the CTE. In detail, the query will loop over the CTE, applying the query logic again and again until there are no more elements to iterate over. You cannot do that with a SELECT
.
Example 2: Getting the Department with the Highest Average Salary
Assume you have an employees table with columns like id
, department_id
, and salary
. You want to find out what department has the highest average salary.
You can obtain that info with a WITH
clause as below:
1
WITH department_avg_salary AS (
2
SELECT department_id, AVG(salary) AS avg_salary
3
FROM employees
4
GROUP BY department_id
5
)
6
SELECT D.name, DAS.avg_salary
7
FROM department_avg_salary DAS
8
JOIN departments D ON D.id = DAS.department_id
9
ORDER BY DAS.avg_salary
10
LIMIT 1;
In this example, the CTE department_avg_salary
calculates the average salary for each department. The main query then displays the department that earns the most.
In this case, you could get that info also with a complex SELECT
query.
Example 3: Retrieving per-product sales in top regions
Now you want to get sales totals by product only in the most important sales regions. Use a WITH
clause involving two auxiliary sub-queries as follows:
1
2
3
WITH regional_sales AS (
4
SELECT region_id, SUM(price_amount) AS total_sales
5
FROM orders
6
GROUP BY region_id
7
), top_regions AS (
8
SELECT region_id
9
FROM regional_sales
10
WHERE total_sales > (SELECT SUM(total_sales)/3 FROM regional_sales)
11
)
12
SELECT region,
13
product,
14
SUM(quantity) AS product_units,
15
SUM(price_amount) AS product_sales
16
FROM orders
17
WHERE region_id IN (SELECT region_id FROM top_regions)
18
GROUP BY region, product;
The output of the first CTE is used in the second to produce the top_regions
result set. Then the output of top_regions
gets employed in the WHERE
condition of the primary SELECT
query to get the desired data.
Conclusion
In this article, you saw that PostgreSQL's CTEs (Common Table Expressions) provide a powerful tool for breaking down complex queries, improving code organization and facilitating code reuse. In particular, you got to understand the syntax of the WITH
query and the benefits of leveraging CTEs.
The main problem with this feature is that it can lead to non-optimal, slow, resource-intensive queries. Here is where DbVisualizer comes into play! In addition to the most common features of a database client and support for dozens of DBMSs, this tool offers advanced query optimization capabilities that will help you take your PostgreSQL CTEs to the next level. Download DbVisualizer for free now!
FAQ
Let’s answer some interesting questions.
How does a CTE in PostgreSQL differ from a regular subquery?
They both enable you to structure complex queries. However, CTEs generate named temporary result sets that can be referenced multiple times within a query. Subqueries, on the other hand, are embedded within a larger query and cannot be referenced separately since they do not have a name.
Can multiple CTEs be nested within the same PostgreSQL query?
Yes, multiple CTEs can be nested within the same PostgreSQL query. WITH
supports the definition of several common table expressions, which can then be referenced once or multiple times in the main query.
Is it possible to update or delete data using CTEs in PostgreSQL?
Yes, WITH
queries supports UPDATE
or DELETE
statements.
Can a CTE be referenced multiple times in the same query?
Yes, a CTE can be referenced multiple times within the same PostgreSQL query. You can use it both in other CTEs and in the main query.
Are PostgreSQL CTEs only limited to SELECT statements?
No, PostgreSQL CTEs can be used with other DML (Data Manipulation Language) statements. In addition to SELECT
, they also support INSERT
, UPDATE
, and DELETE
. This makes them a flexible tool to perform complex operations on existing or new data.