intro

Let’s dig into PostgreSQL common table expressions, learning what they are, their pros and cons, and how they can help you write more readable queries.

Tools used in the tutorial

DBVISUALIZER

TOP RATED DATABASE MANAGEMENT TOOL AND SQL CLIENT

DOWNLOAD

POSTGRESQL

THE POSTGRESQL DATABASE

POSTGRESQL

To extract the data of interest from a PostgreSQL database, you may need to run several queries. This means executing them one at a time at the application level, resulting in high overhead. SQL subqueries can help, but they have some major limitations in terms of reusability. This is where a WITH query with the right CTEs can save you!

In this article, you will look at the definition of a PostgreSQL CTE, how to use WITH queries, and some examples of CTEs.

What Is a PostgreSQL CTE?

A PostgreSQL CTE (Common Table Expression) is a temporary result set that can be referenced within another SQL query. It allows users to create named subqueries that can be used as tables within SELECT, INSERT, UPDATE, or DELETE queries. In other words, it is a mechanism to provide more flexibility and readability to complex queries.

How to Write a Common Table Expression in PostgreSQL

In PostgreSQL, a CTE is defined through the WITH clause. Each auxiliary CTE statement consists of two elements:

A name
A query

Here is an example of the syntax to define a PostgreSQL common table expression:

Copy

        
        
                            
                    1
                    WITH cte_name AS (
                
                            
                    2
                        SELECT column1, column2, ..., columnN
                
                            
                    3
                        FROM table_name
                
                            
                    4
                        WHERE cte_condition
                
                            
                    5
                    )
                
                            
                    6
                    SELECT *
                
                            
                    7
                    FROM cte_name
                
                            
                    8
                    WHERE query_condition;

cte_name is the name given to the CTE, which is defined by the query inside the parentheses. The subsequent SELECT statement uses the CTE as if it was a table, allowing further operations to be executed on that data. This is possible because the result set produced by the CTE can be referenced in the main query by its name as any other table or view.

The same WITH clause can involve several CTEs, each with its name and specification query:

Copy

        
        
                            
                    1
                    WITH cte1 AS (
                
                            
                    2
                        SELECT cte1_column1, cte1_column2, ..., cte1_columnN
                
                            
                    3
                        FROM table1
                
                            
                    4
                        WHERE condition1
                
                            
                    5
                    ),
                
                            
                    6
                    ...
                
                            
                    7
                    WITH cteN AS (
                
                            
                    8
                        SELECT cteN_column1, cteN_column2, ..., cte1_columnM
                
                            
                    9
                        FROM tableN
                
                            
                    10
                        WHERE conditionN
                
                            
                    11
                    )
                
                            
                    12
                    SELECT *
                
                            
                    13
                    FROM tableX
                
                            
                    14
                    WHERE condition_about_ct1, ..., condition_about_cteN

The external SELECT can use data from any CTE result set in its clauses. For example, it can employ such data in WHERE conditions to apply special filters.

Note that the queries above are just examples. Each CTE can be a SELECT, INSERT, UPDATE, or DELETE. Similarly, the main query itself can be a SELECT, INSERT, UPDATE, or DELETE.

Pros and Cons of WITH Queries

Let’s take a look at the most important benefits and drawbacks of WITH queries in PostgreSQL.

Pros

Improved Readability: Thanks to the WITH statement, you can break down complex queries into smaller parts. This enhances query readability and makes it easier to understand and maintain SQL queries.
Enhanced Reusability: The same CTE can be reused multiple times within the same query, eliminating the need to repeat the same sub-query logic in the main query.
Recursive Capabilities: PostgreSQL supports recursive CTEs, enabling users to accomplish tasks not otherwise possible in standard SQL. That is useful for handling recursive data or operations, such as traversing tree-like structures.

Cons

Optimization issues: WITH queries are usually trickier to optimize compared to equivalent queries without CTEs. To optimize them, it is essential to carefully analyzing the execution plan.
Limited Scope: CTEs only exist within the context of the WITH clause in which they are defined. They cannot be referenced outside that query.
Memory Usage: WITH queries may consume more resources compared to alternative query structures. This is especially true when dealing with executing multiple CTEs simultaneously on large datasets.

PostgreSQL CTE: Examples

Let's explore three SQL examples of common table expressions in PostgreSQL to understand how they can come in handy in real-world scenarios.

Example 1: Getting an Organization’s Hierarchy

Assume your employees table has the following columns: id, complete_name, and manager_id. You want to retrieve the organizational hierarchy under a specific employee.

Here is how you can get that data using a recursive CTE:

Copy

        
        
                            
                    1
                    WITH RECURSIVE employee_hierarchy AS (
                
                            
                    2
                        SELECT id, complete_name, manager_id
                
                            
                    3
                        FROM employees
                
                            
                    4
                        WHERE id = 1
                
                            
                    5
                        UNION ALL
                
                            
                    6
                        SELECT E.id, E.complete_name, E.manager_id
                
                            
                    7
                        FROM employees E
                
                            
                    8
                            JOIN employee_hierarchy EH ON EH.id = E.manager_id
                
                            
                    9
                    )
                
                            
                    10
                    SELECT *
                
                            
                    11
                    FROM employee_hierarchy;

In this example, the CTE employee_hierarchy selects the first employee based on their ID. Then it recursively joins the partial result set with the employees table to retrieve the employees managed by the current manager, continuing traversing up the hierarchy until the entire sub-hierarchy is fetched. This is a common query pattern to traverse a tree in PostgreSQL.

Executing the WITH query in DbVisualizer.

Note that you could not achieve this result with a simple SELECT. This is because RECURSIVE makes the query repeatedly execute and combine the result of the CTE. In detail, the query will loop over the CTE, applying the query logic again and again until there are no more elements to iterate over. You cannot do that with a SELECT.

Example 2: Getting the Department with the Highest Average Salary

Assume you have an employees table with columns like id, department_id, and salary. You want to find out what department has the highest average salary.

You can obtain that info with a WITH clause as below:

Copy

        
        
                            
                    1
                    WITH department_avg_salary AS (
                
                            
                    2
                        SELECT department_id, AVG(salary) AS avg_salary
                
                            
                    3
                        FROM employees
                
                            
                    4
                        GROUP BY department_id
                
                            
                    5
                    )
                
                            
                    6
                    SELECT D.name, DAS.avg_salary
                
                            
                    7
                    FROM department_avg_salary DAS
                
                            
                    8
                        JOIN departments D ON D.id = DAS.department_id
                
                            
                    9
                    ORDER BY DAS.avg_salary
                
                            
                    10
                    LIMIT 1;

In this example, the CTE department_avg_salary calculates the average salary for each department. The main query then displays the department that earns the most.

The "Finance" department is the one that makes the most.

In this case, you could get that info also with a complex SELECT query.

Example 3: Retrieving per-product sales in top regions

Now you want to get sales totals by product only in the most important sales regions. Use a WITH clause involving two auxiliary sub-queries as follows:

Copy

        
        
                            
                    1
                    
                
                            
                    2
                    
                
                            
                    3
                    WITH regional_sales AS (
                
                            
                    4
                        SELECT region_id, SUM(price_amount) AS total_sales
                
                            
                    5
                        FROM orders
                
                            
                    6
                        GROUP BY region_id
                
                            
                    7
                    ), top_regions AS (
                
                            
                    8
                        SELECT region_id
                
                            
                    9
                        FROM regional_sales
                
                            
                    10
                        WHERE total_sales > (SELECT SUM(total_sales)/3 FROM regional_sales)
                
                            
                    11
                    )
                
                            
                    12
                    SELECT region,
                
                            
                    13
                        product,
                
                            
                    14
                        SUM(quantity) AS product_units,
                
                            
                    15
                        SUM(price_amount) AS product_sales
                
                            
                    16
                    FROM orders
                
                            
                    17
                    WHERE region_id IN (SELECT region_id FROM top_regions)
                
                            
                    18
                    GROUP BY region, product;

The output of the first CTE is used in the second to produce the top_regions result set. Then the output of top_regions gets employed in the WHERE condition of the primary SELECT query to get the desired data.

Writing the query multiple CTEs in DbVisualizer.

Conclusion

In this article, you saw that PostgreSQL's CTEs (Common Table Expressions) provide a powerful tool for breaking down complex queries, improving code organization and facilitating code reuse. In particular, you got to understand the syntax of the WITH query and the benefits of leveraging CTEs.

The main problem with this feature is that it can lead to non-optimal, slow, resource-intensive queries. Here is where DbVisualizer comes into play! In addition to the most common features of a database client and support for dozens of DBMSs, this tool offers advanced query optimization capabilities that will help you take your PostgreSQL CTEs to the next level. Download DbVisualizer for free now!

FAQ

Let’s answer some interesting questions.

How does a CTE in PostgreSQL differ from a regular subquery?

They both enable you to structure complex queries. However, CTEs generate named temporary result sets that can be referenced multiple times within a query. Subqueries, on the other hand, are embedded within a larger query and cannot be referenced separately since they do not have a name.

Can multiple CTEs be nested within the same PostgreSQL query?

Yes, multiple CTEs can be nested within the same PostgreSQL query. WITH supports the definition of several common table expressions, which can then be referenced once or multiple times in the main query.

Is it possible to update or delete data using CTEs in PostgreSQL?

Yes, WITH queries supports UPDATE or DELETE statements.

Can a CTE be referenced multiple times in the same query?

Yes, a CTE can be referenced multiple times within the same PostgreSQL query. You can use it both in other CTEs and in the main query.

Are PostgreSQL CTEs only limited to SELECT statements?

No, PostgreSQL CTEs can be used with other DML (Data Manipulation Language) statements. In addition to SELECT, they also support INSERT, UPDATE, and DELETE. This makes them a flexible tool to perform complex operations on existing or new data.

PostgreSQL CTE: What It Is and How to Use It

What Is a PostgreSQL CTE?

How to Write a Common Table Expression in PostgreSQL

Pros and Cons of WITH Queries

Pros

Cons

PostgreSQL CTE: Examples

Example 1: Getting an Organization’s Hierarchy

Example 2: Getting the Department with the Highest Average Salary

Example 3: Retrieving per-product sales in top regions

Conclusion

FAQ

How does a CTE in PostgreSQL differ from a regular subquery?

Can multiple CTEs be nested within the same PostgreSQL query?

Is it possible to update or delete data using CTEs in PostgreSQL?

Can a CTE be referenced multiple times in the same query?

Are PostgreSQL CTEs only limited to SELECT statements?

SQL DROP TABLE IF EXISTS: The Database Migration Lifesaver

Default PostgreSQL Port 5432: Configure, Verify, and Change Your Postgres Port

PostgreSQL OID: A Guide on The Object Identifier Type

Postgres CREATE SCHEMA: A Complete Guide

pg_dumpall: How to Dump All Your PostgreSQL Databases

How to Compare Datetimes in SQL: Multiple Approaches

The Complete Guide to CONCAT in SQL Queries: Syntax, Examples, and Best Practices

How Often Should SQL Transaction Logs Be Backed Up?

What Is a Database Catalog?

Check If A Table Exists in SQL: Multiple Approaches

Product

Industries

Legal

Company

Contact