intro

In this guide, we will delve into the concept of the DISTINCT clause in PostgreSQL and explore how it allows you to stamp out duplicate records, enabling you to focus on distinct values across various database tables - let’s get started.

Tools used in the tutorial

DBVISUALIZER

TOP RATED DATABASE MANAGEMENT TOOL AND SQL CLIENT

DOWNLOAD

POSTGRESQL

THE POSTGRESQL DATABASE

POSTGRESQL

Preface

Have you ever encountered duplicate values in your database as a database developer? These are common problems database developers run into in their lives as developers.

PostgreSQL solves this issue by enabling developers to rule out duplicate values from query results using the PostgreSQL DISTINCT clause. This is significant in ensuring error-free data analysis and reporting.

This article provides a comprehensive guide to the syntax, usage, and performance considerations of using the PostgreSQL DISTINCT clause. We’re going to discuss the importance of using this feature and by the end of this article, readers will have a solid understanding of how to remove duplicate rows from a result set.

What is the PostgreSQL SELECT DISTINCT Clause?

The SELECT DISTINCT clause removes duplicate records in a result set by keeping one row, thus the first row or group of duplicate result sets. Duplicate results are basically values or rows that appear more than once in a result set.

The DISTINCT keyword is used in synchrony with the SELECT statement in many database management systems including PostgreSQL and the syntax is as follows:

Copy

        
        
                            
                    1
                    SELECT
                
                            
                    2
                       DISTINCT column x, column y
                
                            
                    3
                    FROM
                
                            
                    4
                       table_name;

Now, let us analyze the syntax. The columns x and y in the syntax above represent the particular column from which you would want to remove duplicate results. The DISTINCT keyword will then operate on the particular column(s) in question and then filter out the duplicate to produce a result set that will contain a unique combination of values from those columns, in this case, columns x and y.

So in the syntax above, the duplicate values will be evaluated based on the values in columns x and y.

As we have said earlier in this blog, PostgreSQL also removes duplicate results by keeping the first row of each row of duplicates found within the result set and it follows the syntax shown below in DbVisualizer:

PostgreSQL DISTINCT ON Syntax.

Let us analyze the syntax:

SELECT DISTINCT ON (column1) column_alias, column2 specifies the columns to be selected in the query result. The DISTINCT ON clause retrieves distinct rows based on the values in column1. The column_alias is an optional alias that can be assigned to the selected column, providing a customized name for the result.
FROM table_name defines the table which the data is being queried from.
ORDER BY column1, column2 defines the ordering of the rows in the query result. The ORDER BY clause is used to sort the rows based on the values in column1 and column2.

According to a rule in Postgres, the expression used in the DISTINCT ON clause (in this case, column1) must also be the leftmost expression in the ORDER BY clause. To rephrase it, it should be listed before any other columns in the ORDER BY clause. Postgres will throw an error if the leftmost expression in the DISTINCT ON clause does not match the leftmost expression in the ORDER BY clause.

This rule ensures that PostgreSQL correctly identifies the distinct rows based on the leftmost expression and orders the rows accordingly. Let us now look at some specific use cases of the DISTINCT clause in PostgreSQL.

The PostgreSQL DISTINCT Clause - Use Cases

Now, we’re going to look into a few use cases that will help us grasp a holistic understanding of this concept but before that, let us make sure the prerequisites are ready. We will need to:

Create a database
Create a table with columns inside of it
Insert some data into the table

Let us now create a database called ‘closet’ with the command below:

Copy

        
        
                            
                    1
                    CREATE DATABASE closet;

Run the command above in DbVisualizer, navigate to the left pane, and locate the newly created database as shown below:

The Created Database.

Next, let us create a table with the command below:

Creating a Table with Columns.

The next thing to do is to insert some data into the created table. We can do this by running the query below in the SQL commander of DbVisualizer:

Inserting Values into the Created Table.

Run the query below in the SQL Commander of DbVisualizer to know whether everything is intact:

Running Query in DbVisualizer.

Great! We can see from the result set above that everything is intact. Let us look at the use cases in the following examples:

1) Using PostgreSQL SELECT DISTINCT Clause on a Single Column

The SELECT DISTINCT clause can be used on a single column to select unique values from that column and then produce a result set that is arranged alphabetically due to the presence of the ORDER BY clause. Let us look at a typical scenario as shown below:

Single Column Use Case in DbVisualizer.

2) Using the SELECT DISTINCT Clause on Multiple Columns

As we have said earlier in this blog, if we specify two columns, Postgres combines the values within them and produces a result set with a unique combination from the two columns. The same thing happens here: when we specify the two columns cloth1 and cloth2 with the SELECT DISTINCT clause, Postgres will combine the values in both columns to analyze the uniqueness of the rows.

For example, let us build a query that will return a unique combination of cloth1 and cloth2 from the closet_clothes table. Keep in mind that the closet_clothes table has two rows with velvet values in both cloth1 and cloth2. Now, when we apply the DISTINCT to both columns, one row will be removed from the result set because it is clearly a duplicate.

Multiple Column Use Case in DbVisualizer.

Awesome! We can see that Postgres has combined the values in both columns to analyze the uniqueness of the rows and has produced a result set with no duplicates.

Conclusion

The PostgreSQL DISTINCT keyword is a powerful tool for handling duplicate values in query results. We explored the usage of the SELECT DISTINCT clause on a single column, which allows us to obtain unique values from that column. This is particularly useful when we want to retrieve a list of unique values for filtering, or analysis purposes.

To better appreciate its capabilities, you need a tool that helps you manage databases and visually explore query results. This is where a full-featured database client like DbVisualizer comes in. In addition to being able to connect to several DBMSs, it offers advanced query optimization functionality, and full support for all database features, including DISTINCT. Download DbVisualizer for free now!

FAQs

What is the purpose of the DISTINCT keyword in PostgreSQL?

It is used to retrieve unique values from a column or a combination of columns to return a result set that is free of duplicate values.

How do I use the DISTINCT keyword in a SELECT statement?

Include it immediately after the SELECT keyword in your built query. For example:

Copy

        
        
                            
                    1
                    SELECT DISTINCT column_name FROM table_name;

This will retrieve the unique values from the specified column.

Can I use the DISTINCT keyword with multiple columns?

Yes, you can use the DISTINCT keyword with multiple columns. In such cases, the combination of values from all the specified columns will be considered when evaluating uniqueness.

Does using DISTINCT affect the order of the query results?

No, using the DISTINCT keyword does not affect the order of the query results. It focuses on eliminating duplicate records in a result set. ORDER BY is what is responsible for imposing a specific order.

For example, let us assume that we have a table called developers with the following data:

The ‘Developers’ Table.

Let’s say, from the table shown above, we want to retrieve a well-defined list of the names of departments from the developers table. To show that the DISTINCT keyword does not affect the order of the query results, let us use the DISTINCT keyword in the SQL statement below:

Copy

        
        
                            
                    1
                    SELECT DISTINCT department from developers;

The result of this query is shown below in DbVisualizer:

Applying the DISTINCT keyword on the ‘Developers’ Table.

From the query result shown, we can see that the DISTINCT keyword warrants that only different department names are returned in the result, getting rid of duplicates completely.

Please take notice that there is no assurance as to the developers' order in the result set and this can be attributed to the nature of Postgres.

The ORDER BY clause must be used if you want to force a certain order on the query results. For instance, you may change the query to the query below if you want the departments to be listed in ascending order alphabetically:

Copy

        
        
                            
                    1
                    SELECT DISTINCT department FROM developers ORDER BY department ASC;

The updated query result would then be:

Query result in ascending order.

Here, the ORDER BY clause has determined the order of the query results.

PostgreSQL DISTINCT: Removing Duplicate Rows from a Result Set

Preface

What is the PostgreSQL SELECT DISTINCT Clause?

The PostgreSQL DISTINCT Clause - Use Cases

1) Using PostgreSQL SELECT DISTINCT Clause on a Single Column

2) Using the SELECT DISTINCT Clause on Multiple Columns

Conclusion

FAQs

What is the purpose of the DISTINCT keyword in PostgreSQL?

How do I use the DISTINCT keyword in a SELECT statement?

Can I use the DISTINCT keyword with multiple columns?

Does using DISTINCT affect the order of the query results?

SQL DISTINCT: A Comprehensive Guide

SQL DROP TABLE IF EXISTS: The Database Migration Lifesaver

Default PostgreSQL Port 5432: Configure, Verify, and Change Your Postgres Port

Using the MySQL CREATE TABLE Statement: A Complete Guide

PostgreSQL OID: A Guide on The Object Identifier Type

Top Internal Tool Builders and Low-Code Platforms for SQL Apps in 2025

How to Protect Your Database From the Threats Outlined in OWASP Top 10?

Postgres CREATE SCHEMA: A Complete Guide

SQL Server CHARINDEX Function (Find Substring Position)

pg_dumpall: How to Dump All Your PostgreSQL Databases

Product

Industries

Legal

Company

Contact