BEGINNER
MySQL
SQL

MySQL DISTINCT Clause Explained: Why, How & When

intro

In this blog, we’ll walk you through one of the most important clauses in the entire SQL lexicon—you will learn about the MySQL DISTINCT clause.

Tools used in the tutorial
Tool Description Link
Dbvisualizer DBVISUALIZER
TOP RATED DATABASE MANAGEMENT TOOL AND SQL CLIENT
MySQL logo MySQL
THE MYSQL DATABASE

There’s no doubt — if you’re a developer using MySQL or any other database management system, you’ve encountered situations where you had to only select specific rows and select rows without duplicates, too. How did you do it? That’s right — you’ve made use of the MySQL DISTINCT clause! Let’s dig into it, shall we?

SQL Query Types and Introduction to the MySQL DISTINCT Clause

SQL queries have a couple of distinct types, and they all perform different tasks: some insert data, some update, and some delete it. The primary type of SQL query you’re probably concerned with is a SELECTyou already know the basics of its optimization; what you may not be aware of are the different types of a SELECT that can be used within your infrastructure.

These days, if you find yourself modifying the SQL code written by pretty much any DBA or a database-conscious developer, you will see a SELECT DISTINCT * FROM demo_table instead of plain SELECT * FROM queries. Why? The answer is simple—the MySQL DISTINCT clause only returns distinct, or unique, values.

That DISTINCT isn’t there without reason. Since many tables contain duplicate values (talking about data breach search engines—there’s a good probability that if someone finds themselves in one data breach, they’re going to find himself in another one, too), the DISTINCT clause is usually employed by people analyzing some data set to understand how many unique records were contained in a table or a column.

Understanding the MySQL DISTINCT Clause

In essence, a MySQL DISTINCT clause eliminates all duplicates and provides only unique results.

Some may link it to the capabilities of a UNIQUE INDEX and they wouldn’t be that far off. A unique index drops duplicates, while a DISTINCT clause ignores and doesn’t select them. Makes sense?

Here’s how a simple DISTINCT MySQL query look like when run in an SQL client (we use DbVisualizer when connected to a MySQL 8 database for this example):

Simple MySQL DISTINCT Query Example in DbVisualizer
Simple MySQL DISTINCT Query Example in DbVisualizer

Look at the results. We’ve selected a distinct result set but seemingly got two of the same product— both PS5’s. Confusing, yes? Not really. Look at our SQL LIKE query again:

Copy
        
1 SELECT DISTINCT * FROM products WHERE product_name LIKE 'PS5%';

I’ve emboldened the part where you should pay more attention to. Got the point?

The reason MySQL returned two of the same results was because we’ve selected `DISTINCT instead ofDISTINCT column`.* You can’t expect MySQL to return a distinct result set if you select everything and there are two of the same values in one or more columns.

Now, we’ll refine our SQL query and try again. Our SQL query becomes the following:

Copy
        
1 SELECT DISTINCT product_name FROM products WHERE product_name LIKE 'PS5%';
Modified MySQL DISTINCT Example Using DbVisualizer
Modified MySQL DISTINCT Example Using DbVisualizer

The SQL query looks similar, but the results are now vastly different—only one row is returned. That’s not without a reason; this time we’ve selected a distinct column instead of selecting all distinct values. That’s one of the primary mistakes developers make when employing the SQL DISTINCTclause—some DBAs are so used to what MySQL DISTINCT does that they forget its core principles: first, you need to specify the column(s) you need to acquire distinct values from, and only then MySQL will do its magic.

Alternatives to the MySQL DISTINCT Clause

Aside from the DISTINCT MySQL clause, there are a couple of other choices you may make to acquire only distinct, or different, values from your database instance. These choices include:

  • A UNIQUE key or a UNIQUE index: These kinds of indexes are built in such a way that ensures that all the rows in a table are unique (there are no duplicates). That means if such an index is in place on your table, your table won’t have any duplicate rows inside of it, and you won’t even need the DISTINCT clause!
  • Use a DELETE clause: You saw this one coming, yeah? Obviously, you can delete duplicate rows if you make use of the DELETE and GROUP BY clauses all at once like so (note that the query given below is left unformatted and DbVisualizer has formatted the query for us—that’s the formatting capability we’ll get into later):
Copy
        
1 DELETE FROM products WHERE product_name IN ( SELECT product_name FROM products GROUP BY stock_amount HAVING COUNT(*) > 1 );
DELETE Query with a GROUP BY Clause
DELETE Query with a GROUP BY Clause
  • Use a SELECT clause combined with INTO OUTFILE or other clauses: If you only want to select duplicate rows, you can do the same thing as with a DELETE with a SELECT (you may also want to add an INTO OUTFILE clause if you want to export the whole thing into a file):
Copy
        
1 SELECT product_name FROM products GROUP BY stock_amount HAVING COUNT(*) > 1;
SELECT Query for Duplicate Values
SELECT Query for Duplicate Values
  • Use a DELETE clause with an INNER JOIN to remove duplicates from a table with a self-join: In this case, we delete duplicate product_names from the table products (id is a unique column):
Copy
        
1 DELETE t1 FROM products AS t1 INNER JOIN products AS t2 WHERE t1.id < t2.id AND t1.product_name = t2.product_name;
Removing Duplicates with an INNER JOIN
Removing Duplicates with an INNER JOIN
  • Linux users can use sort with a unique flag: If you have many rows (we’re talking >100,000,000 or more), running a MySQL DISTINCT query may not be the best choice. Instead, why not export your data to a text file using SELECT * FROM your_table INTO OUTFILE file.txt and then use sort -u file.txt on Linux? This command will be significantly faster than anything else you can think of and would get rid of hundreds of millions of duplicates in minutes. Use it!

Other Options

Another option to solve your database problems is to use database visualization tools for developers and data analysts, such as the one provided by DbVisualizer. We won’t make promises that if you use DbVisualizer your data won’t have any duplicates whatsoever, but it’s literally a gold mine for both developers and data analysts in that its SQL editor can easily format your queries, execute explain plans, automatically complete your queries or allow you to drag-and-drop the tables you want to query.

Here’s how to format queries:

Formatting Queries with DbVisualizer
Formatting Queries with DbVisualizer

DbVisualizer can do much more than just format your query buffers though—since it has a long history of being the preferred database client for both developers and data analysts, its feature set is truly impressive. We can categorize the feature set of DbVisualizer into a couple of distinct categories:

  • SQL client: It’s not news that DbVisualizer has a powerful SQL client. Aside from helping you format queries, it also helps you to build queries by dragging and dropping your tables inside of the tool, and can do many other things in this realm!
  • Data visualization: Of course, what good database client would be without data visualization capabilities? DbVisualizer is capable of building ERD schemas helping you visualize the data you already have, comes with inline editing capabilities, and you can also export data sets for further analysis. See all DbVisualizer features.
  • An assistant for developers and data analysts: DbVisualizer has everything you need to build and manage top-notch database technologies. It can even help you execute scripts from inside the tool! How cool is that?

What are you waiting for? Grab a free trial now!

Frequently Asked Questions

What is the MySQL DISTINCT clause? What does it do?

The DISTINCT clause in MySQL helps you get rid of duplicate values when selecting or updating data.

Are there alternatives to the DISTINCT MySQL clause?

Yes—you can use a SELECT clause combined with an INTO OUTFILE clause to put results in a text file, unique indexes to get rid of existing duplicate rows, or if you have a lot of data, make use of the uniq clause in Linux. See the Alternatives heading in this blog for more information.

Why should I use DbVisualizer?

Consider using DbVisualizer because the tool is free to try and is used by leading tech companies across the globe. Our database tool can help you unleash the power in your database instances by helping you visualize your data, help you create monitors to query your databases in set intervals, and much more.

Dbvis download link img
About the author
LukasVileikisPhoto
Lukas Vileikis
Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.
The Table Icon
Sign up to receive The Table's roundup
More from the table
Title Author Tags Length Published
title

A Guide To the SQL DECLARE Statement

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SERVER 7 min 2024-12-12
title

SQL Upsert: Inserting a Record If It Does Not Exist

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 8 min 2024-12-10
title

A Complete Guide to the MySQL COUNT Function

author Antonello Zanini tags MySQL 9 min 2024-12-09
title

MySQL Nullable Columns: Everything You Need to Know

author Antonello Zanini tags MySQL 6 min 2024-12-03
title

A Complete Guide to the MySQL GROUP_CONCAT Function

author Antonello Zanini tags Data aggregation MySQL 8 min 2024-12-02
title

How and Why to Add an Index in SQL

author Lukas Vileikis tags MySQL POSTGRESQL Search SQL SQL SERVER 8 min 2024-11-28
title

MySQL 9.0 and the New EXPLAIN Features

author Lukas Vileikis tags MySQL SQL 9 min 2024-11-27
title

Massive SQL DELETE Operations and Database Purging Explained

author Lukas Vileikis tags DELETE SQL 4 min 2024-11-26
title

MySQL CAST: A Guide to the Data Conversion Function

author Antonello Zanini tags MySQL 3 min 2024-11-25
title

Postgres Create Array of Numbers: Complete Guide

author Lukas Vileikis tags POSTGRESQL SQL 7 min 2024-11-21

The content provided on dbvis.com/thetable, including but not limited to code and examples, is intended for educational and informational purposes only. We do not make any warranties or representations of any kind. Read more here.

Cookie policy

We use cookies to ensure that we give you the best experience on our website. However you can change your cookie settings at any time in your browser settings. Please find our cookie policy here ↗