Introduction
In this blog, we’ll walk you through one of the most frequently used clauses in SQL: the MySQL DISTINCT clause.
If you’re a developer using MySQL or any other database management system, you’ve almost certainly had to select only specific rows, and select them without duplicates, too. How did you do it? That’s right: you used the MySQL DISTINCT clause! Let’s dig into it, shall we?
SQL Query Types and Introduction to the MySQL DISTINCT Clause
SQL queries come in several distinct types, and each performs a different task: some insert data, some update it, and some delete it. The type you’re probably most concerned with is SELECT. You may already know the basics of optimizing it; what you may not be aware of are the different forms of SELECT that can be used within your infrastructure.
These days, if you find yourself modifying SQL code written by a DBA or a database-conscious developer, you will often see queries like SELECT DISTINCT * FROM demo_table instead of a plain SELECT * FROM demo_table. Why? The answer is simple: the MySQL DISTINCT clause returns only distinct, or unique, rows.
That DISTINCT isn’t there without reason. Many tables contain duplicate values. Take data breach search engines as an example: there’s a good chance that someone who appears in one data breach appears in another one, too. The DISTINCT clause is usually employed by people analyzing a data set to understand how many unique records a table or a column contains.
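To make the "how many unique records" idea concrete, here is a minimal sketch. The table layout and sample rows are assumptions made up for illustration (only the names products, id, product_name, and stock_amount come from the queries in this blog), and the table is created as TEMPORARY so it lives only in your current session.

```sql
-- Hypothetical scratch data; TEMPORARY keeps the table private to this session.
DROP TEMPORARY TABLE IF EXISTS products;
CREATE TEMPORARY TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    stock_amount INT NOT NULL
);

INSERT INTO products (product_name, stock_amount) VALUES
    ('PS5 Console', 10),
    ('PS5 Console', 4),
    ('Xbox Series X', 7);

-- Compare the total number of rows with the number of unique product names.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT product_name) AS unique_names
FROM products;
-- total_rows = 3, unique_names = 2
```

Note that a temporary table hides any permanent table with the same name for the rest of the session, so run this in a scratch connection.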
Understanding the MySQL DISTINCT Clause
In essence, the MySQL DISTINCT clause removes duplicate rows from a result set so that only unique results are returned.
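Some may link it to the capabilities of a UNIQUE INDEX, and they wouldn’t be that far off. The difference is where the duplicates are handled: a unique index stops duplicate values from being stored in the table, while a DISTINCT clause leaves the stored data alone and simply doesn’t return duplicates in the result set. Makes sense?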
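If the comparison with a unique index feels abstract, the following sketch may help. It assumes a made-up scratch table and an index name (idx_product_name) chosen purely for illustration; the point is that the index acts when data is written, whereas DISTINCT acts when data is read.

```sql
-- Hypothetical scratch table with a unique index on product_name.
DROP TEMPORARY TABLE IF EXISTS products;
CREATE TEMPORARY TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    UNIQUE INDEX idx_product_name (product_name)
);

INSERT INTO products (product_name) VALUES ('PS5 Console');

-- A plain INSERT of the same name would fail with a duplicate-entry error.
-- INSERT IGNORE turns that error into a warning and skips the row instead.
INSERT IGNORE INTO products (product_name) VALUES ('PS5 Console');

-- The duplicate was never stored, so there is nothing for DISTINCT to filter.
SELECT COUNT(*) AS stored_rows FROM products;
-- stored_rows = 1
```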
Here’s what a simple DISTINCT MySQL query looks like when run in an SQL client (we use DbVisualizer connected to a MySQL 8 database for this example):
Look at the results. We’ve selected a distinct result set but seemingly got the same product twice: both rows are PS5s. Confusing, yes? Not really. Look at our SQL LIKE query again:
SELECT DISTINCT * FROM products WHERE product_name LIKE 'PS5%';
Pay close attention to the DISTINCT * part. Got the point?
The reason MySQL returned what looks like the same result twice is that we selected DISTINCT * instead of DISTINCT column. DISTINCT compares every column in the select list, so two rows count as duplicates only if all of their selected values match. When you select everything, rows that share a product name but differ in any other column (an ID, for example) are still distinct to MySQL.
Now, we’ll refine our SQL query and try again. Our SQL query becomes the following:
SELECT DISTINCT product_name FROM products WHERE product_name LIKE 'PS5%';
The SQL query looks similar, but the results are now vastly different: only one row is returned. That’s not without a reason; this time we’ve applied DISTINCT to a single column instead of to every column in the table. Selecting everything is one of the primary mistakes developers make when employing the SQL DISTINCT clause. Some DBAs are so used to what MySQL DISTINCT does that they forget its core principle: first, specify the column(s) you need distinct values from, and only then will MySQL do its magic.
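The difference between the two queries can be reproduced with a small sketch. The rows below are invented for illustration, and the table is TEMPORARY so it only exists in your current session; the third query is an extra case showing that DISTINCT over several columns compares the combination of their values.

```sql
-- Hypothetical scratch data; TEMPORARY keeps the table private to this session.
DROP TEMPORARY TABLE IF EXISTS products;
CREATE TEMPORARY TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    stock_amount INT NOT NULL
);

INSERT INTO products (product_name, stock_amount) VALUES
    ('PS5 Console', 10),
    ('PS5 Console', 10),
    ('PS5 Console', 4);

-- Every column is compared, including id, so all 3 rows count as distinct.
SELECT DISTINCT * FROM products WHERE product_name LIKE 'PS5%';

-- Only product_name is compared: 1 row.
SELECT DISTINCT product_name FROM products WHERE product_name LIKE 'PS5%';

-- The (product_name, stock_amount) pair is compared: 2 rows.
SELECT DISTINCT product_name, stock_amount
FROM products
WHERE product_name LIKE 'PS5%';
```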
Alternatives to the MySQL DISTINCT Clause
Aside from the DISTINCT MySQL clause, there are a couple of other ways to end up with only distinct values in your database instance: you can find the duplicates with GROUP BY, or delete them from the table altogether. These choices include:
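-- Delete every row whose product_name appears more than once (all copies are removed). The derived table (dup) is there because MySQL won't let a DELETE read its target table in a plain subquery.
DELETE FROM products WHERE product_name IN ( SELECT product_name FROM ( SELECT product_name FROM products GROUP BY product_name HAVING COUNT(*) > 1 ) AS dup );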
-- List the product names that appear more than once.
SELECT product_name FROM products GROUP BY product_name HAVING COUNT(*) > 1;
-- Delete duplicates by product_name, keeping the row with the highest id.
DELETE t1 FROM products AS t1 INNER JOIN products AS t2 ON t1.product_name = t2.product_name WHERE t1.id < t2.id;
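Since DELETE statements are hard to undo, it may be worth previewing what they would touch first. The sketch below is one possible approach, not the only one: it uses made-up sample rows in a TEMPORARY table, shows GROUP BY returning the same unique names as DISTINCT, and uses the ROW_NUMBER() window function (available from MySQL 8.0) to list the rows a "keep the highest id" cleanup would remove.

```sql
-- Hypothetical scratch data; TEMPORARY keeps the table private to this session.
DROP TEMPORARY TABLE IF EXISTS products;
CREATE TEMPORARY TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    stock_amount INT NOT NULL
);

INSERT INTO products (product_name, stock_amount) VALUES
    ('PS5 Console', 10),
    ('PS5 Console', 4),
    ('Xbox Series X', 7);

-- Grouping by the selected column yields the same unique names
-- as SELECT DISTINCT product_name.
SELECT product_name FROM products GROUP BY product_name;

-- Preview the duplicates: every row except the highest id per product_name.
SELECT id, product_name
FROM (
    SELECT id,
           product_name,
           ROW_NUMBER() OVER (PARTITION BY product_name ORDER BY id DESC) AS rn
    FROM products
) AS ranked
WHERE rn > 1;
-- With the sample rows above this lists only id 1 (the older PS5 row).
```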
Other Options
Another option to solve your database problems is to use database visualization tools for developers and data analysts, such as the one provided by DbVisualizer. We won’t promise that using DbVisualizer will keep your data free of duplicates, but it’s a gold mine for both developers and data analysts: its SQL editor can format your queries, execute explain plans, auto-complete your queries, and let you drag and drop the tables you want to query.
Here’s how to format queries:
DbVisualizer can do much more than just format your query buffers though—since it has a long history of being the preferred database client for both developers and data analysts, its feature set is truly impressive. We can categorize the feature set of DbVisualizer into a couple of distinct categories:
What are you waiting for? Grab a free trial now!
Frequently Asked Questions
What is the MySQL DISTINCT clause? What does it do?
The DISTINCT clause in MySQL removes duplicate rows from the results of a SELECT query. It doesn’t change the data stored in your tables.
Are there alternatives to the DISTINCT MySQL clause?
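Yes. You can use GROUP BY to find duplicates, DELETE statements to remove existing duplicate rows, or unique indexes to stop new duplicates from being stored. You can also export results to a text file with SELECT ... INTO OUTFILE and, if you have a lot of data, deduplicate the file with the uniq command in Linux (sort the file first, since uniq only collapses adjacent duplicate lines). See the Alternatives heading in this blog for more information.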
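As a rough sketch of the export route, the statements below write product names to a text file on the database server. The output path is an assumption and will almost certainly need adjusting: MySQL only writes to the directory named by the secure_file_priv variable, the account needs the FILE privilege, and the target file must not already exist. The sample table is made up and TEMPORARY.

```sql
-- Hypothetical scratch data; TEMPORARY keeps the table private to this session.
DROP TEMPORARY TABLE IF EXISTS products;
CREATE TEMPORARY TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL
);

INSERT INTO products (product_name) VALUES
    ('PS5 Console'),
    ('PS5 Console'),
    ('Xbox Series X');

-- Check where the server is allowed to write files
-- (an empty value means anywhere, NULL means exports are disabled).
SHOW VARIABLES LIKE 'secure_file_priv';

-- Assumed path: replace it with a file inside your secure_file_priv directory.
SELECT product_name
FROM products
INTO OUTFILE '/var/lib/mysql-files/product_names.txt';
```

The resulting file still contains duplicates, which is where sorting it and running uniq comes in. Adding DISTINCT to the SELECT would deduplicate before the export instead.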
Why should I use DbVisualizer?
Consider using DbVisualizer because the tool is free to try and is used by leading tech companies across the globe. Our database tool can help you unleash the power of your database instances by visualizing your data, creating monitors that query your databases at set intervals, and much more.