MySQL
SQL

SUBSTRING_INDEX in SQL Explained: A Guide

intro

Want to return a substring of a string? Look no further! The SUBSTRING_INDEX function is just what you’re looking for — read this blog and find out why.

Tools used in the tutorial
Tool Description Link
Dbvisualizer DBVISUALIZER
TOP RATED DATABASE MANAGEMENT TOOL AND SQL CLIENT
mariadb MARIADB
THE MARIADB DATABASE
MySQL logo MySQL
THE MYSQL DATABASE

Many DBAs dealing with all kinds of database management systems find themselves in situations where they have to obtain a substring of a string—say, everything preceding or after a certain character, etc. Here’s where the SUBSTRING_INDEX in SQL comes in—it’s a very powerful, but often overlooked function providing us with the ability to return a substring after a certain index, hence the name.

In this article, you will learn what that SQL function is, how it works, and when to use it!

How Does SUBSTRING_INDEX Work?

The [SUBSTRING_INDEX in SQL](https://www.dbvis.com/thetable/a-complete-guide-to-substring_index-in-sql/) function is self-explanatory. It returns a substring of a string based on a given index. In other words, the SQL SUBSTRING_INDEX function returns a part of a string after a certain character.

Take this example into account:

Copy
        
1 SELECT SUBSTRING_INDEX("https://dbvis.com", ".", 2);

The result will be:

SUBSTRING_INDEX in SQL in DbVisualizer — Example
SUBSTRING_INDEX in SQL in DbVisualizer u2014 Example

So, the syntax of SUBSTRING_INDEX is:

Copy
        
1 SUBSTRING_INDEX(**string**, **delimiter**, **count**);

The three parameters are:

  1. string: The input string or the name of a column.
  2. delimiter: The “index” we’re searching for.
  3. count: A number, which means “before what amount of delimiter?” In other words, “how many of such delimiters should your database management system ignore before returning the result set?”

Here’s another example:

Copy
        
1 SELECT SUBSTRING_INDEX("www.dbvis.com", ".", -2);

This time, the query will produce this result:

Returning a string with a negative number of occurrences of a delimiter
Returning a string with a negative number of occurrences of a delimiter

This example returns a string with a negative number of occurrences of a delimiter. In other words, in the string “www.dbvis.com”, “-1” would return “com” because “com” is after the first dot from the right-hand side. And indeed, “-2” returns the “dbvis.com” string.

It’s also worth noting that if any of the given arguments is NULL, SUBSTRING_INDEX in SQL will also return NULL.

Use Cases of SUBSTRING_INDEX

Now that you understand how SUBSTRING_INDEX works, it’s time to get your hands on some real-world use cases.

Unfortunately, the examples shown above won’t apply to 90% of the use cases in the real world because most real-world use cases will deal with one or more of the following:

  • Columns instead of string values.
  • Bigger data sets.
  • Data analytics.
  • The usage of SUBSTRING_INDEX in SQL in conjunction with other operations like SELECT … INTO OUTFILE, etc.

Not to mention, operations would have to be performed at the same time when other queries are running without disrupting performance.

Example #1

Here’s how SUBSTRING_INDEX would look like in the real world:

Copy
        
1 SELECT 2 ip, 3 SUBSTRING_INDEX(ip, ".", 1) AS ip_p1, 4 SUBSTRING_INDEX(SUBSTRING_INDEX(ip, ".", 2), ".", -1) AS ip_p2, 5 SUBSTRING_INDEX(SUBSTRING_INDEX(ip, ".", -2), ".", 1) AS ip_p3, 6 SUBSTRING_INDEX(ip, ".", -1) AS ip_p4 7 FROM logs;

In this case, we’re obtaining IP addresses in parts. In other words, such an SQL query would make the IP address “127.0.0.1” to be returned in 4 parts named “ip_p1” (IP_part1), “ip_p2”, “ip_p3”, and “ip_p4” with the results being displayed as follows:

IP PartNumber
ip_p1127
ip_p20
ip_p30
ip_p41

Such a SUBSTRING_INDEX in SQL approach may be immensely valuable for companies with geolocation use cases, criminal data breach investigations, etc.

Example #2

Copy
        
1 SELECT SUBSTRING_INDEX(column, "character", -1) FROM table;
SUBSTRING_INDEX in SQL with email addresses - DbVisualizer Example
SUBSTRING_INDEX in SQL with email addresses - DbVisualizer Example

The second use case would be very useful for companies having a lot of users and performing analytics on user data — that may include registration domains (example above), locations, or other data. This query would also help companies analyze data breaches as it can be combined with a COUNT(*) query to count how many users have been registered with a specific domain, etc.

The only downside here is that COUNT(*) in InnoDB would likely be very slow to execute as InnoDB does not contain table metadata inside of itself as its brother MyISAM does.

Things to Keep in Mind

As you can see from the real-world examples above, we’re no longer using the SUBSTRING_INDEX in SQL function with strings — we’re using column values instead. Doing this provides us with an upside in we can analyze data in the entire column at once.

When doing that, it’s wise to keep in mind that the bigger your table is, the slower SUBSTRING_INDEX will take to execute. For some, a reminder of indexes would help — in this case, an index would help speed up SUBSTRING_INDEX in SQL dramatically due to the fact that the function is used together with a SELECT query.

The same works the other way around — if we want to export our result set into a file (e.g. after analyzing a data set we may use the data for other purposes too), we may combine the SUBSTRING_INDEX SQL function with INTO OUTFILE so our SQL function now looks like this:

Copy
        
1 SELECT SUBSTRING_INDEX(column, "@", "-1") **INTO OUTFILE "/tmp/data.txt";**

Here’s a good performance-related example:

Copy
        
1 SELECT SUBSTRING_INDEX(email, "@", -1) AS email, COUNT(*) AS amount FROM users GROUP BY email;

What do you think this SUBSTRING_INDEX query will do? Right — it will select the amount of times an email domain has been provided in the column and provide us with the result.

SUBSTRING_INDEX in SQL’s Brother

SUBSTRING_INDEX in SQL also has a brother, whose name is SUBSTRING. The functionality of SUBSTRING in SQL is quite self-explanatory: it provides us with the ability to return a substring of a string, but without the ability to specify an index. In other words, a SUBSTRING function in SQL would look like this:

Copy
        
1 SELECT SUBSTRING("This is an example", 11);

Such a substring query would return “example” (everything after the eleventh letter.)

One can also use SUBSTRING like this:

Copy
        
1 SELECT SUBSTRING("Value" FROM position);

So, if we execute a query like:

Copy
        
1 SELECT SUBSTRING("This is an example" FROM 11);

We would also get “example.”

RIGHT()

Some database management systems including MySQL Server and SQL Server, also provide a function called RIGHT() that works similarly. For example, a SQL query like so:

Copy
        
1 SELECT RIGHT("This is an example", 7);

Would return “example” too:

RIGHT() in MySQL Server
RIGHT() in MySQL Server

In this case, though, letters are counted backward (i.e., e,l,p,m,a,x,e instead of e,x,a,m,p,l,e—that’s why we use “7” instead of “11.”

Database Management Systems and Their Support for Functions

Not all database management systems support SUBSTRING_INDEX and other aforementioned functions. Find the table denoting the support for functions like SUBSTRING_INDEX in SQL below:

Database Management SystemSupport
OracleSUBSTR
SQL ServerSUBSTR, SUBSTRING[_INDEX], RIGHT
MySQL ServerSUBSTR, SUBSTRING[_INDEX], RIGHT
PostgreSQLSUBSTRING with pattern matching
SQLiteSUBSTR

The table above will help you understand what function to use in which circumstances — as you can see, SUBSTRING_INDEX in SQL is only supported in SQL Server and MySQL, not in other database management systems. Oracle or SQLite doesn’t have the luxury of such a function, as doesn’t PostgreSQL.

With that being said, PostgreSQL does provide support for a lot of data types and, in this case, support for a SUBSTRING function with pattern matching meaning that it can be employed similarly as SUBSTRING_INDEX if used properly.

In PostgreSQL, the SUBSTRING function takes two parameters - a string (column) and a pattern that returns NULL upon no matches. A match is returned once, or if, it's found. Everything works like so:

  • If a POSIX subexpression is provided and the provided pattern contains parentheses, PostgreSQL returns the first parenthesed subexpression.
  • In other cases, the first subset of the text that matches a pattern is returned.

The usage of SUBSTRING in PostgreSQL would look like so:

QueryResult
SELECT SUBSTRING("foobar" FROM "o.b");o[anythingbecauseofthedot]b
SELECT fname, SUBSTRING(fname,1,1) AS first_letter FROM users;Returns the first letter of the first name of the user. e.g. “J” for “Jack”, “D” for “David”, “E” for “Emily”, etc.
SELECT SUBSTRING("The emergency-call number to dial in case of a necessity is 1887", "([0-9]{1,4})") AS emergency_call;Any provided number with digits 0-9 having from 1 to 4 digits. In this case, 1887.

How Will You Use SUBSTRING_INDEX?

As you can see, SUBSTRING_INDEX is a really powerful beast — everything not only depends on how your queries are structured, but your database management system also plays a role.

We hope that this blog helped you understand how to use SUBSTRING_INDEX in SQL and that you saw its power from first-hand examples.

Don’t forget that the power of your database management system lies not only within the functions available for you to use but also in the way you use them — in other words, your SQL client of choice.

We recommend you use DbVisualizer because it’s the #1 choice for many companies across the globe including Tesla, NASA, Volkswagen, and many others. Click here to grab your free trial and try the power of DbVisualizer and unleash the support for many database management systems, the power of the SQL client and advanced query optimization abilities, ERD-like schema generation, and more for yourself.

We’ll see you in the next blog! Bye for now.

FAQ

What is SUBSTRING_INDEX in SQL?

The SUBSTRING_INDEX function helps us to return a part of data as necessary. In other words, it allows us to return a substring of a string after a specified index, hence the name.

When should I use SUBSTRING_INDEX in an SQL query?

SUBSTRING_INDEX should be used whenever you need to return anything after a certain string.

What DBMS supports SUBSTRING_INDEX?

Database management systems supporting SUBSTRING_INDEX include SQL Server and MySQL Server. Other database management systems have similar capabilities.

Dbvis download link img
About the author
LukasVileikisPhoto
Lukas Vileikis
Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.
The Table Icon
Sign up to receive The Table's roundup
More from the table
Title Author Tags Length Published
title

A Guide To the SQL DECLARE Statement

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SERVER 7 min 2024-12-12
title

SQL Upsert: Inserting a Record If It Does Not Exist

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 8 min 2024-12-10
title

A Complete Guide to the MySQL COUNT Function

author Antonello Zanini tags MySQL 9 min 2024-12-09
title

MySQL Nullable Columns: Everything You Need to Know

author Antonello Zanini tags MySQL 6 min 2024-12-03
title

A Complete Guide to the MySQL GROUP_CONCAT Function

author Antonello Zanini tags Data aggregation MySQL 8 min 2024-12-02
title

How and Why to Add an Index in SQL

author Lukas Vileikis tags MySQL POSTGRESQL Search SQL SQL SERVER 8 min 2024-11-28
title

MySQL 9.0 and the New EXPLAIN Features

author Lukas Vileikis tags MySQL SQL 9 min 2024-11-27
title

Massive SQL DELETE Operations and Database Purging Explained

author Lukas Vileikis tags DELETE SQL 4 min 2024-11-26
title

MySQL CAST: A Guide to the Data Conversion Function

author Antonello Zanini tags MySQL 3 min 2024-11-25
title

Postgres Create Array of Numbers: Complete Guide

author Lukas Vileikis tags POSTGRESQL SQL 7 min 2024-11-21

The content provided on dbvis.com/thetable, including but not limited to code and examples, is intended for educational and informational purposes only. We do not make any warranties or representations of any kind. Read more here.

Cookie policy

We use cookies to ensure that we give you the best experience on our website. However you can change your cookie settings at any time in your browser settings. Please find our cookie policy here ↗