POSTGRESQL
REGEXP
SQL

REGEXP_MATCH SQL Function: The Complete PostgreSQL Guide

intro

Learn how to use the REGEXP_MATCH SQL function in PostgreSQL to perform complex string matching with regular expression patterns.

Tools used in the tutorial
Tool Description Link
Dbvisualizer DBVISUALIZER
TOP RATED DATABASE MANAGEMENT TOOL AND SQL CLIENT
PostgreSQL logo POSTGRESQL
THE POSTGRESQL DATABASE

The ability to search for specific strings and extract meaningful data from them is paramount when dealing with data. Regular expressions serve as indispensable tools in this process, providing a flexible and powerful way to define patterns for string matching and manipulation. Here is why PostgreSQL offers several options for that, including the REGEXP_MATCH SQL function.

In this guide, you will delve into REGEXP_MATCH, seeing its syntax, applications, and nuances. Become the master of string matching in PostgreSQL!

What Is the REGEXP_MATCH SQL Function in PostgreSQL?

The REGEXP_MATCH SQL function in PostgreSQL accepts a string and a POSIX regular expression pattern. If you are not familiar with this concept, POSIX regular expressions (a.k.a. “regexps”) follow a standardized pattern-matching syntax to represent sets of strings. In particular, they enable sophisticated string matching based on patterns.

The PostgreSQL REGEXP_MATCH function returns an array containing the substrings corresponding to the first match of a regular expression pattern to the given input string. That is not easy to get at first, so let's better understand how this function works in the next section!

How REGEXP_MATCH Works in Postgres

The REGEXP_MATCH SQL function in PostgreSQL has the following syntax:

Copy
        
1 regexp_match(input_string, posix_pattern [, flags ])

Where the arguments are:

  • input_string: The text expression to apply the regular expression pattern to, starting from the beginning of the string.
  • posix_pattern: The POSIX regular expression pattern represented as a string.
  • flags: An optional text string containing one or more single-letter flags that change the function's behavior (e.g. 'i' enables case-insensitive matching). Find out all flags available in the documentation.

As mentioned before, REGEXP_MATCH searches posix_pattern in input_string and returns a text array containing the substrings corresponding to the first match. Take a look at the example below:

Copy
        
1 SELECT regexp_match('transportation', 'p.*rt');

This will produce:

Copy
        
1 {'port'}

The reason is that the pattern 'p.*rt', for words like “port,” “part,” and others, matches the substring “port” in the input string “transportation.”

Specifically, the PostgreSQL REGEXP_MATCH function returns: Option #1: NULL, if there is no match:

Copy
        
1 SELECT regexp_match('The "Air-Max" sneakers are simply the best shoes on the market', '[0-9.]+'); 2 -- NULL

Option #2: A single-element text array containing the substring matching the whole pattern, if a match is found and the contains no parenthesized sub-expressions:

Copy
        
1 SELECT regexp_match('The price of the "Air-Max" sneakers is $499.99', '[0-9.]+'); 2 -- {'499.99'}

Option #3: A text array with each element corresponding to a sub-expression match, if a match is found and the pattern contains more than one parenthesized sub-expressions:

Copy
        
1 SELECT regexp_match('The price of the "Air-Max" sneakers is $499.99', '"(.*)\\-(.*)"'); 2 -- {'Air', 'Max'}

To see what results a regular expression produces and why, you can test it on websites like regex101 before using it in your query:

https://paper-attachments.dropboxusercontent.com/s_612A0556B646DAF6346D80FA11E6D3620BB4896E9CEDF3FDC0AEA0C69A8D094A_1706000488293_image.png
Regular expression test on website

Note: If you want to get the set of text arrays containing all matching substrings and not only the array with the first match, use the PostgreSQL REGEXP_MATCHES function instead.

REGEXP_MATCH SQL Function: Use Cases

The REGEXP_MATCH SQL function offered by PostgreSQL is a powerful tool for performing advanced string matching and extraction in your queries. No wonder, the scenarios where this function can come to the rescue are countless.

Here, you will see the three most popular REGEXP_MATCH use cases.

The following queries will be executed in DbVisualizer, the PostgreSQL database client with the highest user satisfaction on the market!

Prefix Phone Number Extraction

Suppose you have some text that may contain a phone number. The goal is to retrieve it from the text. This is a common scenario that generally requires application-level logic. Thanks to the REGEXP_MATCH PostgreSQL function, you can achieve that with this POSIX regexp for North American phone numbers:

Copy
        
1 ([0-9]{3})[-.●]?[0-9]{3}[-.●]?[0-9]{4}

Use the above regular expression to get the phone number in an array as below:

Copy
        
1 SELECT regexp_match('Contact us at 987-654-3210', '[0-9]{3}[-.●]?[0-9]{3}[-.●]?[0-9]{4}') AS phone_number;

The result will be:

Copy
        
1 {'987-654-3210'}
The result array contains the phone number as expected
The result array contains the phone number as expected

Great! That is exactly the phone number you were looking for!

Now, what if the text contained more than one phone number? You might consider setting the 'g' flag to look for all matches and not just the first one, as follows:

Copy
        
1 SELECT regexp_match('Contact us at 987-654-3210 or 555-872-2310', '[0-9]{3}[-.●]?[0-9]{3}[-.●]?[0-9]{4}', 'g') AS phone_numbers;

In this case, you will get this error:

Copy
        
1 [Code: 0, SQL State: 22023] ERROR: regexp_match() does not support the "global" option. Hint: Use the regexp_matches function instead.
DbVisualizer highlights the error message in red
DbVisualizer highlights the error message in red

The reason is that the REGEXP_MATCH SQL function does not support the 'g' (“global”) option. Instead, you need to use the PostgreSQL REGEXP_MATCH function:

Copy
        
1 SELECT regexp_matches('Contact us at 987-654-3210 or 555-872-2310', '[0-9]{3}[-.●]?[0-9]{3}[-.●]?[0-9]{4}', 'g') AS phone_numbers;

This time, you will get both phone numbers as desired:

Note the multiple results
Note the multiple results

Email Retrieval

E-mail addresses are valuable information that you want to distinguish from other text information. To extract them, you can use a regexp with REGEXP_MATCH as below:

Copy
        
1 SELECT regexp_match('Contact us at info@company.com', '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}') AS email;

This will return:

Copy
        
1 {'info@company.com'}

Keep in mind that you generally want the data to be returned as a string and not as a text array as it is now. So, you usually have to get the first element of the PostgreSQL array with the [1] syntax:

Copy
        
1 SELECT regexp_match('Contact us at info@company.com', '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}')[1] AS email;

This time, you will get:

Copy
        
1 'info@company.com'

![Note that wrapping round parenthesis to apply "1]" correctly

Note that wrapping round parenthesis to apply "[1]" correctly

Note that the above query is equivalent to:

Copy
        
1 SELECT regexp_substr('Contact us at info@company.com', '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}') AS email;

The problem with the REGEXP_SUBSTR function is that it is only available from PostgreSQL 15, while REGEXP_MATCH is available from PostgreSQL 10. Here is why most users still use REGEXP_MATCH.

URL Extraction

To extract the URL from a string in PostgreSQL, use REGEXP_MATCH with the following regexp:

Copy
        
1 http[s]?:\\/\\/[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}

Apply it as in this example:

Copy
        
1 SELECT regexp_match('Visit our website at <HTTPS://EXAMPLE.COM> for more information', 'http[s]?:\\/\\/[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}' , 'i') AS url;
Note the resulting URL in upper case
Note the resulting URL in upper case

Notice the 'i' flag for case-insensitive regular expression matching.

Awesome! You are now a PostgreSQL REGEXP_MATCH expert!

PostgreSQL REGEXP_MATCH: Best Practices

This is a list of best practices for using the REGEXP_MATCH like a pro:

  • Before you spend a lot of time creating an effective regexp, look for it online. You will find working regular expressions for common scenarios.
  • Do not try to use the 'g' flag but use REGEXP_MATCHES instead.
  • For simple scenarios where you want the result of the regular expression search or NULL and you are on PostgreSQL 15 or up, consider REGEXP_SUBSTR instead.
  • Test your regular expressions on sites like regex101 before using them in your query.
  • Specify the 'i' flag for case-insensitive matching.

Conclusion

In this guide, you understood what the REGEXP_MATCH function is in PostgreSQL and how it works. You now know that it is a powerful tool to apply a regular expression for string matching. Thanks to the use cases shown here, you have also learned how to use the function in real-world scenarios.

REGEXP_MATCH is a complex function that requires a powerful database client. Which is the best based on user satisfaction? DbVisualizer! This full-featured database client supports several DBMS technologies, offers advanced query optimization capabilities, and can generate ERD-like schemas. Try DbVisualizer for free today!

FAQ

What is a POSIX regular expression?

A POSIX regular expression, also known as “regexp,” is a standardized pattern-matching syntax used in computing to search for a set of strings. It empowers advanced text searching and manipulation by supporting rules for matching characters or character sequences within strings. That means POSIX regexps are a powerful tool for tasks such as string matching, extraction, and validation.

What databases support the REGEXP_MATCH SQL function?

Among popular databases, the REGEXP_MATCH SQL function is supported only by PostgreSQL and SingleStoreDB. Since it is not part of the standard SQL specification, each database technology may implement it differently.

What is the difference between the PostgreSQL REGEXP_MATCHES and REGEXP_MATCH functions?

The main difference between the PostgreSQL REGEXP_MATCHES and REGEXP_MATCH functions lies in their output. REGEXP_MATCHES returns a set of text arrays, providing all matches of a regular expression pattern in a string. In contrast, REGEXP_MATCH returns a single-element text array for the first match, or NULL if there is no match.

What is the difference between the PostgreSQL REGEXP_MATCH and REGEXP_LIKE functions?

The differences between the PostgreSQL REGEXP_MATCH and REGEXP_LIKE functions are in their purpose and syntax. REGEXP_MATCH returns a text array of matching substrings within the first match of a regular expression pattern, allowing extraction of specific substrings. On the other hand, REGEXP_LIKE is a boolean function that checks if a string matches a specified regular expression pattern, returning true or false.

Should you use the REGEXP_SUBSTR function or the REGEXP_MATCH SQL function in PostgreSQL?

The choice between REGEXP_SUBSTR and REGEXP_MATCH in PostgreSQL depends on the desired outcome. If you need the whole matching substring or NULL for no match, REGEXP_SUBSTR is preferable. However, it is only available from PostgreSQL version 15 onwards. In older versions, you can get the same result by accessing the first element of REGEXP_MATCH's result array.

Dbvis download link img
About the author
Antonello Zanini

Antonello is a software engineer, and often refers to himself as a technology bishop. His mission is to spread knowledge through writing.

The Table Icon
Sign up to receive The Table's roundup
More from the table
Title Author Tags Length Published
title

A Guide To the SQL DECLARE Statement

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SERVER 7 min 2024-12-12
title

SQL Upsert: Inserting a Record If It Does Not Exist

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 8 min 2024-12-10
title

How to Find Elements In an Array: PostgreSQL Guide

author Antonello Zanini tags POSTGRESQL 3 min 2024-12-05
title

How and Why to Add an Index in SQL

author Lukas Vileikis tags MySQL POSTGRESQL Search SQL SQL SERVER 8 min 2024-11-28
title

MySQL 9.0 and the New EXPLAIN Features

author Lukas Vileikis tags MySQL SQL 9 min 2024-11-27
title

Massive SQL DELETE Operations and Database Purging Explained

author Lukas Vileikis tags DELETE SQL 4 min 2024-11-26
title

Postgres Create Array of Numbers: Complete Guide

author Lukas Vileikis tags POSTGRESQL SQL 7 min 2024-11-21
title

List of SQL Reserved Words in All Major Databases

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 16 min 2024-11-20
title

A Beginner’s Guide to SQL & Relational Database Management Systems

author Leslie S. Gyamfi tags BEGINNER SQL 5 min 2024-11-19
title

SQL Variable: What It Is and How To Use It

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 8 min 2024-11-18

The content provided on dbvis.com/thetable, including but not limited to code and examples, is intended for educational and informational purposes only. We do not make any warranties or representations of any kind. Read more here.

Cookie policy

We use cookies to ensure that we give you the best experience on our website. However you can change your cookie settings at any time in your browser settings. Please find our cookie policy here ↗