POSTGRESQL
Vectors

A Beginner's Guide to Vector Search Using pgvector

intro

In this blog, we’ll explore the PostgreSQL vector search capability offered by pgvector. Join the journey!

Tools used in the tutorial
Tool Description Link

These days, vectors are everywhere, and we’re not only talking about AI-enhanced vector capabilities but about vectors in general. Vector databases are a thing too — but what to do when you need to search through vector-based data without using them?

MariaDB has MariaDB Vector — PostgreSQL has pgvector!

What is pgvector?

pgvector is the PostgreSQL answer to vector search: in essence, it is a Postgres extension allowing us to query, store, and index vector data. Since PostgreSQL doesn’t come with native vector capabilities, the team behind the database has decided to introduce vector similarity search as an extension and introduced it starting from PostgreSQL 11.

How and When to Use pgvector?

These are the prerequisites to use pgvector in PostgreSQL:

Once pgvector is installed enable it by creating the extension like so:

Copy
        
1 CREATE EXTENSION pgvector;

If the query is successful (it acts on the pgvector you’ve installed previously), you should be good to go.

To use pgvector, you now need:

  1. A table bearing the vector data type.
  2. Text to search through.
  3. Embeddings (vector similarity searches for vector embeddings)

An image by OpenAI nicely answers many questions related to the points above:

Using the Embedding Model to Turn Text Into a Vector Embedding. Image by OpenAI
Using the Embedding Model to Turn Text Into a Vector Embedding. Image by OpenAI

We need text to act on (text is stored as part of the values in the database), and since we need a vector embedding, to obtain it we need to run the text through an embedding model. The result is as clear as water — we obtain a vector embedding that then gets stored in our database.

We obtain a vector embedding because an embedding is a necessity to make vector similarity search tick — pgvector enables us to query vector embeddings, but before we query them, we must store them inside our database: text embeddings can be generated with tools like Word2Vec or even OpenAI, as can other types of embeddings (we can have document, graph, or image embeddings too.)

pgvector has a couple of prerequisites:

  1. A WITH query that fetches an embeddings vector. The query should select the necessary data from the tables in question.
  2. A SELECT query that makes use of the WITH query definition.

An example is given below — first, we fetch a vector by defining the pgvec_example function and then, we select data from the vector using the function we’ve defined using a similarity search:

Fetching and making use of an embeddings vector in PostgreSQL
Fetching and making use of an embeddings vector in PostgreSQL

Keep in mind that similarity is defined by using the <-> string operator.

Such queries wouldn’t be ran standalone — they would usually be implemented into Python (or other kind of) scripts so that they could be invoked like so:

Copy
        
1 python3 script.py

Such a functionality allows developers to combine the power of pgvector with the capabilities of their programming language of choice. Vector similarity search will find documents that are similar to your search query as well as exact matches, hence the similarity search option, or <->.

Summary

pgvector is PostgreSQL’s answer to vector similarity search. In detail, that is a PostgreSQL extension allowing us to quickly and easily implement vector-like search inside PostgreSQL since vectors aren’t available by default.

FAQ

What is pgvector?

pgvector is an extension by Postgres. It allows us to implement vector similarity searches in our database.

How to use pgvector?

To use pgvector, enable the extension, generate vector embeddings using OpenAI or other tools, store them in a database, and act on them using SQL queries that can also be part of Python or other kinds of scripts.

Where can I learn more about database extensions?

To learn more about database extensions, follow our blog, as well as keep yourself up to date by reading a book or two Hacking MySQL is a good place to start for those looking into a relational database.

Dbvis download link img

About the author
LukasVileikisPhoto
Lukas Vileikis
Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.
The Table Icon
Sign up to receive The Table's roundup
More from the table
Title Author Tags Length Published
title

CREATE DATABASE in PostgreSQL: A Complete Guide

author Antonello Zanini tags POSTGRESQL 6 min 2025-04-30
title

How Often Should SQL Transaction Logs Be Backed Up?

author Antonello Zanini tags MySQL POSTGRESQL SQL SQL SERVER 7 min 2025-04-28
title

TRUNCATE vs DELETE: SQL Comparison of the Two Statements

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 9 min 2025-04-23
title

How to Extract the Year from a Date in SQL

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 5 min 2025-04-21
title

How To List Tables In Postgres: Complete Guide

author Leslie S. Gyamfi tags POSTGRESQL 6 min 2025-04-17
title

A Guide to the SQL Standard Deviation Functions

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 6 min 2025-04-15
title

SQL EXISTS: Syntax and Use Cases with Examples

author Leslie S. Gyamfi tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 6 min 2025-04-14
title

SQL TRUNCATE TABLE: A Complete Guide

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 6 min 2025-03-26
title

A Complete Guide to the FULL OUTER JOIN SQL Operation

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 7 min 2025-03-24
title

How to Use JOIN in a DELETE Query in SQL

author Antonello Zanini tags MySQL ORACLE POSTGRESQL SQL SQL SERVER 8 min 2025-03-20

The content provided on dbvis.com/thetable, including but not limited to code and examples, is intended for educational and informational purposes only. We do not make any warranties or representations of any kind. Read more here.