intro
Let’s learn everything you need to know about the new VECTOR data type and vector capabilities introduced in SQL Server 2025.
Vector data is becoming increasingly important with the rise of embeddings in AI and machine learning. To support this, SQL Server 2025 (preview released in mid-2025) introduces a new data type: the VECTOR data type. Enter the SQL Server VECTOR world!
In this guide, you will explore the VECTOR data type in SQL Server 2025, including its search and indexing mechanisms.
What Is the VECTOR Data Type in SQL Server?
In SQL Server 2025 (17.x) Preview and higher, the VECTOR data type is designed to store vector data. In detail, this new data type supports operations such as similarity search directly within the database for the development of machine learning and AI applications. Learn more in our guide on vector databases.
Note that SQL Server VECTOR data type stores vectors in an optimized binary format, where each element is represented as a 4-byte (single-precision) floating-point number. Still, to make the experience more familiar for developers, vectors are created and displayed as JSON arrays.
For example, a four-dimensional vector can be represented as:
1
'[8, 2, 0.7, -31]'
How to Use the MS SQL VECTOR Data Type
The syntax for using the VECTOR type in a table column definition is:
1
column_name VECTOR({<dimensions>}) [NOT NULL | NULL]
Here, the dimensions argument is required and defines the number of elements in the vector you want to store in that column. It must be an integer between 1 and 1998.
For example, you can define a two-dimensional vector column in a table with:
1
CREATE TABLE Vectors
2
(
3
id INT PRIMARY KEY,
4
vector_data VECTOR(2) NOT NULL -- where to store your two-dimensional vector data
5
);
You can insert data into the Vectors table like this:
1
INSERT INTO Vectors(id, vector_data) VALUES
2
(1, '[8.0, 4]'),
3
(2, '[-12, 0.523]');
Note that the vectors must have exactly two elements to match the dimension.
Equivalently, you can also use the JSON_ARRAY function to create vector values:
1
INSERT INTO Vectors(id, vector_data) VALUES
2
(1, JSON_ARRAY(8.0, 4])),
3
(2, JSON_ARRAY(-12, 0.523));
Notes
1
SELECT
2
CAST('[8, -2, 3.7]' AS VECTOR(3)) AS vector_1,
3
CAST(JSON_ARRAY(5.0, 2, 9, 10, -15.6) AS VECTOR(5)) AS vector_2;
Limitations
Vector Indexing in SQL Server
In the realm of SQL Server vectors, the term “index” differs from traditional relational database indexes. In particular, A “vector index” is designed to return approximate results, not exact matches. Its purpose is to improve the performance of nearest neighbor searches in high-dimensional datasets.
In SQL Server, vector indexes are built using the DiskANN algorithm. DiskANN creates a navigable graph structure over the vector space, enabling fast and memory-efficient traversal to find vectors closest to a query vector. This makes it ideal for large-scale AI and machine learning workloads with limited computational resources.
Remember that you can create SQL Server vector indexes using the custom CREATE VECTOR INDEX syntax. That is different from the traditional SQL CREATE INDEX syntax.
Vector Search in SQL Server
Vector search is the process of finding vectors in a dataset that are similar to a given query vector. SQL Server supports both exact and approximate vector search. So, let’s explore both approaches!
Exact Search (k-NN and ENN)
Exact search, also known as the k-nearest neighbor (k-NN) search, involves calculating the distance between a query vector and every vector in a dataset, then sorting these distances to find the closest neighbors based on a specified distance metric.
More specifically, exact nearest neighbor (ENN) search performs an exhaustive comparison across all vectors to guarantee retrieval of the closest matches. This method is highly accurate, but it is resource-intensive and best suited for smaller datasets or cases where accuracy is fundamental.
To measure the similarity or closeness between vectors, distance functions are used. In SQL Server, this is possible thanks to the VECTOR_DISTANCE function. As of this writing, the supported distance metrics are:
For example, you can retrieve the top 5 most similar vectors to a given input vector by creating vector embeddings from text using an AI model and return rows having the smallest distance (the biggest similarity):
1
/*!
2
Creating a list of numbers (a vector) named "vector_input", then
3
using an AI function to turn "Frank Sinatra music style" into a numerical representation using an AI model called Ada2Embeddings.
4
*/
5
DECLARE @vector_input VECTOR(1536) = AI_GENERATE_EMBEDDINGs(N'Frank Sinatra music style' USE MODEL Ada2Embeddings);
6
7
/*! Performing a search for the first 5 vectors similar to our input vector */
8
9
SELECT TOP(5)
10
id,
11
VECTOR_DISTANCE('cosine', @vector_input, [content_vector]) AS distance,
12
title
13
FROM WikipediaPageEmbeddings
14
ORDER BY distance;
This query declares a 1536-dimensional vector @vector_input generated by the built-in function AI_GENERATE_EMBEDDINGS. This function creates vector embeddings from the text 'Frank Sinatra music style' using a pre-trained AI model called Ada2Embeddings stored in the database. The query then calculates the cosine distance between this query vector and each vector stored in the content_vector column of the wikipedia_articles_embeddings table. It returns the top 5 rows with the smallest distances—i.e., the most similar Wikipedia pages.
Approximate Search (ANN)
Finding similar vectors using exact search (k-NN) is resource-intensive and does not scale well due to the fact that a database has to compute the distance between the query vector and every vector in the database which can be resource-intensive. Approximate nearest neighbors (ANN) represent a faster alternative to k-NN, trading a small amount of accuracy (recall) for speed. In AI applications, this trade-off is often acceptable and delivers near-identical results with much better performance.
In SQL Server, ANN search is supported through the VECTOR_SEARCH function. This function finds vectors similar to a given query vector using the approximate nearest neighbor algorithm. As of this writing, the supported distance metrics are 'cosine', 'dot', and 'euclidean'. These metrics determine how similarity is measured between the query vector and vectors in the target column.
For instance, this is how you can perform ANN with VECTOR_SEARCH in SQL Server:
1
/*!
2
Creating a list of numbers (a vector) named "vector_input", then
3
using an AI function to turn "Frank Sinatra music style" into a numerical representation using an AI model called Ada2Embeddings.
4
*/
5
DECLARE @vector_input VECTOR(1536) = AI_GENERATE_EMBEDDING(N'Frank Sinatra music style' USE MODEL Ada2Embeddings);
6
7
/*!
8
Performing a search to find songs similar to our input vector (above.)
9
The results are labeled as "VS"
10
*/
11
12
SELECT A.id, A.title, VS.distance
13
FROM
14
VECTOR_SEARCH(
15
TABLE = AudioTracks AS A,
16
COLUMN = embedding_vector,
17
SIMILAR_TO = @vector_input,
18
METRIC = 'cosine',
19
TOP_N = 10
20
) AS VS
21
WHERE A.is_published = 1
22
ORDER BY VS.distance;
The VECTOR_SEARCH function returns the top N (10, in this case) vectors most similar to the input vector, based on the specified metric (e.g., 'cosine'). After the similar vectors are retrieved, additional filters (e.g., WHERE is_published = 1) are applied. If none of the top N rows match the predicate, the final result comes out empty.
Note that ANN indexing is used only if a compatible ANN index exists on the specified column using the same distance metric. If no such index is found, SQL Server falls back to a full k-NN scan and displays a warning.
Conclusion
In this blog post, you explored the world of vector support in SQL Server, with a special focus on the new search and indexing capabilities introduced in the Microsoft database engine. These advancements open the door to a wide range of AI and machine learning applications, directly within the database environment.
No matter what type of AI-driven application you are building, you will need a reliable database client, just like DbVisualizer.

DbVisualizer is a top-rated SQL Server database client that also supports 50+ other databases. Its premium features include query optimization, drag-and-drop query building, ERD-style schema generation, streamlined import/export, and more. Download it for free today!
FAQ
When were vector features introduced in SQL Server?
Vector features, including a native VECTOR data type, vector search capabilities, and vector indexes, have been introduced in SQL Server 2025 (17.x) Preview. This build was released in mid-2025 and builds on previous versions to enhance SQL Server as a platform that supports a variety of development languages, data types, deployment environments (on-premises or cloud), and operating systems.
Does SQL Server support vector data storage?
Yes, as of SQL Server 2025, you can store vector data directly in your database instance using the new SQL Server VECTOR data type.
Can SQL Server store embeddings?
Yes, starting with SQL Server 2025, you can store embeddings directly using the new VECTOR data type. Keep in mind that Azure OpenAI provides models that can generate embeddings from text data.
What are the main vector functions in SQL Server?
The main vector functions in SQL Server are:
| Function | Description |
|---|---|
VECTOR_DISTANCE | Calculates the distance between two vectors using a specified distance metric. |
VECTOR_SEARCH | Returns the closest vectors to a given query vector based on a chosen distance metric, using an approximate nearest neighbor algorithm. |
VECTOR_NORM | Returns the norm (length or magnitude) of a vector, calculated according to the specified norm type. |
VECTOR_NORMALIZE | Returns a normalized version of the input vector, scaled to have a length of 1 based on the specified norm type. |
VECTORPROPERTY | Retrieves specific properties of a given vector. |
What is the main requirement for using vector features in SQL Server applications?
The main requirement is using TDS version 7.4 or higher with updated SQL Server drivers. These drivers support efficient binary transmission of vector data, enabling native read/write operations, reduced payload size, and improved performance and precision.
Do PostgreSQL and MySQL support the VECTOR data type?
PostgreSQL supports it through the pgvector extension, while MySQL introduced native support for VECTOR data types starting from version 9.

