Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:02):
Today we'll look at the AI integration, developer updates,
and performance improvements
that make SQL Server 2025 a major upgrade.
We've got a lot to unpack here,
so we're going to waste no time and get straight into this
with lead SQL engineer, Bob Ward.
Welcome back to the show.
- So great to be back.
- So SQL Server 2025, it's brand new.
It's in public preview right now.
So what's behind the release and what's new?
- There are three major areas of updates
(00:25):
that we focus on in this release.
First, we have deep AI integration.
For example, we now have built-in vector search support
for more accurate and efficient data retrieval
with some under-the-hood optimizations using DiskANN.
Second, if you're a developer,
this is the most significant release of SQL
in the last decade.
You know, some of the highlights
are native support for a JSON data type
(00:46):
and new change event streaming capabilities
for real-time updates.
And the third area is improved analytics,
where we're going to make it easy
to mirror your SQL Servers into Microsoft Fabric
without moving the data.
- And all of these are very significant updates.
So why don't we start with what's new in search and AI?
- Great, let's get going.
As I've mentioned, we've integrated AI
directly into the database engine
(01:06):
to give you smarter, more intelligent searching.
With vector search capabilities built-in,
you can do semantic search over your data
to find matches based on similarity versus keywords.
For example, here I have a database
with a table called ProductDescription,
and I want to search using SQL queries
against the Description table for intelligent search.
(01:27):
Typically, you'd use full-text search for this.
Now I've built this out,
but what about these natural language phrases?
Will they work?
They don't.
And even when I use LIKE clauses, as you can see here,
or CONTAINS, or even FREETEXT,
none of these methods returns what I'm looking for.
Instead, this is where natural language
(01:48):
with vector search in SQL Server 2025 shines.
As a developer, I can get started
even locally on my laptop, no GPU required.
I'm using the popular framework, Ollama,
to host a free open-source embeddings model
from Hugging Face.
This will convert our data into vectors,
including query prompts,
and I declare it using thisCREATE EXTERNAL MODEL statement.
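As a rough sketch, that declaration could look like the following; the endpoint and the nomic-embed-text model name are stand-ins, since the episode doesn't name the exact model:

```sql
-- Sketch: declare a local Ollama-hosted embeddings model.
-- LOCATION and MODEL are assumptions for illustration.
CREATE EXTERNAL MODEL Ollama
WITH (
    LOCATION = 'https://localhost:11434/api/embed',
    API_FORMAT = 'Ollama',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'nomic-embed-text'
);
```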
(02:10):
Then I'm able to go in and build a table
using the new built-in vector type
to store what's called embeddings in a binary format.
My table has keys pointing back to my description data
and then I can use a built-in T-SQL function
to generate embeddings based on Ollama and store them.
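A minimal sketch of that table and the embedding generation, with hypothetical names and assuming the ProductDescription table has ProductDescriptionID and Description columns; AI_GENERATE_EMBEDDINGS is the built-in function:

```sql
-- Hypothetical table/column names; vector(768) matches the
-- output dimensions of the assumed embeddings model.
CREATE TABLE ProductEmbeddings (
    ProductDescriptionID int NOT NULL PRIMARY KEY,  -- key back to the description data
    Embedding vector(768) NOT NULL
);

-- Generate and store an embedding per description row.
INSERT INTO ProductEmbeddings (ProductDescriptionID, Embedding)
SELECT ProductDescriptionID,
       AI_GENERATE_EMBEDDINGS(Description USE MODEL Ollama)
FROM ProductDescription;
```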
For vector search to work, I need to create a vector index,
(02:32):
and it's also performance-optimized
using disk-based approximate nearest neighbor, or DiskANN,
which is a new way to offload
what you'd normally run completely in memory
to an index stored on disk.
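The index creation itself is short; a sketch using the names from above and the documented DiskANN options:

```sql
-- Vector index backed by DiskANN, using cosine distance.
CREATE VECTOR INDEX vec_idx
ON ProductEmbeddings (Embedding)
WITH (METRIC = 'cosine', TYPE = 'diskann');
```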
I have a stored procedure
to convert the query prompts into embeddings
so they can be used to find matching embeddings
in the vector index.
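Inside such a procedure, the search might look roughly like this, with VECTOR_SEARCH scanning the DiskANN index for the nearest matches; the prompt text is just an example:

```sql
-- Sketch: embed the prompt, then find the ten nearest products.
DECLARE @prompt nvarchar(max) = N'a lightweight bike for city commuting';
DECLARE @qv vector(768) = AI_GENERATE_EMBEDDINGS(@prompt USE MODEL Ollama);

SELECT t.ProductDescriptionID, s.distance
FROM VECTOR_SEARCH(
    TABLE = ProductEmbeddings AS t,
    COLUMN = Embedding,
    SIMILAR_TO = @qv,
    METRIC = 'cosine',
    TOP_N = 10
) AS s
ORDER BY s.distance;
```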
(02:53):
So now I have everything running locally
on my laptop running SQL.
Let's see how it works.
I'll try this natural language prompt,
like I showed earlier.
And it worked.
I get a rich set of results,
with matching information based on my search
to find products in the database.
And I can even use Copilot from here
to explore more about SQL data.
(03:14):
I'll prompt it to look for my new table.
And you can see the response here, finding our new table.
And I can ask it to pull up a few embedding values
with product names and descriptions.
And as you saw, the result using our open-source embeddings
returned matches in a few languages.
And the good news
is that if your data contains multiple languages,
it's easy to use different embedding models.
(03:36):
For example,
here I've wired up Azure OpenAI's text-embedding-ada-002 model
optimized for multiple languages
without even changing my code.
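The swap is just another external model declaration; a sketch with a hypothetical Azure OpenAI resource name and an existing credential assumed:

```sql
-- Hypothetical resource/deployment URL; CREDENTIAL must already exist.
CREATE EXTERNAL MODEL AzureOpenAiEmbeddings
WITH (
    LOCATION = 'https://my-openai-resource.openai.azure.com/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-02-01',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'text-embedding-ada-002',
    CREDENTIAL = [https://my-openai-resource.openai.azure.com/]
);
```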
And now I can even search using Mandarin Chinese
and get back matching results.
- And DiskANN and vector-based search
are both massive updates that really go hand in hand
to enable better natural language querying
on modest hardware.
So what about all the developer updates?
(03:57):
- With these updates,
things get so much more efficient for developers.
With the new JSON data type,
you can bring NoSQL into your SQL relational database.
Let me show you how.
I've created a database called Orders
and a table called Orders.
Notice here the new JSON data type,
which can store up to a massive two gigabytes
of JSON per document in this native data type.
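A sketch of such a table, with hypothetical column names:

```sql
CREATE TABLE Orders (
    order_id  int NOT NULL PRIMARY KEY,
    order_doc json NOT NULL  -- native JSON type, up to 2 GB per document
);
```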
(04:18):
Now let's look at a couple of examples.
First, I can easily insert JSON documents
in their native format directly into the table,
and I'll show you some of the JSON functions you can use
to process this new JSON type.
JSON_VALUE will pull a particular value
out of a JSON document
and bring it back in result set format.
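For example, roughly (the sample documents are made up):

```sql
-- Insert documents in their native format...
INSERT INTO Orders (order_id, order_doc)
VALUES (1, '{"customer": "Contoso",  "quantity": 1}'),
       (2, '{"customer": "Fabrikam", "quantity": 5}');

-- ...then pull a scalar out of each document as a result set.
SELECT order_id,
       JSON_VALUE(order_doc, '$.customer') AS customer
FROM Orders;
```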
(04:38):
And I can just dump out all the JSON values,
so each document will appear as a separate row
in its native JSON format.
But instead of just doing that, I have aggregate functions.
This takes all the rows of JSON types in the table
and produces a single array, a single JSON document,
with all the rows in the native JSON type.
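That aggregation is JSON_ARRAYAGG; a sketch against the table above:

```sql
-- One row out: a JSON array containing every order document.
SELECT JSON_ARRAYAGG(order_doc) AS all_orders
FROM Orders;
```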
(05:00):
Key-value pairs are also popular in JSON,
and I can use the new JSON_OBJECTAGG function
to take the order ID key and the JSON document
and produce a set of key-value pairs.
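A sketch of that key-value aggregation (JSON keys are strings, hence the cast):

```sql
-- One JSON object out, keyed by order_id.
SELECT JSON_OBJECTAGG(CAST(order_id AS varchar(12)) : order_doc) AS orders_by_id
FROM Orders;
```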
And I can modify the JSON type directly from here too.
Notice, for order_id 1, the quantity is also 1.
I'll run this update to modify the value.
(05:21):
And when it's finished, the quantity for order_id 1
has been updated to a value of 2 directly in the JSON.
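A sketch of that in-place update, assuming the preview .modify(path, value) signature and a quantity property at the document root:

```sql
-- The json type's .modify() method updates the value in place.
UPDATE Orders
SET order_doc.modify('$.quantity', 2)
WHERE order_id = 1;
```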
Now that's a good example of using the JSON type.
So let me show you how this works with a JSON index.
I've got a different database for contacts,
along with the table for contacts
using a JSON document as one of the properties
(05:41):
of the contacts table.
I can create a JSON index on top of that JSON document,
like this.
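A sketch, with hypothetical contact columns and assuming the preview CREATE JSON INDEX syntax, where FOR ('$') covers every path in the document:

```sql
CREATE TABLE Contacts (
    contact_id  int NOT NULL PRIMARY KEY,
    contact_doc json NOT NULL
);

-- Index every path inside the JSON document.
CREATE JSON INDEX contacts_json_idx
ON Contacts (contact_doc)
FOR ('$');
```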
Now I've got some sample data that are JSON documents.
And in a second, I'm going to push those into our database.
And as I scroll,
you'll see that this has nested tags as properties
in the JSON document.
Now I'll run the query so I can insert these rows
(06:03):
with the names of each tag.
Let's go look at the output.
I'm using JSON_VALUE for the name,
but I'm using JSON_QUERY because the tags are nested.
Now I'll show you an example searching with the JSON index.
I'm using the new JSON_CONTAINS function
to find tags called fitness
that are deeply nested in the JSON document.
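A sketch of that search; I'm assuming JSON_CONTAINS takes the document, a candidate value, and a path into the nested tags, so verify the exact arguments against the docs:

```sql
SELECT contact_id,
       JSON_VALUE(contact_doc, '$.name') AS name,  -- scalar value
       JSON_QUERY(contact_doc, '$.tags') AS tags   -- nested fragment
FROM Contacts
WHERE JSON_CONTAINS(contact_doc, '"fitness"', '$.tags') = 1;
```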
And I can run that and find the right tags.
(06:26):
And even in the execution plan,
you can see here that it shows
we're using the new JSON index
to help go find that information.
- That's a big deal.
And like you said, there's a lot happening natively in JSON,
and now you've got the benefits of SQL for joins,
and security, and a lot more.
- You know, and for developers who use change data capture,
things become a lot easier with change event streaming.
(06:47):
Here, we're reducing I/O overhead
and sending transaction log changes
directly to your application.
To get started with change event streaming
for our orders database,
I'll run the stored procedure
to enable streaming for the database.
You can see the table we're going to use
to track changes is a typical type of orders table.
Here I've created what's called an event stream group.
(07:08):
This is where I've configured event streaming
to tell it the location of our Azure Event Hub
to stream our data, and I've added my credentials.
Then I've configured the table orders
to be part of the event streaming group.
I've run these procedures
to make sure that my configuration is correct.
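Roughly, that setup looks like the following; the procedure and parameter names here are my recollection of the preview, so treat all of them as assumptions and check the docs:

```sql
-- Assumed preview procedure names/parameters; verify before use.
EXEC sys.sp_enable_event_stream;

-- Event stream group pointing at a hypothetical Event Hubs destination.
EXEC sys.sp_create_event_stream_group
    @group_name = N'orders_stream_group',
    @destination_location = N'myeventhubns.servicebus.windows.net/orders-hub',
    @destination_credential = N'eventhub_credential',
    @destination_type = N'AzureEventHubsAmqp';

-- Add the Orders table to the group.
EXEC sys.sp_add_object_to_event_stream_group
    @group_name = N'orders_stream_group',
    @object_name = N'dbo.Orders';
```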
So let's do something interesting.
I'm going to automate a workflow using agents
(07:29):
to listen for changes as they come in
and try to resolve any issues.
First, I've created an Azure function app,
and using my function app,
I have an agent running in the Azure AI service
called ContosoShippingAgent.
It's built to take shipment information, analyze it,
and decide whether somethingcan be done to help.
For example, resolving a shipping delay.
(07:50):
I've started my Azure function.
This function is waiting for events
to be sent to Azure Event Hub in order to process them.
Now, in SQL, I'll insert a new order.
Going back over to my Azure function,
you'll see how the event is processed.
In the code, first, we're dumping out the raw cloud event
that I showed earlier.
(08:11):
Notice the operation is an insert.
It's going to dump out
some of the different fields we've parsed out of the data,
the column names, the metadata, and then the row itself.
Notice that because the shipment date is 75 days
past our sales date, it will call our agent.
The agent then comes back with a response.
It looked at the tracking details
and determined that it can change the shipping provider
(08:33):
to expedite our delayed shipment,
and it contacted the customer directly
with the updated shipping info.
- And everybody likes faster shipping.
So speaking of things that are getting faster,
it's kind of a tradition on Mechanics
that we cover the speedups for SQL Server.
So what are the speedups
and the performance optimizations for '25?
- Well, there's a lot,
but my favorite one improves application concurrency.
We've improved the internals
(08:54):
of how locking works without application code changes.
And I've got an example of this running.
I have a lock escalation problem that I need to resolve.
I'm going to go update about 2,500 rows in this table
just to show what happens, then how we've solved for it.
So running this query against that Dynamic Management View,
or DMV, shows locks that have accumulated,
(09:15):
about 2,500 key locks here
and 111 page locks.
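The DMV query is roughly this, counting the current session's locks by type:

```sql
-- Count accumulated locks for this session from sys.dm_tran_locks.
SELECT resource_type, request_mode, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
GROUP BY resource_type, request_mode;
```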
So what happens if I run enough updates against the table
that would cause a lock escalation?
Here, I'll update 10,000 rows in the system.
But you can see with the locks
that this has been escalated to an object lock.
It's not updating the entire table,
but it's going to cause a problem.
(09:36):
Because I've got a query over here
that can update the maximum value in just one row
and it's going to get blocked, but it shouldn't have to be.
You can see here from the blocking query that's running
that it's blocked on that original session,
and I'm not actually updating a row
that's affected by the first one.
This is the problem with lock escalation.
Now let's look at a new option called optimized locking
(09:57):
in SQL Server 2025.
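Optimized locking is a database-scoped option; a sketch of enabling it (it builds on accelerated database recovery):

```sql
-- Optimized locking requires accelerated database recovery.
ALTER DATABASE CURRENT SET ACCELERATED_DATABASE_RECOVERY = ON;
ALTER DATABASE CURRENT SET OPTIMIZED_LOCKING = ON;
```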
Okay, let's go back to where I updated 10,000 rows
and look at the lock.
Notice how in this particular case
I have a transaction lock.
It's an intent exclusive lock for the table,
but only a transaction lock for that update.
If I use this query to update the max,
you'll see that we are not blocked.
And by looking at the locks,
(10:18):
each item has specific transaction locks,
so we're not blocking each other.
And related to this, we've also solved another problem
where two unrelated updates can get blocked.
We call this lock after qualification.
- Okay, so it's pinpointing the exact lock type,
so you'll get fewer locks in the end.
So why don't we move on, though, from locks to joins?
- Sure.
With Microsoft Fabric, it's amazing.
(10:38):
You can pull in multiple databases,
multiple data types into a unified data platform.
Imagine you have two different SQL Servers
in different clouds and on-prem,
and you just want to join this data together in an easy way
without migrating it.
With Fabric, you can.
I have a SQL Server 2022 instance with a database,
and we've already mirrored the product tables
(11:00):
from that database into Fabric.
I'll show you the mirroring configuration process
for a SQL Server 2025 instance with different,
but related, tables.
These are similar to the steps
for mirroring any SQL Server.
I've created a database connection for SQL Server 2025.
Now I'll pick all the tables in our database and connect.
I'll leave the name as is, AdventureWorks,
(11:22):
and we're ready to mirror our database.
You can see now that the replication process has started
for all the tables.
All the rows have been replicated
for all the columns on all the tables in my database
and they've been mirrored into Fabric.
Now let's query the data using the SQL analytics endpoint.
And you can see that the tables
that we previously had in our database in SQL Server
(11:45):
are now mirrored into OneLake.
Let's run a query, and I'll use Copilot to do that.
Here's the Copilot code with explanations.
Now I'll run it.
And as it completes, there are our top customers by sales.
Now what if we wanted to do a join
across the other SQL Server?
It's possible.
But normally, there are a lot of manual pieces to do this.
(12:07):
Fabric can make that easier using a lakehouse.
So let's create a new lakehouse.
I just need to give it a name, AdventureWorks,
and confirm.
Now notice there are no tables in this lakehouse yet,
so let's add some.
And for that, I'll use a shortcut.
A shortcut uses items in OneLake,
like the SQL Server databases we just mirrored.
(12:27):
So I'll add the AdventureWorks database.
And scrolling down, I'll pick all the tables I want.
Now I'll create it.
And we're not storing the data separately in the lakehouse.
It's just a shortcut,
like an active read link to the source data,
which is our mirrored database,
and therefore something that already exists in OneLake.
And now you can see I've got these objects here.
(12:49):
This icon means that these are shortcuts from another table.
So now, let's get data from another warehouse,
the SQL Server 2022 instance, which was ADW_products.
Again, here, I'll pick the tables that I want and Create.
That's it.
So I can go and look at product
to make sure I've got my product data.
(13:12):
Now, let's try to query this as one database
and use another SQL analytics endpoint
directly against the lakehouse itself.
So basically it thinks all the tables
are just part of the unified schema now.
Let's open up Copilot
and write a prompt to pull my top customers
by products and sales.
And it will be able to work
directly against all of these connected databases
(13:34):
because they are all in the same schema.
And there you go.
I have a list of all the data I need
in one logical database.
- And this is really great.
And I know now that everything's in OneLake,
there's also a lot more that you can do with that data.
- With the lakehouse, the sky's the limit.
You can use Power BI, or any of those services
that are in the unified data platform, Microsoft Fabric.
- Okay, so now we've seen all the updates
(13:55):
with SQL Server 2025.
To everyone watching,
what's the best thing they can do to get started?
- Well, the first thing is to start using it.
SQL Server 2025 is ready for you to download
and install it on the platform of your choice.
You'll find it at aka.ms/GetSQLServer2025.
- So thanks so much for sharing all the updates, Bob,
and thank you for joining us today.
(14:17):
Be sure to subscribe for more, and we'll see you again soon.