I’m trying to extract information with SQL from an XML column that looks like this:
<?xml version="1.0" encoding="utf-16"?>
<Advertentie>
<Filter>
<Woninglabels Enabled="True" Priority="1">
<Woninglabel Id="1" Type="Huishoudgrootte" Enabled="True" SeqNumber="1" SortDirection="DESC" Min="0" Max="99" Text="" />
<Woninglabel Id="2" Type="Leeftijd" Enabled="True" SeqNumber="2" SortDirection="DESC" Min="0" Max="999" Text="" />
</Woninglabels>
</Filter>
</Advertentie>
The information I need is the Min and Max values for every Woninglabel Id (I pasted just two, but there are over 10 Woninglabels).
I tried using the .value() method, but soon ran into the problem that all the information sits inside a single tag, between just one “<” and one “>”.
select TOP 100 ic,
CAST(ic AS XML).value('(<Advertentie><Filter><Woninglabels Enabled="True" Priority="1"><Woninglabel Id="1" Type="Huishoudgrootte" Enabled="True" SeqNumber="1" SortDirection="DESC" Min=")[1]','int(2)') AS Name
from adv
And got this error:
[image: error message]
This is very well structured XML. Unstructured would be something like name="Price" value="1,234" type="good_luck_parsing_this". There are several ways to query XML: you can use .value() to extract specific values based on an XPath … path, .query() to extract elements as XML, or .nodes() to extract data as a rowset that can be part of a larger query.
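As a minimal sketch of the three methods side by side, assuming a trimmed-down copy of the question's XML is loaded into a variable (the name @xml and the column aliases are only for illustration):

```sql
-- Sample document from the question, without the XML declaration
declare @xml xml = N'<Advertentie><Filter><Woninglabels Enabled="True" Priority="1">
<Woninglabel Id="1" Type="Huishoudgrootte" Enabled="True" SeqNumber="1" SortDirection="DESC" Min="0" Max="99" Text="" />
<Woninglabel Id="2" Type="Leeftijd" Enabled="True" SeqNumber="2" SortDirection="DESC" Min="0" Max="999" Text="" />
</Woninglabels></Filter></Advertentie>';

-- .value(): a single scalar; the path must yield at most one item, hence (...)[1]
select @xml.value('(/Advertentie/Filter/Woninglabels/Woninglabel[@Id="1"]/@Max)[1]', 'int') as MaxForId1;  -- 99

-- .query(): the matching element(s), returned as an XML fragment
select @xml.query('/Advertentie/Filter/Woninglabels/Woninglabel[@Id="2"]') as LabelXml;

-- .nodes(): one row per matching element, to be read with .value()
select a.value('@Id','int') as Id
from @xml.nodes('/Advertentie/Filter/Woninglabels/Woninglabel') x(a);
```

Note that the XPath passed to these methods names elements and attributes; it is not a copy of the raw markup, which is why the attempt in the question fails.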
If the XML were stored in a variable, you could use this to extract the label data:
select
    a.value('./@Id','int') as Id,
    a.value('./@Type','nvarchar(20)') as Type,
    a.value('./@Enabled','bit') as Enabled,
    a.value('./@SeqNumber','int') as SeqNumber,
    a.value('./@Min','int') as Min,
    a.value('./@Max','int') as Max
from @xml.nodes('//Woninglabel') x(a)
---
Id  Type             Enabled  SeqNumber  Min  Max
1   Huishoudgrootte  1        1          0    99
2   Leeftijd         1        2          0    999
Or this to extract just one label’s data:
select
    a.value('./@Id','int') as Id,
    a.value('./@Type','nvarchar(20)') as Type,
    a.value('./@Enabled','bit') as Enabled,
    a.value('./@SeqNumber','int') as SeqNumber,
    a.value('./@Min','int') as Min,
    a.value('./@Max','int') as Max
from @xml.nodes('//Woninglabel') x(a)
where a.value('./@Id','int')=1
---
Id  Type             Enabled  SeqNumber  Min  Max
1   Huishoudgrootte  1        1          0    99
With a table column whose type isn’t XML, you can use a CTE or subquery to first parse the text into XML. You could also put this “shredding” code into a view, to reduce the complexity of the queries that use it:
-- @txt is assumed to be an nvarchar(max) variable holding the XML text from the question
declare @t table (OuterID bigint identity primary key, x nvarchar(max));
insert into @t values (@txt);
with parsed as (
    select *, cast(x as xml) xx1 from @t
)
select OuterID,
    a.value('./@Id','int') as Id,
    a.value('./@Type','nvarchar(20)') as Type,
    a.value('./@Enabled','bit') as Enabled,
    a.value('./@SeqNumber','int') as SeqNumber,
    a.value('./@Min','int') as Min,
    a.value('./@Max','int') as Max
from parsed
cross apply xx1.nodes('//Woninglabel') x(a)
---
OuterID  Id  Type             Enabled  SeqNumber  Min  Max
1        1   Huishoudgrootte  1        1          0    99
1        2   Leeftijd         1        2          0    999
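Applied to the table from the question, the view idea might look like the sketch below. It assumes adv.ic holds the XML as nvarchar text (the sample declares encoding="utf-16", which matches nvarchar data); the view name AdvWoninglabels and the MinValue/MaxValue aliases are made up for illustration.

```sql
-- Hypothetical view name; the table adv and column ic come from the question
create view AdvWoninglabels as
select
    adv.ic,
    a.value('@Id','int')  as Id,
    a.value('@Min','int') as MinValue,
    a.value('@Max','int') as MaxValue
from adv
cross apply (select cast(adv.ic as xml) as doc) p   -- parse the text into XML per row
cross apply p.doc.nodes('/Advertentie/Filter/Woninglabels/Woninglabel') x(a);
```

Once such a view exists, it can be queried like an ordinary table, for example by selecting Id, MinValue and MaxValue with a where clause on Id.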
Those columns can be used for filtering, just like any other column:
with parsed as (
    select *, cast(x as xml) xx1 from @t
),
labels as (
    select OuterID,
        a.value('./@Id','int') as Id,
        a.value('./@Type','nvarchar(20)') as Type,
        a.value('./@Enabled','bit') as Enabled,
        a.value('./@SeqNumber','int') as SeqNumber,
        a.value('./@Min','int') as Min,
        a.value('./@Max','int') as Max
    from parsed
    cross apply xx1.nodes('//Woninglabel') x(a)
)
select * from labels
where SeqNumber=2
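The same kind of filter can also be written as a predicate inside the XPath expression, so that .nodes() only returns the matching elements. A rough, self-contained sketch (the @xml variable and its shortened content are assumptions for the example):

```sql
-- Shortened sample: only the attributes needed for this example
declare @xml xml = N'<Woninglabels>
<Woninglabel Id="1" Type="Huishoudgrootte" SeqNumber="1" Min="0" Max="99" />
<Woninglabel Id="2" Type="Leeftijd" SeqNumber="2" Min="0" Max="999" />
</Woninglabels>';

-- The [@SeqNumber="2"] predicate keeps only the label with that attribute value
select
    a.value('@Id','int')            as Id,
    a.value('@Type','nvarchar(20)') as Type,
    a.value('@Min','int')           as Min,
    a.value('@Max','int')           as Max
from @xml.nodes('//Woninglabel[@SeqNumber="2"]') x(a);
```

With this sample, only the Leeftijd row (Id 2, Min 0, Max 999) comes back. Filtering on the extracted columns, as in the query above, is usually easier to read; the predicate form can be handy when only one or two labels are ever needed.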