Tuesday, January 3, 2012

Handling special characters with FOR XML PATH('')

Suppose I want to concatenate the list of user databases on my system, in alphabetical order.
select
  stuff(
     (select ', ' + name -- Note the lack of column name
     from sys.databases
     where database_id > 4
     order by name
     for xml path('')
     )
   , 1, 2, '') as namelist;
This is easy, and it just works. The STUFF command lets me strip the leading comma and space, and my list of database names is done.
But if I wanted to list the names of database with triangular brackets around them… then that’s bit more complicated.
select
  stuff(
     (select ', <' + name + '>'
     from sys.databases
     where database_id > 4
     order by name
     for xml path('')
     )
   , 1, 2, '') as namelist;
It still runs, but I my results don’t show the triangular brackets, it shows <databasename>, <databasename2>. It’s not what I want to see. I effectively need to HTMLDecode the results – but T-SQL doesn’t have a function like that.
However, if I actually make my FOR XML call return actual well-formed XML, then I can extract the contents out, and it returns the data to me in its original (correct) form.
select
  stuff(
     (select ', <' + name + '>'
     from sys.databases
     where database_id > 4
     order by name
     for xml path(''), root('MyString'), type      ).value('/MyString[1]','varchar(max)')
   , 1, 2, '') as namelist;
To be valid XML, it needs a ROOT element. I also need to tell it to be XML, using the TYPE directive.
But this lets me hook into the value of /MyString[1], and return that as varchar(max).
And it works, my data comes back as , , etc.

I can also skip the ROOT element (but keep TYPE) and use .value('(./text())[1]','varchar(max)') - but for some reason it's always felt strange to me to be returning something which I declare to be XML (using TYPE) without it being well-formed.

No comments:

Post a Comment