cancel
Showing results for 
Search instead for 
Did you mean: 

How to use openxml() to list lower level nodes and filter on higher level nodes?

VolkerBarth
Contributor
7,029

Sigh. That's not the most attractive kind of question title, methinks...

I'm about to use openxml() to import from a typical XML file, and I'm looking how to parse subtrees. Basically, I want to select several nodes with particular entries and all subnotes below them.

In SQL, the tree structure would be designed via a set of FK-related tables (say, named "level_1" to "level_7"), where most child tables can have several rows per parent (i.e. as one-to-many relationships), so I would query something like

select level_4.name, level_5.name, level_7.value
from level_4 key join level_5 key join level_6 key join level_7
where level_4.name in ('value_1', 'value_2', 'value_10')
order by 1;

Of course, I could import the whole XML nodes into according tables, and then query that in a similar way, but that's not desired here.

Ideally, I would try to use one according openxml() call - but unfortunaley I'm running into several issues:

  • When the main xpath expression (i.e. the 2nd parameter of openxml) is focussed on a higher level (say, level_4), and I'm trying to request the lower level nodes (say, level_7.value) in the WITH clause, I will only get the first match per level_4 element, as the docs tell correctly
    When there are multiple matches for a column expression, the first match in the document order (the order of the original XML document before it was parsed) is used.

So the main xpath expression has to focus on the lowest level, if there are one-to-many relationships involved.

  • I can only set very simply filters in the main xpath expression, e.g. "level_1/[level_2="value_1"]". In particular, I seemingly cannot set multiple filters or combine them, as asked in this FAQ. So I guess the filtering has to be done afterwards, i.e. by using a WHERE clause on the final result set.

Question: If these limits do hold, are the following general guidelines correct?

  1. Focus the main xpath expression on the lowest level needed, so you get one entry per lowest level node.
  2. Use parent axis expressions in the WITH clause to access the desired parent nodes.
  3. Instead of filtering via xpath expression, use ordinary WHERE conditions to filter the result set.

So, for the hinted sample (assuming the values are designed as elements, not attributes):

select * from openxml(myXmlFile),
  '//level_7')
   with (level_4_name long varchar '../../../name',
         level_5_name long varchar '../../name',
         level_7_value long varchar 'value')
where level_4_name in ('value_1', 'value_2', 'value_10')
order by 1;

(The apparent drawback compared to directly filtering via xpath would be the fact that the whole XML file needs to be parsed, and openxml() may return a much bigger interim result set than needed.)

Or am I missing the point here completely?


I guess this is kind of a general question on how to "flatten" XML structures in a way similar to joining tables...

Accepted Solutions (1)

Accepted Solutions (1)

VolkerBarth
Contributor
0 Kudos

In lack of another response, I guess my above assumptions have proven correct over time. 🙂

Answers (0)