|
Page 4 of 6 XPath support in MySQL 5.1 and 6.0 In this section, we discuss the XPath functionality provided for use with the ExtractValue() and UpdateXML() function in MySQL 5.1 and MySQL 6.0. Supported functions and operators. Basic XPath expressions (known as locators) are supported. This includes locators using the following operators: - / (slash) operator. This operator acts in a manner similar to how a slash behaves in a Unix filesystem path, where the leading / represents the root of the document, and the identifier following it is the name of an XML element. For example, /book matches a top-level book element. Multiple slashes can be used to trace a branch; for example, /book/chapter matches <book><chapter/></book> where book is a top-level element and chapter is a child of that element. A leading double-slash (//) means that the pattern is matched anywhere in the XML document; for example, the locator //section matches a section element wherever it is found, regardless of this element' relationship to any other elements in the document.
- * (wildcard) operator. This operator matches any element. For example, //chapter/* matches any element that is a child of a chapter element, anywhere in the document, and /*/section matches any section that is a child of a top-level element.
- | (UNION) operator. This operator can be used to combine locators when you wish to match against any of them. For example, //section|//paragraph matches any section or paragraph element, anywhere in the document.
- Attribute matching. You can match an element by the value of one (or more) of its attributes using the syntax element [@ attribute = value ]. For example, //section[@id='xpath-locators'] matches a section anywhere in the document having an id attribute with the value “ xpath-locators ”.
Note You can match against multiple attribute values by merely combining them. For instance, the locator //paragraph[@role='intro'][@title='XPath support'] matches any paragraph element (anywhere in the XML document to be tested) having a role attribute whose value is “ intro” and a title attribute whose value is “ XPath support”. To find elements for which the same attribute matches one of several values, you can use multiple locators joined by the | operator. For example, to match all paragraph elements, anywhere in the XML to be tested, whose title attributes have either of the values “ Example” or “ Syntax ”, you would use the expression //paragraph[@title="Example"]|//paragraph[@title="Syntax"]. You can also use the logical or operator for this purpose: //paragraph[@title="Example" or @title="Syntax"]. The difference between or and | is that or joins conditions, while | joins result sets. - Namespaces are not explicitly supported. However, element names containing : are allowed, so you can work with XML markup that uses namespace notation. For example, //person:biography matches the tag <person:biography/> anywhere in the document.
- All of the standard XPath comparison operators (or, and, =, !=, <=, <, >=, and >) are supported. For example, consider the following:
mysql> SET @xml = '<foo bar="2">123</foo><foo bar="6">456</foo>'; Query OK, 0 rows affected (0.00 sec)
mysql> SELECT ExtractValue(@xml, '//foo[@bar="2"]'); +---------------------------------------+ | extractvalue(@xml, '//foo[@bar="2"]') | +---------------------------------------+ | 123 | +---------------------------------------+ 1 row in set (0.00 sec)
mysql> SELECT ExtractValue(@xml, '//foo[@bar>"2"]'); +---------------------------------------+ | extractvalue(@xml, '//foo[@bar>"2"]') | +---------------------------------------+ | 456 | +---------------------------------------+ 1 row in set (0.00 sec)
mysql> SELECT ExtractValue(@xml, '//foo[@bar>="2"]'); +----------------------------------------+ | extractvalue(@xml, '//foo[@bar>="2"]') | +----------------------------------------+ | 123 456 | +----------------------------------------+ 1 row in set (0.01 sec)
Notice that when multiple matches are found, ExtractValue() returns them in a single space-delimited string. - You can also use indexing in locators to identify particular elements of interest, as we showed earlier with //row[2]/field[1].
ExtractValue(), as it name suggests, obtains a value; it does not return any XML elements. Here is an example that should help make this clear: mysql> SET @xml = '<a>3<b><c>5<d/></c></b></a>'; Query OK, 0 rows affected (0.00 sec)
mysql> SELECT ExtractValue(@xml, '//a'), ExtractValue(@xml, '//b'), ExtractValue(@xml, '//c'); +---------------------------+---------------------------+---------------------------+ | ExtractValue(@xml, '//a') | ExtractValue(@xml, '//b') | ExtractValue(@xml, '//c') | +---------------------------+---------------------------+---------------------------+ | 3 | | 5 | +---------------------------+---------------------------+---------------------------+ 1 row in set (0.01 sec)
ExtractValue(@xml, '//a') returns only the content of any <a> elements; it does not return any of the elements <b>, <c>, or <d>, or any of the content of these elements. ExtractValue(@xml, '//c') returns only the content of any <c> elements. Because the sole <b> element contains no text, but rather only other XML elements, ExtractValue(@xml, '//b') returns an empty string. Unsupported XPath functions and operators. Many XPath functions and operators are supported; however, MySQL's XPath support is still under development, and so some of these are not yet implemented. These include the following limitations: -
Relative locator expressions are resolved in the context of the root node. For example, consider the following query and result: mysql> SELECT ExtractValue( -> '<book> -> <paragraph title="Example">P1</paragraph> -> <paragraph title="Syntax">P2</paragraph> -> </book>', -> 'book/paragraph' -> ) AS paras; +-------+ | paras | +-------+ | P1 P2 | +-------+ 1 row in set (0.03 sec) In this case, the locator book/paragraph is resolved as /book/paragraph. Relative locators are also supported within predicates. In the following example, ../@title="Example" is resolved as /book/paragraph/@title="Example": mysql> SELECT ExtractValue( -> '<book> -> <paragraph title="Example"><body>P1</body></paragraph> -> <paragraph title="Syntax"><body>P2</body></paragraph> -> </book>', -> 'book/paragraph/body[../@title="Example"]') -> AS para; +------+ | para | +------+ | P1 | +------+ 1 row in set (0.00 sec)
-
The :: operator is not supported in combination with node types such as axis ::comment(), axis ::text(), axis ::processing-instructions(), and axis ::node(). However, name tests (such as axis :: name and axis ::*) are supported, as shown in these examples: mysql> SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b'); +-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b') | +-------------------------------------------------------+ | x | +-------------------------------------------------------+ 1 row in set (0.02 sec)
mysql> SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*'); +-------------------------------------------------------+ | ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*') | +-------------------------------------------------------+ | x y | +-------------------------------------------------------+ 1 row in set (0.01 sec)
-
“Up-and-down” navigation is not supported in cases where the path would lead “above” the root element. That is, you cannot use expressions which match on descendants of ancestors of a given element, where one or more of the ancestors of the current element is also an ancestor of the root element (see Bug #16321). -
The following XPath functions are not supported: id(), lang(), local-name(), name() (a patch has recently been submitted to implement this function, and is likely to appear in MySQL 6.0), namespace-uri(), normalize-space(), starts-with(), string(), substring-after(), substring-before(), and translate(). -
The following axes are not supported: following-sibling, following, preceding-sibling, and preceding. Error handling. For both ExtractValue() and UpdateXML(), the XPath locator used must be valid and the XML to be searched must be well-formed. If the locator is invalid, an error is generated: mysql> SELECT ExtractValue('<foo bar="2">123</foo>', '//foo[@bar>="2"]'); +------------------------------------------------------------+ | ExtractValue('<foo bar="2">123</foo>', '//foo[@bar>="2"]') | +------------------------------------------------------------+ | 123 | +------------------------------------------------------------+ 1 row in set (0.00 sec)
mysql> SELECT ExtractValue('<foo bar="2">123</foo>', '//foo@bar>="2"]');'); ERROR 1105 (HY000): XPATH syntax error: '@bar>="2"]'
If the XML to be searched is not well-formed, then NULL is returned, and a warning is issued: mysql> SELECT ExtractValue('<foo bar="2"123</foo>', '//foo[@bar>="2"]'); +-----------------------------------------------------------+ | ExtractValue('<foo bar="2"123</foo>', '//foo[@bar>="2"]');') | +-----------------------------------------------------------+ | NULL | +-----------------------------------------------------------+ 1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS; +---------+------+--------------------------------------------------------------------------------------------+ | Level | Code | Message| +---------+------+--------------------------------------------------------------------------------------------+ | Warning | 1522 | Incorrect XML value: 'parse error at line 1 pos 13: unknown token unexpected ('>' wanted)' | +---------+------+--------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
However, the replacement XML used as the third argument to UpdateXML() is not checked for well-formedness. Note Space does not permit a complete discussion of XPath syntax and usage here. For definitive information, see the XML Path Language (XPath) 1.0 standard. The Zvon.org XPath Tutorial also provides a useful resource for those who are new to XPath or who would like a refresher in XPath basics. User variables in XPath expressions. Beginning with MySQL 5.1.20, you can employ user variables in XPath locators in either (or both) of two forms: -
Weakly checked. A variable using the syntax $@ variable_name is not checked for type or for whether it has previously been assigned a value. No warnings or errors are issued by the server if a variable has the wrong type or is undefined. In other words, you are responsible for any typographical errors or omissions. For example, if you use $@myvairable instead of $@myvariable, and $@myvairable has not been assigned a value, then MySQL assumes that $@myvairable has a “none” value of the appropriate type, such as 0 or an empty string. Example. mysql> SET @xml = '<a><b>X</b><b>Y</b></a>'; Query OK, 0 rows affected (0.00 sec)
mysql> SET @i =1; @j = 2; Query OK, 0 rows affected (0.00 sec)
mysql> SELECT @i, ExtractValue(@xml, '//b[$@i]'); +------+--------------------------------+ | @i | ExtractValue(@xml, '//b[$@i]') | +------+--------------------------------+ | 1 | X | +------+--------------------------------+ 1 row in set (0.00 sec)
mysql> SELECT @j, ExtractValue(@xml, '//b[$@j]'); +------+--------------------------------+ | @j | ExtractValue(@xml, '//b[$@j]') | +------+--------------------------------+ | 2 | Y | +------+--------------------------------+ 1 row in set (0.00 sec)
mysql> SELECT @k, ExtractValue(@xml, '//b[$@k]'); +------+--------------------------------+ | @k | ExtractValue(@xml, '//b[$@k]') | +------+--------------------------------+ | NULL | | +------+--------------------------------+ 1 row in set (0.00 sec)
-
Strongly checked. A variable using the syntax $ variable_name can be declared and used with MySQL's XPath functions when they are called inside stored procedures. Such a variable is local to the stored procedure in which it is defined, and it is checked for type and value. Example. mysql> DELIMITER |
mysql> CREATE PROCEDURE myproc () -> BEGIN -> DECLARE i INT DEFAULT 1; -> DECLARE xml VARCHAR(25) DEFAULT '<a>X</a><a>Y</a><a>Z</a>'; -> -> WHILE i < 4 DO -> SELECT xml, i, ExtractValue(xml, '//a[$i]'); -> SET i = i+1; -> END WHILE; -> END | Query OK, 0 rows affected (0.01 sec)
mysql> DELIMITER ;
mysql> CALL myproc; +--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 1 | X | +--------------------------+---+------------------------------+ 1 row in set (0.00 sec)
+--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 2 | Y | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec)
+--------------------------+---+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+---+------------------------------+ | <a>X</a><a>Y</a><a>Z</a> | 3 | Z | +--------------------------+---+------------------------------+ 1 row in set (0.01 sec)
Expressions containing user-defined variables of either sort must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath specification.
|