|
Page 2 of 6 Importing XML and XML Data In this section, we discuss some techniques for importing data from XML into a MySQL table. Using LOAD_FILE(). The simplest way to store XML in MySQL is to use the LOAD_FILE() function to open an entire XML document, make it available as a string, and insert this string into a table column. Using the 2cities.xml file created earlier, it is possible to do something like this: mysql> USE xmltest; mysql> CREATE TABLE xmldocs ( -> id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, -> doc_content BLOB NOT NULL, -> comment VARCHAR(100) NOT NULL DEFAULT '' -> ); Query OK, 0 rows affected (0.04 sec)
mysql> INSERT INTO xmldocs VALUES -> (NULL, LOAD_FILE('/tmp/2cities.xml'), '2 cities file'); Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM xmldocs\G *************************** 1. row *************************** id: 1 doc_content: <?xml version="1.0"?>
<resultset statement="SELECT name,country FROM cities LIMIT 2 " xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <row> <field name="name">Mumbai (Bombay)</field> <field name="country">India</field> </row>
<row> <field name="name">Seoul</field> <field name="country">South Korea</field> </row> </resultset>
comment: 2 cities file 1 row in set (0.00 sec)
Importing mysqldump XML data into MySQL table columns using a stored procedure. Loading an entire XML file into a single row of a MySQL table solves the problem of getting the XML into MySQL, where it can be parsed using MySQL 5.1's XPath functions (see the section called “XPath Functionality”); however having to do so every time you want to access the data s not terribly convenient. However, MySQL developer Alexander Barkov has written a stored procedure xmldump_load that extracts data from the XML elements and attributes found in a file created using mysqldump --xml, and inserts this data into the columns of a MySQL table. The source of this stored procedure is shown here: DELIMITER |
DROP PROCEDURE IF EXISTS xmldump_load |
CREATE PROCEDURE xmldump_load(file_name VARCHAR(128), database_name VARCHAR(128), table_name VARCHAR(128)) BEGIN DECLARE xml TEXT;
DECLARE nrows INT; DECLARE rownum INT DEFAULT 1;
DECLARE ncols INT; DECLARE colnum INT DEFAULT 1;
DECLARE ins_list TEXT DEFAULT ''; DECLARE val_list TEXT DEFAULT '';
DECLARE tmp VARCHAR(255);
# load the XML file's contents into a string SET xml = LOAD_FILE(file_name);
# get the number of <row>s in this table SET nrows = ExtractValue(xml, 'count(/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row)');
# get the number of <field>s (columns) in this table SET ncols = ExtractValue(xml, 'count(/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row[1]/field)');
# for each <row> WHILE rownum <= nrows DO
# for each <field> (column) WHILE colnum <= ncols DO SET tmp = ExtractValue(xml, '/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row[$rownum]/field[$colnum]/@name'); SET ins_list = CONCAT(ins_list, tmp, IF(colnum<ncols, ',', '')); SET tmp = ExtractValue(xml, '/mysqldump/database[@name=$database_name]/table_data[@name=$table_name]/row[$rownum]/field[$colnum]'); SET val_list = CONCAT(val_list, '''', tmp ,'''', IF(colnum<ncols, ',', '')); SET colnum = colnum + 1; END WHILE;
SET @ins_text = CONCAT('INSERT INTO t1 (', ins_list, ') VALUES (', val_list, ')');
SET ins_list = ''; SET val_list = '';
PREPARE stmt FROM @ins_text; EXECUTE stmt;
SET rownum = rownum + 1; SET colnum = 1; END WHILE; END |
DELIMITER ;
This stored procedure employs XPath user variables (see User variables in XPath expressions) and so runs only under MySQL 5.1.20 or later. You can find a copy of it in the file create-xmpdump-load.sql included with this article. Important The caller of this stored procedure must have the MySQL FILE privilege. You can test xmldump_load using the supplied test.sh script, shown here: # Demo for xmldump_load()
DB="test"
# Change "root" and "mypass" in the following 2 lines to # a user and password appropriate to your installation
MYSQL="mysql -uroot -pmypass --socket=/tmp/mysql.sock" MYSQLDUMP="mysqldump -uroot -pmypass --socket=/tmp/mysql.sock"
# Creates a test table with two columns and fills it with some data
$MYSQL $DB <> END SELECT VERSION(); DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a INT, b VARCHAR(128)); INSERT INTO t1 VALUES (1,'11111'); INSERT INTO t1 VALUES (2,'22222'); INSERT INTO t1 VALUES (3,'33333'); INSERT INTO t1 VALUES (4,'44444'); INSERT INTO t1 VALUES (5,'55555'); INSERT INTO t1 VALUES (6,'66666'); INSERT INTO t1 VALUES (7,'77777'); INSERT INTO t1 VALUES (8,'88888'); INSERT INTO t1 VALUES (9,'99999'); END
# Dumps data into an XML file
$MYSQLDUMP --xml $DB t1 > /tmp/t1.xml
# Empties the table
$MYSQL --execute="DELETE FROM t1" $DB
# Creates the procedure, calls it, and # makes sure we've restored all records
$MYSQL $DB << END \. create-xmldump-load.sql CALL xmldump_load('/tmp/t1.xml', 'test', 't1'); SELECT * FROM t1; END
# Performs cleanup
# Comment out the remaining lines if you wish to # preserve the stored procedure, table, and XML # file following the test run
$MYSQL $DB << END DROP PROCEDURE xmldump_load; DROP TABLE t1; END
rm /tmp/t1.xml
Using LOAD XML. An implementation contributed by Erik Wetterberg of a new SQL statement has been accepted for MySQL 6.0, and is available beginning with version 6.0.3. LOAD XML greatly simplifies the task of importing data from an XML file into a MySQL table, without having to use the stored procedure discussed previously (see Importing mysqldump XML data into MySQL table columns using a stored procedure). The syntax for this statement is as shown here: LOAD XML [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'filename' [REPLACE | IGNORE] INTO TABLE [db_name.]tbl_name [CHARACTER SET charset_name] [ROWS IDENTIFIED BY '<tagname>'] [IGNORE number [LINES | ROWS]] [(column_or_user_var,...)] [SET col_name = expr,...]
This statement reads data from an XML file into a table, and acts as the complement mysql or mysqldump in XML output mode (that is, using the --xml option). The filename must be given as a literal string. The tagname in the optional ROWS IDENTIFIED BY clause must also be given as a literal string, and must be surrounded by angle brackets (< and >). LOAD DATA clauses used in LOAD XML. If you have used MySQL's LOAD DATA statement before, then you should find the following clauses familiar, since they work in essentially the same way for LOAD XML as they do for LOAD DATA: - LOW_PRIORITY or CONCURRENT
- LOCAL
- REPLACE or IGNORE
- CHARACTER SET
- The IGNORE number LINES clause is analogous to the LOAD DATA statement's IGNORE ... LINES clause. LOAD XML also accepts IGNORE number ROWS. In either case, the clause causes the first number rows in the XML file to be skipped, and not to be imported.
- (column_or_user_var,...)
- SET
See LOAD DATA INFILE Syntax in the MySQL 5.1 Manual for more information about these clauses. ROWS IDENTIFIED BY '<tagname>'. Accepted input formats.LOAD XML supports three different XML formats: - Attributes are interpreted as column names, and attribute values as interpreted as column values:
<row column1="value1" column2="value2" .../>
- Tag names are interpreted as column names, and the content of these tags are interpreted as column values:
<row> <column1>value1</column1> <column2>value2</column2> </row>
- Table column names are derived from the name attributes of <field> tags, and column values are taken from the contents of these tags:
<row> <field name='column1'>value1</field> <field name='column2'>value2</field> </row>
This is the format used by MySQL tools such as mysqldump The import routine used by LOAD XML automatically detects the format used for each row and interprets it correctly, matching based on the tag or attribute name and the column name. You can easily verify this for yourself, by creating an XML file that uses any two or even all three formats and then using LOAD XML to import it into a table. Important You must have the FILE privilege to use LOAD XML.
|