XML: An Introduction to XPath Syntax
Why do we need XPath?
With dom4j alone, you cannot jump straight to an element nested several levels deep. You have to walk down the tree one level at a time, which is tedious.
XPath solves this: it lets you address a node directly by its path, so reading a specific node becomes very convenient.
XPath is usually used together with dom4j. To use it, you need to add one more jar to the classpath: jaxen-1.1-beta-6.jar
The basic XPath syntax covers the following points:
1. Basic XPath syntax is similar to locating files in a file system. If a path starts with a slash /, it is an absolute path to an element.
(1) /AAA selects the root element AAA
<AAA> <!-- selected --> <BBB/> <CCC/> <BBB/> <BBB/> <DDD> <BBB/> </DDD> <CCC/> </AAA>
(2) /AAA/CCC selects all CCC child elements of AAA
<AAA> <BBB/> <CCC/> <!-- selected --> <BBB/> <BBB/> <DDD> <BBB/> </DDD> <CCC/> <!-- selected --> </AAA>
(3) /AAA/DDD/BBB selects all BBB child elements of the DDD children of AAA
<AAA> <BBB/> <CCC/> <BBB/> <BBB/> <DDD> <BBB/> <!-- selected --> </DDD> <CCC/> </AAA>
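The three absolute paths above can be checked with a short program. This is only a minimal sketch: it assumes the JDK's built-in javax.xml.xpath API rather than dom4j so that it runs without extra jars, and the class name AbsolutePathDemo and the helper count are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AbsolutePathDemo {
    // The sample document from point 1, written as well-formed XML.
    static final String XML =
        "<AAA><BBB/><CCC/><BBB/><BBB/><DDD><BBB/></DDD><CCC/></AAA>";

    // Hypothetical helper: how many nodes does the expression select in XML?
    static int count(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/AAA"));         // 1: the root element
        System.out.println(count("/AAA/CCC"));     // 2: both CCC children of AAA
        System.out.println(count("/AAA/DDD/BBB")); // 1: only the BBB inside DDD
    }
}
```

Note that /AAA/DDD/BBB skips the three BBB elements that sit directly under AAA, because each step of an absolute path must match exactly one level.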
So how do you use XPath in dom4j? It is quite simple:
import java.io.File;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.io.SAXReader;

// 1. Create the SAXReader parser
SAXReader saxReader = new SAXReader();
// 2. Parse the file at the given path
Document document = saxReader.read(new File(path));
// 3. Query freely with XPath:
//    document.selectNodes(args) returns multiple elements
//    document.selectSingleNode(args) returns a single element
List nodes = document.selectNodes("/AAA/BBB");
Once dom4j has given you the Document object, you can call its selectNodes(args) method. It returns a List of the nodes matched by the XPath expression you pass in, and from there you work with the nodes as you normally would in dom4j.
There is also a selectSingleNode(args) method, which returns a single Node.
The following points cover more XPath syntax:
2. If a path starts with a double slash //, it selects every element in the document that matches the rule after the //, regardless of where the element sits in the hierarchy.
(1) //BBB selects all BBB elements
<AAA>
  <BBB/> <!-- selected -->
  <CCC/>
  <BBB/> <!-- selected -->
  <DDD> <BBB/> <!-- selected --> </DDD>
  <CCC> <DDD> <BBB/> <!-- selected --> <BBB/> <!-- selected --> </DDD> </CCC>
</AAA>
(2) //DDD/BBB selects all BBB elements whose parent is a DDD element
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <DDD> <BBB/> <!-- selected --> </DDD>
  <CCC> <DDD> <BBB/> <!-- selected --> <BBB/> <!-- selected --> </DDD> </CCC>
</AAA>
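To see the difference between //BBB and //DDD/BBB in practice, the following sketch counts the matches for each. It assumes the JDK's built-in javax.xml.xpath API in place of dom4j so that nothing beyond the JDK is required; DescendantPathDemo and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescendantPathDemo {
    // The sample document from point 2.
    static final String XML =
        "<AAA><BBB/><CCC/><BBB/><DDD><BBB/></DDD>"
        + "<CCC><DDD><BBB/><BBB/></DDD></CCC></AAA>";

    // Hypothetical helper: how many nodes does the expression select in XML?
    static int count(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//BBB"));     // 5: every BBB at any depth
        System.out.println(count("//DDD/BBB")); // 3: only BBB directly under a DDD
    }
}
```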
3. The asterisk * selects all elements located by the path that precedes it.
(1) /AAA/CCC/DDD/* selects all child elements of /AAA/CCC/DDD:
<AAA>
  <XXX> <DDD> <BBB/> <BBB/> <EEE/> <FFF/> </DDD> </XXX>
  <CCC> <DDD> <BBB/> <!-- selected --> <BBB/> <!-- selected --> <EEE/> <!-- selected --> <FFF/> <!-- selected --> </DDD> </CCC>
  <CCC> <BBB> <BBB> <BBB/> </BBB> </BBB> </CCC>
</AAA>
(2) /*/*/*/BBB selects all BBB elements that have exactly three ancestor elements
<AAA>
  <XXX> <DDD> <BBB/> <!-- selected --> <BBB/> <!-- selected --> <EEE/> <FFF/> </DDD> </XXX>
  <CCC> <DDD> <BBB/> <!-- selected --> <BBB/> <!-- selected --> <EEE/> <FFF/> </DDD> </CCC>
  <CCC> <BBB> <BBB> <!-- selected --> <BBB/> </BBB> </BBB> </CCC>
</AAA>
(3) //* selects all elements
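The wildcard expressions from point 3 can be tried out as follows. This is a sketch under the assumption that the JDK's javax.xml.xpath API is an acceptable stand-in for dom4j; WildcardDemo and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WildcardDemo {
    // The sample document from point 3.
    static final String XML =
        "<AAA>"
        + "<XXX><DDD><BBB/><BBB/><EEE/><FFF/></DDD></XXX>"
        + "<CCC><DDD><BBB/><BBB/><EEE/><FFF/></DDD></CCC>"
        + "<CCC><BBB><BBB><BBB/></BBB></BBB></CCC>"
        + "</AAA>";

    // Hypothetical helper: how many nodes does the expression select in XML?
    static int count(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/AAA/CCC/DDD/*")); // 4: children of /AAA/CCC/DDD
        System.out.println(count("/*/*/*/BBB"));     // 5: BBB elements at depth four
        System.out.println(count("//*"));            // 17: every element
    }
}
```

In the last CCC branch, /*/*/*/BBB matches only the middle BBB: the outer one has two ancestors and the innermost one has four.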
4. An expression in square brackets narrows the selection further. A number gives the element's position in the selected set, and the last() function refers to the last element in that set. Note that positions start at 1, not 0!
(1) /AAA/BBB[1] selects the first BBB child element of AAA
<AAA> <BBB/> <!-- selected --> <BBB/> <BBB/> <BBB/> </AAA>
(2) /AAA/BBB[last()] selects the last BBB child element of AAA
<AAA> <BBB/> <BBB/> <BBB/> <BBB/> <!-- selected --> </AAA>
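A small sketch makes the 1-based indexing visible. It assumes the JDK's javax.xml.xpath API instead of dom4j, and it adds a hypothetical n attribute to each BBB purely so the selected element can be identified; PositionDemo and value are likewise illustrative names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PositionDemo {
    // Four BBB children; the n attribute is an assumption added for illustration.
    static final String XML =
        "<AAA><BBB n='1'/><BBB n='2'/><BBB n='3'/><BBB n='4'/></AAA>";

    // Hypothetical helper: string value of the first node the expression selects,
    // or the empty string when nothing matches.
    static String value(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(value("/AAA/BBB[1]/@n"));      // 1: positions start at 1
        System.out.println(value("/AAA/BBB[last()]/@n")); // 4: the last BBB
        System.out.println(value("/AAA/BBB[0]/@n").isEmpty()); // true: index 0 matches nothing
    }
}
```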
5. Working with attributes
(1) //@id selects all id attributes. Note that it returns the id attribute nodes themselves, not the elements that carry an id attribute.
<AAA> <BBB id="b1"/> <!-- this id attribute node is returned --> <BBB id="b2"/> <!-- so is this one --> <BBB name="bbb"/> <BBB/> </AAA>
(2) //BBB[@id] selects all BBB elements that have an id attribute
<AAA> <BBB id="b1"/> <!-- this BBB is returned --> <BBB id="b2"/> <!-- so is this one --> <BBB name="bbb"/> <BBB/> </AAA>
(3) //BBB[@name] selects all BBB elements that have a name attribute
<AAA> <BBB id="b1"/> <BBB id="b2"/> <BBB name="bbb"/> <!-- this BBB is returned --> <BBB/> </AAA>
(4) //BBB[@*] selects all BBB elements that have at least one attribute
<AAA> <BBB id="b1"/> <!-- returned --> <BBB id="b2"/> <!-- returned --> <BBB name="bbb"/> <!-- returned --> <BBB/> </AAA>
(5) //BBB[not(@*)] selects all BBB elements that have no attributes
<AAA> <BBB id="b1"/> <BBB id="b2"/> <BBB name="bbb"/> <BBB/> <!-- selected --> </AAA>
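The five attribute expressions above can be compared side by side. The sketch below assumes the JDK's javax.xml.xpath API as a substitute for dom4j; AttributeDemo and count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AttributeDemo {
    // The sample document from point 5.
    static final String XML =
        "<AAA><BBB id='b1'/><BBB id='b2'/><BBB name='bbb'/><BBB/></AAA>";

    // Hypothetical helper: how many nodes does the expression select in XML?
    static int count(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//@id"));          // 2: the attribute nodes themselves
        System.out.println(count("//BBB[@id]"));     // 2: BBB elements carrying an id
        System.out.println(count("//BBB[@name]"));   // 1
        System.out.println(count("//BBB[@*]"));      // 3: BBB with any attribute
        System.out.println(count("//BBB[not(@*)]")); // 1: BBB with no attribute
    }
}
```

//@id and //BBB[@id] happen to return the same count here, but the first yields attribute nodes and the second yields elements.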
6. Attribute values can be used as selection criteria
(1) //BBB[@id='b1'] selects the BBB elements that have an id attribute whose value is 'b1'
<AAA> <BBB id="b1"/> <!-- selected --> <BBB name="bbb"/> <BBB name="bbb"/> </AAA>
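Filtering by attribute value might look like this in code. As before, this is a sketch that assumes the JDK's javax.xml.xpath API rather than dom4j; AttributeValueDemo and count are hypothetical names, and the last two expressions are extra illustrations that do not appear in the original list.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AttributeValueDemo {
    // The sample document from point 6.
    static final String XML =
        "<AAA><BBB id='b1'/><BBB name='bbb'/><BBB name='bbb'/></AAA>";

    // Hypothetical helper: how many nodes does the expression select in XML?
    static int count(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//BBB[@id='b1']"));    // 1
        System.out.println(count("//BBB[@name='bbb']")); // 2
        System.out.println(count("//BBB[@id='b2']"));    // 0: no such value in this document
    }
}
```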
7. The count() function counts the number of selected elements
(1) //*[count(BBB)=2] selects the elements that have exactly 2 BBB child elements
<AAA>
  <CCC> <BBB/> <BBB/> <BBB/> </CCC>
  <DDD> <!-- this element is returned --> <BBB/> <BBB/> </DDD>
  <EEE> <CCC/> <DDD/> </EEE>
</AAA>
(2) //*[count(*)=2] selects the elements that have exactly 2 child elements
<AAA>
  <CCC> <BBB/> <BBB/> <BBB/> </CCC>
  <DDD> <!-- this element is returned --> <BBB/> <BBB/> </DDD>
  <EEE> <!-- so is this one --> <CCC/> <DDD/> </EEE>
</AAA>
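To confirm which elements the two count() predicates pick out, the sketch below lists the names of the matched elements. It assumes the JDK's javax.xml.xpath API in place of dom4j; CountDemo and names are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CountDemo {
    // The sample document from point 7.
    static final String XML =
        "<AAA><CCC><BBB/><BBB/><BBB/></CCC>"
        + "<DDD><BBB/><BBB/></DDD>"
        + "<EEE><CCC/><DDD/></EEE></AAA>";

    // Hypothetical helper: names of the nodes the expression selects in XML.
    static List<String> names(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Only DDD has exactly two BBB children (CCC has three).
        System.out.println(names("//*[count(BBB)=2]"));
        // DDD and EEE each have exactly two children of any name.
        System.out.println(names("//*[count(*)=2]"));
    }
}
```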
XPath has much more syntax, including many functions, but those are used less often and are not covered here.
The syntax points introduced above can also be combined freely. Take the following XML document:
<AAA>
  <BBB id="b1">
    <CCC> <KKK>k1</KKK> </CCC>
    <CCC> <KKK>k2</KKK> <!-- this one --> </CCC>
  </BBB>
  <BBB id="b2"/>
  <BBB name="bbb"/>
</AAA>
To find the KKK child of the 2nd CCC child of the 1st BBB child of AAA, write the XPath as: /AAA/BBB[1]/CCC[2]/KKK
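The combined path can be verified by reading the text of the KKK element it selects. This sketch assumes the JDK's javax.xml.xpath API instead of dom4j; CombinedPathDemo and value are hypothetical names, and the third expression is an extra illustration of mixing an attribute test with last().

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class CombinedPathDemo {
    // The sample document used for the combined expression.
    static final String XML =
        "<AAA><BBB id='b1'>"
        + "<CCC><KKK>k1</KKK></CCC>"
        + "<CCC><KKK>k2</KKK></CCC>"
        + "</BBB><BBB id='b2'/><BBB name='bbb'/></AAA>";

    // Hypothetical helper: string value of the first node the expression selects.
    static String value(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(value("/AAA/BBB[1]/CCC[2]/KKK"));             // k2
        System.out.println(value("/AAA/BBB[1]/CCC[1]/KKK"));             // k1
        System.out.println(value("/AAA/BBB[@id='b1']/CCC[last()]/KKK")); // k2
    }
}
```

With dom4j, the same expression strings would be passed to document.selectNodes(args) or document.selectSingleNode(args).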