XML and XSLT support has been part of the ABAP programming language for over twenty years, and most of it has stayed the same during that time. With ABAP Environment 2411 or ABAP release 913, three new functions from the XSLT/XPath 2.0 specification have been added to deal with string replacements and regular expressions.
ABAP has long supported XSLT 1.0, together with some functions and instructions from the XSLT 1.1 and XSLT 2.0 specifications, such as for-each-group from XSLT 2.0.
However, the ABAP implementation has been missing powerful functions for string replacements and matching. These are crucial for a language like XSLT, which is used to transform XML documents. This has seriously limited the usefulness of XSLT in ABAP.
In 2411, we introduced the XPATH functions matches and replace in the default namespace, and sap:tokenize in the SAP namespace. These behave as specified in the official W3C specification XQuery 1.0 and XPath 2.0 Functions and Operators.
This was made possible by the introduction of the XPATH regular expression dialect in ABAP with SAP Basis 782 / SAP S/4HANA CE 2011; see Julius's excellent blog post Modern Regular Expressions in ABAP – Part 3. In that release, XPATH regular expressions were only available through the ABAP class CL_ABAP_REGEX and not in XPATH/XSLT itself. This has now changed.
Note that the regular expression dialect used for these functions is the dialect specified in the W3C specification and not the usual PCRE or POSIX regular expressions used more often in ABAP itself.
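To give a feel for the dialect, here is a minimal sketch that assumes the functions follow the W3C rules, as stated above. It uses two features of the W3C dialect: character class subtraction and the x flag. The element names dialect, subtraction and extended are made up for illustration.

```xml
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <dialect>
      <!-- Character class subtraction from the W3C dialect:
           lower-case letters except vowels -->
      <subtraction>
        <xsl:value-of select="matches('xyz', '^[a-z-[aeiou]]+$')"/>
      </subtraction>
      <!-- Flag 'x': whitespace inside the pattern is ignored -->
      <extended>
        <xsl:value-of select="matches('Cat', 'C . t', 'x')"/>
      </extended>
    </dialect>
  </xsl:template>
</xsl:transform>
```

Under the W3C rules, both tests should evaluate to true.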
The XPATH function matches() tests whether a string contains a match for a regular expression.
Purpose: Find regular expression in a string
Arguments:
1. string to be searched
2. string containing the regular expression
3. (optional) string with options (case insensitive search, ...)
Return type: boolean
Consider the following XML document:
<pets>
  <dog name="Cat" color="grey"/>
  <cat name=" Cat" color="white"/>
  <cat name="Cat" color="orange"/>
  <cat name="Cot" color="brown"/>
  <cat name="CAT Garfield" color="purple"/>
  <cat name="The Cat" color="pink"/>
</pets>
Consider the following XSLT snippet applied to the above XML document:
<xsl:for-each select="/pets/cat[matches(@name, '^C.t')]">
  <pet>
    <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
    <xsl:attribute name="color"><xsl:value-of select="@color"/></xsl:attribute>
  </pet>
</xsl:for-each>
This returns only the following elements, because the pattern is anchored at the start of the name and the match is case-sensitive:
<pet name="Cat" color="orange"/>
<pet name="Cot" color="brown"/>
Changing the matches-call to matches(@name, '^C.t', 'i') will do a case-insensitive search instead.
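As a sketch, the case-insensitive variant could be embedded in a complete stylesheet like the one below. The wrapper element result and the use of attribute value templates instead of xsl:attribute are our choices for brevity, not something the snippet above requires.

```xml
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <!-- "result" is an illustrative wrapper element -->
    <result>
      <!-- The third argument 'i' makes the match case-insensitive -->
      <xsl:for-each select="/pets/cat[matches(@name, '^C.t', 'i')]">
        <pet name="{@name}" color="{@color}"/>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:transform>
```

In addition to the orange and brown cats, this should also select the purple cat named CAT Garfield. The names " Cat" and "The Cat" are still rejected because the pattern is anchored at the start of the string.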
The replace() function is similar to matches() but performs a replacement. The replacement string can refer to submatches of the regular expression, as in similar functionality in ABAP or, for example, in VS Code.
Purpose: Replace all non-overlapping occurrences of the regular expression in the search string with a replacement string
Arguments:
1. string to be searched
2. string containing the regular expression
3. string containing the replacement
4. (optional) string with options (case insensitive search, ...)
Return type: string with modifications
Note that the syntax of the replacement string in the XPATH standard has some unusual corner cases; for example, a literal $ or \ must be escaped with a backslash. Please consult the W3C specification if a replacement does not behave as you expect.
Using the same XML as above, the following XSLT snippet (note that the replacement is case-insensitive)
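The following sketch illustrates the replacement syntax, assuming the W3C rules apply unchanged. The first call is an example taken from the W3C specification; the other two inputs and all element names are made up for illustration. The expected values in the comments are what the W3C rules prescribe.

```xml
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <examples>
      <!-- $1 inserts the first captured group (example from the W3C spec);
           expected: abbraccaddabbra -->
      <group>
        <xsl:value-of select="replace('abracadabra', 'a(.)', 'a$1$1')"/>
      </group>
      <!-- $0 inserts the whole match; expected: [Cat] Garfield -->
      <whole>
        <xsl:value-of select="replace('Cat Garfield', 'C.t', '[$0]')"/>
      </whole>
      <!-- A literal dollar sign must be written as \$; expected: price: $5 -->
      <dollar>
        <xsl:value-of select="replace('price: 5', '[0-9]+', '\$$0')"/>
      </dollar>
    </examples>
  </xsl:template>
</xsl:transform>
```

According to the specification, a $ that is not followed by a digit, or a \ that is not followed by \ or $, is an error in the replacement string.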
<xsl:for-each select="/pets/cat[matches(@name, '^C.t', 'i')]">
  <pet>
    <xsl:attribute name="name">
      <xsl:value-of select="replace(@name, '^C(.)t', '$1', 'i')"/>
    </xsl:attribute>
    <xsl:attribute name="color"><xsl:value-of select="@color"/></xsl:attribute>
  </pet>
</xsl:for-each>
returns the following result:
<pet name="a" color="orange"/>
<pet name="o" color="brown"/>
<pet name="A Garfield" color="purple"/>
The tokenize() function splits a string at separators, which can be given via a regular expression.
Purpose: Split a string at a separator regular expression
Arguments:
1. string to be searched
2. string containing the regular expression
3. (optional) string with options (case insensitive search, ...)
Return type: node-set of XML text nodes consisting of the split tokens
The function tokenize() is special, as it returns a sequence of strings in the XPATH standard. A sequence is a datatype introduced with XPATH/XSLT 2.0 and does not exist in the XPATH implementation available in ABAP. We therefore changed the return type to a node-set, that is, a set of unrelated XML text nodes, each holding a single token. For all practical purposes, this behaves like the standard tokenize() function. Nevertheless, the function is only exposed in the SAP namespace.
Note that there is also a one-argument version of tokenize() with special semantics. If you are interested, look it up in the standard.
We again use the same XML as above and apply the following XSLT snippet:
<xsl:for-each select="/pets/cat">
  <pet>
    <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
    <xsl:attribute name="position"><xsl:value-of select="position()"/>
    </xsl:attribute>
    <xsl:for-each select="sap:tokenize(@name, '\s+')">
      <token>
        <xsl:copy-of select="."/>
      </token>
    </xsl:for-each>
  </pet>
</xsl:for-each>
This will return:
<pet name=" Cat" position="1">
  <token>Cat</token>
</pet>
<pet name="Cat" position="2">
  <token>Cat</token>
</pet>
<pet name="Cot" position="3">
  <token>Cot</token>
</pet>
<pet name="CAT Garfield" position="4">
  <token>CAT</token>
  <token>Garfield</token>
</pet>
<pet name="The Cat" position="5">
  <token>The</token>
  <token>Cat</token>
</pet>
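Because sap:tokenize lives in the SAP namespace, the sap prefix has to be declared in the stylesheet. The sketch below shows one way a complete stylesheet might look. It assumes the namespace URI http://www.sap.com/sapxsl, which ABAP XSLT programs conventionally bind to the sap prefix but which this article does not state, so please check it in your system. The result wrapper element and the tokens attribute are made up for illustration.

```xml
<!-- Assumption: http://www.sap.com/sapxsl is the URI bound to the sap prefix -->
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               xmlns:sap="http://www.sap.com/sapxsl"
               exclude-result-prefixes="sap">
  <xsl:template match="/">
    <result>
      <xsl:for-each select="/pets/cat">
        <!-- The result is a node-set, so node-set functions such as count() apply -->
        <pet name="{@name}" tokens="{count(sap:tokenize(@name, '\s+'))}">
          <xsl:for-each select="sap:tokenize(@name, '\s+')">
            <token><xsl:value-of select="."/></token>
          </xsl:for-each>
        </pet>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:transform>
```

The exclude-result-prefixes attribute keeps the sap namespace declaration out of the generated XML.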
This article introduced the new regular expression functions matches, replace and sap:tokenize in ABAP XSLT, which are the first new functions in ABAP XSLT for decades.