kilian_kilger
Advisor

XML and XSLT support has been part of the ABAP programming language for more than twenty years, and little has changed in that time. With ABAP Environment 2411 or ABAP release 913, three new functions from the XSLT/XPath 2.0 specification have been added for string replacement and regular expressions.

1. Introduction

For years, ABAP has supported XSLT 1.0 together with some functions and instructions from the XSLT 1.1 and XSLT 2.0 specifications, such as for-each-group from XSLT 2.0.

However, the ABAP implementation has been missing powerful functions for string replacements and matching. These are crucial for a language like XSLT, which is used to transform XML documents. This has seriously limited the usefulness of XSLT in ABAP.

In 2411, we introduced the XPATH functions matches and replace in the default namespace and sap:tokenize in the SAP namespace. These behave as specified in the official W3C specification XQuery 1.0 and XPath 2.0 Functions and Operators.

This was made possible by the introduction of the XPATH regular expression dialect in ABAP with SAP Basis 782 / SAP S/4HANA CE 2011; see Julius's excellent blog post Modern Regular Expressions in ABAP – Part 3. In that release, XPATH regular expressions could only be used via the ABAP class CL_ABAP_REGEX and not in XPATH/XSLT itself. This has now changed.

Note that the regular expression dialect used for these functions is the dialect specified in the W3C specification and not the usual PCRE or POSIX regular expressions used more often in ABAP itself. 

2. The function "matches()"

The XPATH function matches() tests whether a string contains a match for a regular expression. The match may occur anywhere in the string unless the pattern is anchored with ^ or $.

Purpose: Find a regular expression in a string
Arguments:
    1. string to be searched
    2. string containing the regular expression
    3. (optional) string with options (case insensitive search, ...)
Return type: boolean

Consider the following XML document:

<pets>
  <dog name="Cat" color="grey"/>
  <cat name=" Cat" color="white"/>
  <cat name="Cat" color="orange"/>
  <cat name="Cot" color="brown"/>
  <cat name="CAT Garfield" color="purple"/>
  <cat name="The Cat" color="pink"/>
</pets>

Consider the following XSLT snippet applied to the above XML document:

<xsl:for-each select="/pets/cat[matches(@name, '^C.t')]">
  <pet>
    <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
    <xsl:attribute name="color"><xsl:value-of select="@color"/></xsl:attribute>
  </pet>
</xsl:for-each>

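This returns only the following elements. The dog is excluded by the path, " Cat" starts with a space, "CAT Garfield" fails the case-sensitive match, and in "The Cat" the match is not at the start of the string: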

<pet name="Cat" color="orange"/>
<pet name="Cot" color="brown"/>

Changing the call to matches(@name, '^C.t', 'i') performs a case-insensitive search instead, which additionally selects "CAT Garfield".
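
The snippets in this post are fragments. As a minimal sketch, a complete stylesheet around the case-insensitive variant could look as follows. The wrapper element result is a hypothetical name chosen for illustration, and the attribute value templates are just a shorter XSLT 1.0 spelling of the xsl:attribute instructions used above.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- "result" is a hypothetical wrapper so the output has a single root element -->
    <result>
      <!-- 'i' makes the match case-insensitive; ^ anchors it at the start of @name -->
      <xsl:for-each select="/pets/cat[matches(@name, '^C.t', 'i')]">
        <pet name="{@name}" color="{@color}"/>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:transform>
```

Applied to the sample document, this should select the cats named "Cat", "Cot", and "CAT Garfield".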

3. The function "replace()"

The replace() function is similar to matches() but performs a replacement. The replacement string can refer to submatches of the regular expression, as known from similar functionality in ABAP or, for example, Visual Studio Code.

Purpose: Replace all non-overlapping occurrences of the regular expression in the search string with a replacement string
Arguments:
    1. string to be searched
    2. string containing the regular expression
    3. string containing the replacement
    4. (optional) string with options (case insensitive search, ...)
Return type: string with modifications

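Note that the syntax of the replacement string in the XPATH standard has some surprising corner cases. According to the W3C specification, $N refers to the N-th submatch and $0 to the whole match, a literal dollar sign must be written as \$ and a literal backslash as \\, and any other use of $ or \ in the replacement string is an error. Please consult the W3C specification if in doubt.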

Applying the following XSLT snippet to the same XML as above (note that the replacement is case-insensitive)

<xsl:for-each select="/pets/cat[matches(@name, '^C.t', 'i')]">
  <pet>
    <xsl:attribute name="name">
      <xsl:value-of select="replace(@name, '^C(.)t', '$1', 'i')"/>
    </xsl:attribute>
    <xsl:attribute name="color"><xsl:value-of select="@color"/></xsl:attribute>
  </pet>
</xsl:for-each>

returns the following result:

<pet name="a" color="orange"/>
<pet name="o" color="brown"/>
<pet name="A Garfield" color="purple"/>
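
Submatch references are not limited to a single group. The following sketch, which assumes the W3C semantics of replace(), swaps the two words of a two-word name. The wrapper element result is a hypothetical name chosen for illustration.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- "result" is a hypothetical wrapper element -->
    <result>
      <xsl:for-each select="/pets/cat">
        <!-- $1 and $2 refer to the first and second parenthesized group;
             the pattern only matches names made of exactly two words -->
        <pet name="{replace(@name, '^(\S+)\s+(\S+)$', '$2 $1')}"/>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:transform>
```

Under the W3C rules, "CAT Garfield" would become "Garfield CAT", while a name the pattern does not match, such as "Cat", is returned unchanged.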

4. The function "tokenize()"

The tokenize() function splits a string at separators, which can be given via a regular expression. 

Purpose: Split a string at a separator regular expression
Arguments:
    1. string to be searched
    2. string containing the regular expression
    3. (optional) string with options (case insensitive search, ...)
Return type: node-set of XML text nodes consisting of the split tokens

The function tokenize() is special because, in the XPATH standard, it returns a sequence of strings. Sequences are a data type introduced with XPATH/XSLT 2.0 that does not exist in the XPATH implementation available in ABAP. We therefore changed the return type to a node-set, that is, a set of unrelated XML text nodes, each containing a single token. For all practical purposes, this behaves like the standard tokenize() function. Nevertheless, because of this deviation the function is only exposed in the SAP namespace as sap:tokenize().

Note there exists a one-argument version of tokenize() with special semantics. If you are interested, look it up in the standard.

We again use the same XML as above and apply the following XSLT snippet:

<xsl:for-each select="/pets/cat">
  <pet>
    <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
    <xsl:attribute name="position"><xsl:value-of select="position()"/></xsl:attribute>
    <xsl:for-each select="sap:tokenize(@name, '\s+')">
      <token>
        <xsl:copy-of select="."/>
      </token>
    </xsl:for-each>
  </pet>
</xsl:for-each>

This will return:

<pet name=" Cat" position="1">
  <token>Cat</token>
</pet>
<pet name="Cat" position="2">
  <token>Cat</token>
</pet>
<pet name="Cot" position="3">
  <token>Cot</token>
</pet>
<pet name="CAT Garfield" position="4">
  <token>CAT</token>
  <token>Garfield</token>
</pet>
<pet name="The Cat" position="5">
  <token>The</token>
  <token>Cat</token>
</pet>
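
Because sap:tokenize() returns a node-set, it can in principle be passed to any XPATH 1.0 function that accepts node-sets. The sketch below counts the words of each name with count(). It assumes that the sap prefix is bound to http://www.sap.com/sapxsl, the namespace URI commonly declared in ABAP XSLT programs; please verify this on your system. The element name result and the attribute name words are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               xmlns:sap="http://www.sap.com/sapxsl">
  <!-- The sap namespace URI above is an assumption; check it on your system -->

  <xsl:template match="/">
    <!-- "result" is a hypothetical wrapper element -->
    <result>
      <xsl:for-each select="/pets/cat">
        <!-- count() takes the node-set of tokens and returns how many there are -->
        <pet name="{@name}" words="{count(sap:tokenize(@name, '\s+'))}"/>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:transform>
```

Given the tokens listed above, "CAT Garfield" and "The Cat" should each yield words="2".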

5. Conclusion

This article introduced the new regular expression functions matches, replace, and sap:tokenize in ABAP XSLT, which are the first new functions in ABAP XSLT for decades.

 
