Application Development Discussions

Reverse XML Special Character Escaping

alex_campbell
Contributor

Hi All,

I'm looking for a way to reverse XML special-character escaping in ABAP. Essentially, I want the reverse of the ESCAPE function. I have strings that contain escaped XML snippets, like the one below, and I need to convert the escaped special characters back and write the result to a file. The trivial approach would be a REPLACE of the five basic escaped characters, but I know the standard is more complicated than that (for example, you can use &#34; in place of &quot;, and characters in CDATA sections must not be escaped). I'd like my solution to be as standards-compliant as possible. Does anyone have any advice?

Thanks,

Alex

Example Escaped Snippet:


&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;

Desired Result:

<?xml version="1.0" encoding="utf-8"?>
1 ACCEPTED SOLUTION

Juwin
Active Contributor

Hi Alex,

If you use the standard SAP classes to read and parse the XML, the parser unescapes the values for you. An example program is given below. My example XML string has a node named body whose value contains a > symbol, which in escaped form is &gt;. The program reads the value back, converting &gt; to >, and writes it out.


    report xml_parse.

    data: lv_xml    type ref to cl_xml_document,
          lv_elemnt type ref to if_ixml_element,
          lv_value  type string.

    create object lv_xml.
    " Parse an in-memory document whose body text contains the escaped form &gt;.
    lv_xml->parse_string( |<?xml version="1.0" encoding="UTF-8"?><note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>&gt;Don't forget me this weekend!</body></note>| ).
    " Look up the body node; the down cast assigns it to the element reference.
    lv_elemnt ?= lv_xml->find_node( 'body' ).
    " get_value( ) returns the text with &gt; already converted back to >.
    lv_value = lv_elemnt->get_value( ).
    write lv_value.

Output from program:

    >Don't forget me this weekend!

Thanks,

Juwin
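
A minimal sketch of how the same parser-based approach might be applied to an escaped fragment that is not wrapped in any element, such as the snippet in the question. It reuses only the cl_xml_document calls from the program above. The report name z_unescape_sketch and the dummy element name wrap are assumptions made for illustration, and the sketch has not been checked against a particular release.

```abap
report z_unescape_sketch.

data: lo_xml     type ref to cl_xml_document,
      lo_node    type ref to if_ixml_node,
      lv_escaped type string,
      lv_plain   type string.

" Escaped input; it mixes named (&quot;) and numeric (&#34;) references on purpose.
lv_escaped = `&lt;?xml version=&quot;1.0&quot; encoding=&#34;utf-8&#34;?&gt;`.

create object lo_xml.

" Wrap the escaped text in a dummy root element (hypothetical name: wrap)
" so that the parser treats it as ordinary character data.
lo_xml->parse_string( |<wrap>{ lv_escaped }</wrap>| ).

" Read the text of the dummy element; the parser has resolved the references.
lo_node = lo_xml->find_node( 'wrap' ).
if lo_node is bound.
  lv_plain = lo_node->get_value( ).
endif.

write / lv_plain.
```

Because a real XML parser does the work, numeric references are resolved together with the named ones, which a plain REPLACE of the five named entities would miss. The approach only works if the escaped input is well formed as character data, that is, it contains no bare & or < characters.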

5 REPLIES

VenkatRamesh_V
Active Contributor

This message was moderated.

alex_campbell
Contributor

Hi Venkat,

It looks to me like the code you posted will replace the characters '[!@#$%^&*+-= ]' with spaces. Is that right? If so, it's definitely not what I'm looking for. In my situation, I'm dealing specifically with special characters that were escaped for XML (so a different set than the ones you've given). And in my case they've already been escaped, so I don't want to replace the special characters, I want to restore them.

Please take a look at the example I've given and let me know if you can help.

Thanks,

Alex

alex_campbell
Contributor

Thanks Juwin!

This is very helpful. Unfortunately, my situation has turned out to be more complicated than I had hoped.

The escaped XML snippets that I need to reverse contain material descriptions, and those material descriptions can also contain some of the special XML characters. By reversing the XML escaping, I'm also reversing the escaping of the contents of the material description, which breaks the resulting XML. Using the iXML classes, I have the following result:

Example Escaped Snippet:


&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;&lt;MATERIAL MAKTX=&quot;Material1&quot;(OneInch)&quot;&gt;&lt;MATERIAL&gt; 

Result of iXML GET_VALUE (note the extra quote in the MAKTX attribute):


<?xml version="1.0" encoding="utf-8"?><MATERIAL MAKTX="Material1"(OneInch)"><MATERIAL> 

Ideal Result (The quote for 1" would still be escaped, but the quotes that wrap the attribute value would be reversed):


<?xml version="1.0" encoding="utf-8"?><MATERIAL MAKTX="Material1&quot;(OneInch)"><MATERIAL> 

My intuition is that the system cannot know which escaped characters need to be reversed and which need to remain escaped for the resulting XML to be valid. Does anyone know if it's possible to solve this issue? Or are we out of luck?

Juwin
Active Contributor

Your escaped XML seems incorrect.

If you escape

<?xml version="1.0" encoding="utf-8"?><MATERIAL MAKTX="Material1&quot;(OneInch)"><MATERIAL>


the result should be

&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;&lt;MATERIAL MAKTX=&quot;Material1&amp;quot;(OneInch)&quot;&gt;&lt;MATERIAL&gt;


and not

&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;&lt;MATERIAL MAKTX=&quot;Material1&quot;(OneInch)&quot;&gt;&lt;MATERIAL&gt;


Thanks,

Juwin
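
The nesting described in this reply can be sketched as follows: the description is escaped once when the inner document is built, the finished inner document is escaped a second time, and parsing then reverses exactly one level. This is only an illustration under assumptions: the report name, the dummy element wrap, and the choice of cl_abap_format=>e_xml_attr for both levels are not from the thread, and the exact set of characters the built-in escape function replaces depends on the format constant used. The MATERIAL element is written self-closing here to keep the inner document well formed.

```abap
report z_nested_escape_sketch.

data: lo_xml   type ref to cl_xml_document,
      lo_node  type ref to if_ixml_node,
      lv_maktx type string,
      lv_inner type string,
      lv_outer type string,
      lv_back  type string.

lv_maktx = `Material1"(OneInch)`.

" Level 1: escape the description while building the inner document,
" so the quote inside the attribute value is stored as &quot;.
lv_inner = |<MATERIAL MAKTX="{ escape( val = lv_maktx format = cl_abap_format=>e_xml_attr ) }"/>|.

" Level 2: escape the finished inner document as a whole.
" The & of the existing &quot; is escaped again, giving &amp;quot;.
lv_outer = escape( val = lv_inner format = cl_abap_format=>e_xml_attr ).

" Reverse exactly one level by parsing the outer text inside a dummy element.
create object lo_xml.
lo_xml->parse_string( |<wrap>{ lv_outer }</wrap>| ).
lo_node = lo_xml->find_node( 'wrap' ).
if lo_node is bound.
  lv_back = lo_node->get_value( ).
endif.

" Compare the unescaped text with the inner document built at level 1.
if lv_back = lv_inner.
  write / 'Round trip restored the inner document.'.
else.
  write / 'Round trip changed the inner document.'.
endif.
```

If the description is inserted into the attribute without the level 1 escape, the second escape has nothing to distinguish the inner quote from the attribute delimiters, which is the ambiguity seen earlier in the thread.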