<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Select text between two tags in Application Development and Automation Discussions</title>
    <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992336#M1697326</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi, Jorg&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I need to handle every pair of tags separately. I mean, if we have &amp;lt;SPAN STYLE="…"&amp;gt;…&lt;EM&gt;&amp;lt;SPAN STYLE="…"&amp;gt;…&amp;lt;/SPAN&amp;gt;&lt;/EM&gt;&amp;lt;/SPAN&amp;gt;, then both SPAN pairs need to be processed.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, I noticed something in the text you proposed as an example:&lt;/P&gt;&lt;P&gt;&lt;STRONG style="color: #000000; font-size: 12px; background-color: #f8f8f8; font-family: helvetica, arial;"&gt;text&amp;nbsp; = '&amp;lt;tag&amp;gt;this&amp;lt;/tag&amp;gt;asdfasdf&amp;lt;/tag&amp;gt;asdf&amp;lt;tag&amp;gt;all&amp;lt;/tag&amp;gt;&amp;lt;/tag&amp;gt;as&amp;lt;tag&amp;gt;matches&amp;lt;/tag&amp;gt;'.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #000000; font-family: helvetica, arial; font-size: 12px; background-color: #f8f8f8;"&gt;There is a problem with the tag hierarchy: the third tag, «&amp;lt;/tag&amp;gt;», is a closing tag, but there is no matching opening tag before it, so a correct tag hierarchy cannot be built.&lt;/SPAN&gt;&lt;STRONG style="color: #000000; font-size: 12px; background-color: #f8f8f8; font-family: helvetica, arial;"&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 12 Sep 2012 06:30:33 GMT</pubDate>
    <dc:creator>MikeB</dc:creator>
    <dc:date>2012-09-12T06:30:33Z</dc:date>
    <item>
      <title>Select text between two tags</title>
      <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992334#M1697324</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;How can I select the part of a text that lies between two anchors?&lt;/P&gt;&lt;P&gt;For instance, we have the sentence: «I love &lt;EM&gt;&lt;STRONG&gt;&amp;lt;some_anchor&amp;gt;&lt;/STRONG&gt;&lt;/EM&gt;to visit old Europe cities&lt;EM&gt;&lt;STRONG&gt;&amp;lt;/some_anchor&amp;gt;&lt;/STRONG&gt;&lt;/EM&gt; on holidays».&lt;/P&gt;&lt;P&gt;So, I want to select the text enclosed by some tag, e.g. &lt;EM&gt;&lt;STRONG&gt;&amp;lt;some_anchor&amp;gt;&lt;/STRONG&gt;&lt;/EM&gt;, and store it in an internal table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Should I use RegEx (regular expressions), or is a plain pattern enough?&lt;/P&gt;&lt;P&gt;And how can I do it?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 11 Sep 2012 22:30:35 GMT</pubDate>
      <guid>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992334#M1697324</guid>
      <dc:creator>MikeB</dc:creator>
      <dc:date>2012-09-11T22:30:35Z</dc:date>
    </item>
    <item>
      <title>Re: Select text between two tags</title>
      <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992335#M1697325</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It depends... are you trying to match just one specific tag pair, or are you trying to capture arbitrary XML/HTML-style text? It's generally accepted that you can't parse HTML with &lt;A href="http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454"&gt;regex&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regex will work, but be careful with it... try program DEMO_REGEX_TOY to experiment with your regular expressions. Consider the following:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt;&lt;SPAN class="L0S52"&gt;DATA &lt;/SPAN&gt;result &lt;SPAN class="L0S52"&gt;TYPE &lt;/SPAN&gt;match_result_tab.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt; &lt;SPAN class="L0S52"&gt;DATA &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;line &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;LIKE &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;LINE &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;OF &lt;/SPAN&gt;result.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt; &lt;SPAN class="L0S52"&gt;DATA &lt;/SPAN&gt;sub &lt;SPAN class="L0S52"&gt;TYPE &lt;/SPAN&gt;submatch_result.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt; &lt;SPAN class="L0S52"&gt;DATA &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;text &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;TYPE &lt;/SPAN&gt;string.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt; &lt;SPAN class="L0S52"&gt;text&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN class="L0S55"&gt;= &lt;/SPAN&gt;&lt;SPAN class="L0S33"&gt;'&amp;lt;tag&amp;gt;this&amp;lt;/tag&amp;gt;asdfasdf&amp;lt;/tag&amp;gt;asdf&amp;lt;tag&amp;gt;all&amp;lt;/tag&amp;gt;&amp;lt;/tag&amp;gt;as&amp;lt;tag&amp;gt;matches&amp;lt;/tag&amp;gt;'.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt; &lt;SPAN class="L0S52"&gt;FIND &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;ALL &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;OCCURRENCES &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;OF &lt;/SPAN&gt;REGEX &lt;SPAN class="L0S33"&gt;'&amp;lt;tag&amp;gt;((?:[^&amp;lt;]|&amp;lt;(?!/tag&amp;gt;))*)&amp;lt;/tag&amp;gt;' &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;IN &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;text &lt;/SPAN&gt;IGNORING &lt;SPAN class="L0S52"&gt;CASE &lt;/SPAN&gt;RESULTS result.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt; &lt;SPAN class="L0S52"&gt;LOOP &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;AT &lt;/SPAN&gt;result &lt;SPAN class="L0S52"&gt;INTO &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;line.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt;&amp;nbsp;&amp;nbsp; &lt;SPAN class="L0S52"&gt;LOOP &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;AT &lt;/SPAN&gt;&lt;SPAN class="L0S52"&gt;line&lt;/SPAN&gt;&lt;SPAN class="L0S70"&gt;-&lt;/SPAN&gt;submatches &lt;SPAN class="L0S52"&gt;INTO &lt;/SPAN&gt;sub.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;SPAN class="L0S52"&gt;WRITE: &lt;/SPAN&gt;/ text+sub&lt;SPAN class="L0S70"&gt;-&lt;/SPAN&gt;offset&lt;SPAN class="L0S55"&gt;(&lt;/SPAN&gt;sub&lt;SPAN class="L0S70"&gt;-&lt;/SPAN&gt;length&lt;SPAN class="L0S55"&gt;).&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="L0S52" style="font-family: 'courier new', courier;"&gt;&amp;nbsp;&amp;nbsp; ENDLOOP.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="L0S52" style="font-family: 'courier new', courier;"&gt; ENDLOOP.&lt;/SPAN&gt;&lt;/P&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The result of this is:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt;this&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt;all&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'courier new', courier;"&gt;matches&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The regex &lt;SPAN style="font-family: 'courier new', courier;"&gt;&lt;STRONG&gt;'&amp;lt;tag&amp;gt;((?:[^&amp;lt;]|&amp;lt;(?!/tag&amp;gt;))*)&amp;lt;/tag&amp;gt;'&lt;/STRONG&gt; &lt;SPAN style="font-family: arial, helvetica, sans-serif;"&gt;looks slightly confusing because regex quantifiers are greedy: a plain .* does not stop at the first closing tag but keeps matching up to the last closing tag in the text. The group in my pattern therefore accepts either any character other than "&amp;lt;", or a "&amp;lt;" that does not start a closing tag. In my example, the simpler regex &lt;STRONG style="font-family: 'courier new', courier;"&gt;&amp;lt;tag&amp;gt;(.*)&amp;lt;/tag&amp;gt;&lt;/STRONG&gt; would capture everything highlighted in red:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: arial, helvetica, sans-serif;"&gt;&lt;SPAN class="L0S33"&gt;'&amp;lt;tag&amp;gt;&lt;SPAN style="color: #ff0000;"&gt;this&amp;lt;/tag&amp;gt;asdfasdf&amp;lt;/tag&amp;gt;asdf&amp;lt;tag&amp;gt;all&amp;lt;/tag&amp;gt;&amp;lt;/tag&amp;gt;as&amp;lt;tag&amp;gt;matches&lt;/SPAN&gt;&amp;lt;/tag&amp;gt;'&lt;/SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 11 Sep 2012 23:54:20 GMT</pubDate>
      <guid>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992335#M1697325</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2012-09-11T23:54:20Z</dc:date>
    </item>
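    <!--
    A minimal sketch of the greedy versus tempered matching discussed in the reply above. It is written with Python's `re` module purely for illustration; the ABAP regex engine may differ in details. The names GREEDY, TEMPERED and between_tags are hypothetical and do not come from the thread.

```python
import re

# Sample text taken from the reply; it contains two stray closing tags.
TEXT = "<tag>this</tag>asdfasdf</tag>asdf<tag>all</tag></tag>as<tag>matches</tag>"

# Greedy: .* runs on to the LAST </tag> in the text.
GREEDY = re.compile(r"<tag>(.*)</tag>", re.IGNORECASE)

# Tempered: each step consumes either a character other than '<',
# or a '<' that is not the start of '</tag>'.
TEMPERED = re.compile(r"<tag>((?:[^<]|<(?!/tag>))*)</tag>", re.IGNORECASE)


def between_tags(pattern, text):
    """Return the first capture group of every match, left to right."""
    return [m.group(1) for m in pattern.finditer(text)]


greedy_hits = between_tags(GREEDY, TEXT)
tempered_hits = between_tags(TEMPERED, TEXT)

print(len(greedy_hits), "greedy match(es)")
print(tempered_hits)
```

    The greedy pattern yields a single oversized match, while the tempered one yields one entry per closed pair, which is the behaviour the reply is after.
    -->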
    <item>
      <title>Re: Select text between two tags</title>
      <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992336#M1697326</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi, Jorg&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I need to handle every pair of tags separately. I mean, if we have &amp;lt;SPAN STYLE="…"&amp;gt;…&lt;EM&gt;&amp;lt;SPAN STYLE="…"&amp;gt;…&amp;lt;/SPAN&amp;gt;&lt;/EM&gt;&amp;lt;/SPAN&amp;gt;, then both SPAN pairs need to be processed.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, I noticed something in the text you proposed as an example:&lt;/P&gt;&lt;P&gt;&lt;STRONG style="color: #000000; font-size: 12px; background-color: #f8f8f8; font-family: helvetica, arial;"&gt;text&amp;nbsp; = '&amp;lt;tag&amp;gt;this&amp;lt;/tag&amp;gt;asdfasdf&amp;lt;/tag&amp;gt;asdf&amp;lt;tag&amp;gt;all&amp;lt;/tag&amp;gt;&amp;lt;/tag&amp;gt;as&amp;lt;tag&amp;gt;matches&amp;lt;/tag&amp;gt;'.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #000000; font-family: helvetica, arial; font-size: 12px; background-color: #f8f8f8;"&gt;There is a problem with the tag hierarchy: the third tag, «&amp;lt;/tag&amp;gt;», is a closing tag, but there is no matching opening tag before it, so a correct tag hierarchy cannot be built.&lt;/SPAN&gt;&lt;STRONG style="color: #000000; font-size: 12px; background-color: #f8f8f8; font-family: helvetica, arial;"&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 12 Sep 2012 06:30:33 GMT</pubDate>
      <guid>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992336#M1697326</guid>
      <dc:creator>MikeB</dc:creator>
      <dc:date>2012-09-12T06:30:33Z</dc:date>
    </item>
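    <!--
    The two concerns raised in the post above (handling every nested pair separately, and closing tags that have no opener) can be approached with a small stack instead of a single regex. The following is only a sketch in Python, assuming attribute-free tags named "tag"; TOKEN and tag_pairs are hypothetical names, not something from the thread.

```python
import re

# Matches an opening <tag> or a closing </tag>; group 1 is "/" for closers.
TOKEN = re.compile(r"<(/?)tag>", re.IGNORECASE)


def tag_pairs(text):
    """Pair every closing tag with the nearest unmatched opening tag.

    Returns (inner, stray): `inner` holds the text of each matched pair in
    the order the pairs are closed (so nested pairs come before their
    parents); `stray` holds the offsets of closing tags without an opener.
    Opening tags that are never closed are silently ignored.
    """
    stack, inner, stray = [], [], []
    for m in TOKEN.finditer(text):
        if m.group(1):                      # closing tag
            if stack:
                inner.append(text[stack.pop():m.start()])
            else:
                stray.append(m.start())
        else:                               # opening tag
            stack.append(m.end())           # remember where its content starts
    return inner, stray


inner, stray = tag_pairs("<tag>a<tag>b</tag>c</tag>")
print(inner)
print(stray)
```

    For the nested input both the inner and the outer pair are reported, and for the sample text from the thread the unmatched closing tags end up in `stray` rather than corrupting the result.
    -->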
    <item>
      <title>Re: Select text between two tags</title>
      <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992337#M1697327</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;My example only matches "&amp;lt;tag&amp;gt;", and not "&amp;lt;tag ..something else..&amp;gt;". You'll need to come up with a somewhat more elaborate regex for the opening tag.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Yeah, nested tags are an issue. Parsing HTML with regex is not a good idea.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 12 Sep 2012 07:44:14 GMT</pubDate>
      <guid>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992337#M1697327</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2012-09-12T07:44:14Z</dc:date>
    </item>
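    <!--
    One possible shape for the "more elaborate opening tag regex" mentioned in the reply above, sketched in Python for illustration only. It assumes that attribute values never contain ">", and it still does not handle nested tags of the same name; tag_contents is a hypothetical helper name.

```python
import re


def tag_contents(name, html):
    """Return the text between <name ...> and </name>, ignoring case."""
    n = re.escape(name)
    pattern = re.compile(
        # \b stops "span" from matching "spanner"; [^>]* skips any attributes.
        rf"<{n}\b[^>]*>"
        # Content: any non-'<' character, or a '<' that does not begin </name>.
        rf"((?:[^<]|<(?!/{n}>))*)"
        rf"</{n}>",
        re.IGNORECASE,
    )
    return [m.group(1) for m in pattern.finditer(html)]


print(tag_contents("span", '<SPAN STYLE="color: red">one</SPAN> and <span>two</span>'))
```

    Building the pattern from the tag name keeps the opening and closing parts in sync, so the same helper can be reused for other tags.
    -->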
    <item>
      <title>Re: Select text between two tags</title>
      <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992338#M1697328</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I just thought of something... perhaps the XML parsing classes can help you out. After all, XML is all tags as well.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 12 Sep 2012 07:47:23 GMT</pubDate>
      <guid>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992338#M1697328</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2012-09-12T07:47:23Z</dc:date>
    </item>
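    <!--
    To illustrate the parser-based route suggested above, here is a rough sketch using Python's standard xml.etree.ElementTree rather than the ABAP XML classes, which the thread does not show. It assumes the input is well-formed XML with a single root element; texts_of is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def texts_of(tag, xml_text):
    """Return the full text content of every <tag> element, in document order.

    ET.fromstring raises ParseError for malformed input such as a closing
    tag without an opener, so broken hierarchies are detected up front.
    Tag names are case-sensitive here, unlike in the IGNORING CASE regex.
    """
    root = ET.fromstring(xml_text)
    return ["".join(el.itertext()) for el in root.iter(tag)]


print(texts_of("span", "<root><span>a<span>b</span>c</span></root>"))
```

    Because the parser builds the tree itself, nested elements of the same name are each returned separately, which a single regex cannot do reliably.
    -->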
    <item>
      <title>Re: Select text between two tags</title>
      <link>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992339#M1697329</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-size: 12px; background-color: #ffffff;"&gt;Hi, I solved my issue.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="background-color: #ffffff; color: #333333; font-size: 12px;"&gt;A detailed explanation of the subject is published in a separate post on my blog.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #333333; font-size: 12px; background-color: #ffffff;"&gt;«Regular expressions in ABAP. Approach to HTML processing with regex»:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;A _jive_internal="true" href="https://answers.sap.com/community/abap/blog/2012/10/08/a-regular-expression-regex-approach-to-html-processing" title="http://scn.sap.com/community/abap/blog/2012/10/08/a-regular-expression-regex-approach-to-html-processing"&gt;http://scn.sap.com/community/abap/blog/2012/10/08/a-regular-expression-regex-approach-to-html-processing&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 08 Oct 2012 03:38:46 GMT</pubDate>
      <guid>https://community.sap.com/t5/application-development-and-automation-discussions/select-text-between-two-tags/m-p/8992339#M1697329</guid>
      <dc:creator>MikeB</dc:creator>
      <dc:date>2012-10-08T03:38:46Z</dc:date>
    </item>
  </channel>
</rss>

