‎2009 Oct 02 11:31 PM
Hi experts!
I have to find the zip code (for Germany) in a given text. I can find 5 digits in my text, but my problem is ignoring numbers that consist of more than 5 digits!
My first try works for all cases but not for the last one.
FIND FIRST OCCURRENCE OF REGEX '([0-9]{5})' IN ld_string SUBMATCHES ld_plz.
D-12345 Mainz -> should match 12345
D 12345 Mainz -> should match 12345
12345 Mainz -> should match 12345
12345Mainz -> should match 12345
Mainz D-12345 -> should match 12345
D-123 45 Mainz -> error because of the space between the numbers
D-12333345 Mainz -> error because only 5 digits are valid for a German zip code; my REGEX does not work!
Thanks a lot!
Regards,
Florian
‎2009 Oct 02 11:37 PM
D-123 45 Mainz -> error because of the space between the numbers
find ` ` in ld_string. " ` ` is a pair of backquotes (the key beside 1 on the keyboard)
if sy-subrc = 0.
  message 'error, space not allowed' type 'E'.
endif.
and for this
D-12333345 Mainz -> error because only 5 digits are valid for a germany zip code;
I guess your code is correct, but you can try removing the parentheses:
find regex '[0-9]{5}' in ld_string.
‎2009 Oct 03 12:29 AM
Where's the question?
I think the introduction of REGEX in ABAP is revolutionary!
Regards,
Clemens
‎2009 Oct 03 12:43 AM
Hi Florian, check this:
FIND FIRST OCCURRENCE OF REGEX '^([0-9]{5})$' IN ld_string SUBMATCHES ld_plz.
‎2009 Oct 03 8:29 AM
Hi J@Y!
That's what I tried first. Unfortunately it does not seem to work. I tested it with DEMO_REGEX_TOY and it does not match any of my cases!? I wonder why!? I think that ^ and $ stand for the start and the end of the whole content of the variable LD_STRING, so it only matches when the string is exactly 12345.
Regards,
Florian
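A minimal sketch of the anchoring behaviour described above, assuming a system where FIND ... REGEX is available. The report name z_plz_anchor_sketch is hypothetical; the variable names follow the original post.

```abap
REPORT z_plz_anchor_sketch. " hypothetical report name

DATA: ld_string TYPE string,
      ld_plz    TYPE string.

" ^ and $ anchor the pattern to the start and end of the content,
" so the whole string must consist of exactly five digits.
ld_string = '12345'.
FIND FIRST OCCURRENCE OF REGEX '^([0-9]{5})$' IN ld_string SUBMATCHES ld_plz.
WRITE: / ld_string, 20 'SY-SUBRC=', sy-subrc.

" Any surrounding text ('D-' in front, ' Mainz' behind) prevents the match.
CLEAR ld_plz.
ld_string = 'D-12345 Mainz'.
FIND FIRST OCCURRENCE OF REGEX '^([0-9]{5})$' IN ld_string SUBMATCHES ld_plz.
WRITE: / ld_string, 20 'SY-SUBRC=', sy-subrc.
```

The first FIND should succeed and the second should fail, which is why the anchored pattern is too strict for free text.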
‎2009 Oct 03 4:29 PM
Hi Florian,
curiosity persists.
After some playing around I reduced the pattern using \d, the placeholder for any single digit. Then I noticed that the 5-digit sequence will also match 5 digits out of 6, so I used \D, the placeholder for any character other than a digit. I don't know how to recognize the (optional) line start or end as an alternative to a non-digit, so I just enclose the string to be checked in spaces. Please suggest a more elegant solution.
My test form
FORM regex.
  DATA:
    lv_subm   TYPE string,
    lt_string TYPE TABLE OF string.
  FIELD-SYMBOLS:
    <string> TYPE string.
  APPEND:
    'D-12345 Mainz' TO lt_string,
    'D 12345 Mainz' TO lt_string,
    '12345 Mainz'   TO lt_string,
    '12345Mainz'    TO lt_string,
    '123456Mainz'   TO lt_string,
    '123 45Mainz'   TO lt_string,
    'Mainz D-12345' TO lt_string.
  LOOP AT lt_string ASSIGNING <string>.
    CLEAR:
      lv_subm.
    " Pad with spaces so that \D can also match at the start and end of the string.
    CONCATENATE ` ` <string> ` ` INTO <string>.
    FIND REGEX '\D(\d{5})\D' IN <string> SUBMATCHES lv_subm.
    WRITE: / <string>, 20 'matches', 30 lv_subm, 40 'SY-SUBRC=', sy-subrc.
  ENDLOOP.
ENDFORM. " REGEX
creates this output:
D-12345 Mainz matches 12345 SY-SUBRC= 0
D 12345 Mainz matches 12345 SY-SUBRC= 0
12345 Mainz matches 12345 SY-SUBRC= 0
12345Mainz matches 12345 SY-SUBRC= 0
123456Mainz matches SY-SUBRC= 4
123 45Mainz matches SY-SUBRC= 4
Mainz D-12345 matches 12345 SY-SUBRC= 0
As I do not fully understand the meaning of FIRST OCCURRENCE, I just removed it.
Regards,
Clemens
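One possible way to avoid padding the string with spaces is to let the pattern accept the start or end of the string as an alternative to a non-digit. The sketch below assumes that the regex engine of the release in use supports non-capturing groups (?: ... ) together with the anchors ^ and $; the report name z_plz_regex_sketch is hypothetical, and the test strings are taken from the form above.

```abap
REPORT z_plz_regex_sketch. " hypothetical report name

DATA: lv_subm   TYPE string,
      lt_string TYPE TABLE OF string.
FIELD-SYMBOLS <string> TYPE string.

APPEND: 'D-12345 Mainz' TO lt_string,
        '12345Mainz'    TO lt_string,
        '123456Mainz'   TO lt_string,
        '123 45Mainz'   TO lt_string,
        'Mainz D-12345' TO lt_string.

LOOP AT lt_string ASSIGNING <string>.
  CLEAR lv_subm.
  " (?:^|\D) : start of the string OR a non-digit, not captured
  " (\d{5})  : the five digits, returned in lv_subm
  " (?:\D|$) : a non-digit OR the end of the string, not captured
  FIND REGEX '(?:^|\D)(\d{5})(?:\D|$)' IN <string> SUBMATCHES lv_subm.
  WRITE: / <string>, 20 lv_subm, 30 'SY-SUBRC=', sy-subrc.
ENDLOOP.
```

Because the groups around the digits are non-capturing, SUBMATCHES still needs only one target variable, and the strings in the table are left unchanged.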
‎2009 Oct 06 6:53 AM
Do not try to put everything into one regex; that makes it slow, hard to read and hard to maintain.
As a first approach I would add word boundaries:
FIND FIRST OCCURRENCE OF REGEX '(\<[0-9]{5}\>)' IN ld_string SUBMATCHES ld_plz.
This fixes most of your examples, but not 12345Mainz, because there is no word boundary between the digits and the letters.
If the regex above fails, I would try another regex that fits the last remaining example, maybe presenting the candidates in a list for user approval.
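The two-step idea could be sketched roughly as follows. The fallback pattern is only an assumption about what "another regex" might look like (five digits at a word start, directly followed by a letter), the report name z_plz_twostep_sketch is hypothetical, and the user approval step is reduced to a plain WRITE.

```abap
REPORT z_plz_twostep_sketch. " hypothetical report name

DATA: ld_string TYPE string VALUE '12345Mainz',
      ld_plz    TYPE string.

" Step 1: strict pattern, five digits delimited by word boundaries.
FIND FIRST OCCURRENCE OF REGEX '\<([0-9]{5})\>' IN ld_string SUBMATCHES ld_plz.

IF sy-subrc <> 0.
  " Step 2 (assumed fallback): five digits at a word start that are
  " directly followed by a letter, as in '12345Mainz'.
  FIND FIRST OCCURRENCE OF REGEX '\<([0-9]{5})[[:alpha:]]' IN ld_string SUBMATCHES ld_plz.
ENDIF.

IF sy-subrc = 0.
  " A real program might show the candidate to the user for confirmation here.
  WRITE: / 'Candidate zip code:', ld_plz.
ELSE.
  WRITE: / 'No zip code found in', ld_string.
ENDIF.
```

Keeping the two patterns separate means each one stays short, and the fallback only runs when the strict pattern finds nothing.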