‎2011 Dec 21 11:31 AM
Hi Experts,
I have a problem finding repeated patterns in a given string.
For example, take the string lv_string = '345723345982343452343345'.
If you check lv_string, you will see that it contains '345' several times.
So I want ABAP code that lists all such patterns that are repeated in a given string.
Thanks in advance.
Venky.
‎2011 Dec 21 12:06 PM
Vishnu,
The code is readily available in F1 help FIND - pattern
‎2011 Dec 21 11:57 AM
Hello,
It's not just CP / CA / NP / CO... etc.
And I am also not talking about the regular expressions we normally use via cl_abap_matcher or cl_abap_regex.
Try to understand the problem first:
NOTE: you are given just a string and nothing else; you will not be given any search key or pattern to look for.
We have to search the given string to find which parts of it are repeated again and again.
I hope this is enough to understand the problem.
thanks
Venky.
‎2011 Dec 21 11:59 AM
To be fair to the OP, the question is not trivial.
I challenge everyone here to write code to detect a string with patterns repeated more than once in the string. The pattern can be anything and is not known at design time.
Hint: regular expressions may be the key; normal ABAP pattern matching is not much use.
There are some programming snippets on the Web. They are not in ABAP, but they may still be useful:
http://social.msdn.microsoft.com/Forums/en/csharpgeneral/thread/037047fc-5506-4656-ad27-dab9a6c501ee
Edited by: Vishnu Tallapragada on Dec 21, 2011 1:01 PM
I just realized that the OP and I posted at the same time.
‎2011 Dec 21 12:06 PM
data: lv_string type string.
data: mcnt type i.
lv_string = '345723345982343452343345'.
find ALL OCCURRENCES OF '345' in lv_string MATCH COUNT mcnt.
write: / mcnt.
Did I win cookies?
‎2011 Dec 21 12:10 PM
Hello Maen Anachronos ,
Please try to understand the problem..
I have already said that you will be given just a big string like '345723345982343452343345'.
You have to find out which parts of the string are repeated again and again in it.
If there are multiple such parts that are repeated in the given string,
all of those repeated substrings should be in the output...
Interesting?....
Thanks,
Venky.
‎2011 Dec 21 12:11 PM
Using the FIND statement it should be easy to program this, even if the search string is not given.
‎2011 Dec 21 12:09 PM
Keshav and Maen - sorry but you still don't seem to get it.
We don't know what the pattern is.
We want to detect a string with repeated patterns, the pattern we don't know in advance.
It can be anything
345w2343234523
Vishnudoesn'tlikeVishnu
‎2011 Dec 21 12:12 PM
Oh, but I do understand.
Point is: you need to be creative to use the FIND statement.
‎2011 Dec 21 12:14 PM
Vishnu,
See this example in documentation
DATA: patt TYPE string VALUE `now`,
text TYPE string,
result_tab TYPE match_result_tab.
FIELD-SYMBOLS <match> LIKE LINE OF result_tab.
FIND ALL OCCURRENCES OF patt IN
`Everybody knows this is nowhere`
RESULTS result_tab.
LOOP AT result_tab ASSIGNING <match>.
WRITE: / <match>-offset, <match>-length.
ENDLOOP.
Mods - sorry for pasting the standard code; I did it just to demonstrate that it's as simple as that.
@OP- No Points please
‎2011 Dec 21 12:18 PM
data: gv_string type string.
data: gv_length type i.
data: gv_offset type i.
data: gv_search type string.
data: mcnt type i.
gv_string = '345723345982343452343345'.
gv_length = strlen( gv_string ).
write: / gv_string.
gv_offset = 0.
do gv_length times.
if gv_offset eq 0.
gv_search = gv_string(1).
else.
concatenate gv_search gv_string+gv_offset(1) into gv_search.
endif.
find ALL OCCURRENCES OF gv_search in gv_string MATCH COUNT mcnt.
write: /'Search for', gv_search, 'counted', mcnt.
gv_offset = gv_offset + 1.
enddo.
Now I expect a really big cake instead of a cookie.
‎2011 Dec 21 12:21 PM
I have provided that standard code...Did anybody look into it
‎2011 Dec 21 12:21 PM
hehe... and why not? It's not like he's going to search for repeating patterns in the holy bible.
And the challenge was: write a piece of code to do it.
‎2011 Dec 21 12:23 PM
> I have provided that standard code...Did anybody look into it
Hihihi... but he doesn't know the pattern up front. The problem is: look for a repeating pattern in a string.
‎2011 Dec 21 12:25 PM
Got it... Let me give my brain some exercise before going home, so that I can drive accurately.
‎2011 Dec 21 12:26 PM
I know it can be done by breaking the string into tokens of all possible lengths at all possible offsets and searching for those tokens in the string. To be honest, I was expecting some regular expression trick that can do the job as efficiently as possible :)
Maen, when I look at your code again, I don't think you are searching tokens at all offsets and of all possible lengths. Will check it out later.
‎2011 Dec 21 12:32 PM
> Maen, when I look at your code again, I don't think you are searching tokens at all offsets and of all possible lengths. Will check it out later.
That's right. I'm only starting from the beginning.
‎2011 Dec 21 12:41 PM
TYPES: BEGIN OF ty_search,
value TYPE string,
count TYPE i,
END OF ty_search.
DATA: it_search TYPE HASHED TABLE OF ty_search WITH UNIQUE KEY value.
DATA: wa_search TYPE ty_search.
DATA: gv_string TYPE string.
DATA: gv_length TYPE i.
DATA: gv_offset TYPE i.
DATA: gv_search TYPE string.
DATA: mcnt TYPE i.
gv_string = '345723345982343452343345'.
WRITE: / gv_string.
DO.
gv_length = STRLEN( gv_string ).
IF gv_length EQ 0.
EXIT.
ENDIF.
gv_offset = 0.
DO gv_length TIMES.
IF gv_offset EQ 0.
gv_search = gv_string(1).
ELSE.
CONCATENATE gv_search gv_string+gv_offset(1) INTO gv_search.
ENDIF.
READ TABLE it_search TRANSPORTING NO FIELDS WITH TABLE KEY value = gv_search.
IF sy-subrc NE 0.
FIND ALL OCCURRENCES OF gv_search IN gv_string MATCH COUNT mcnt.
IF mcnt GT 1.
wa_search-value = gv_search.
wa_search-count = mcnt.
INSERT wa_search INTO TABLE it_search.
ENDIF.
ENDIF.
gv_offset = gv_offset + 1.
ENDDO.
SHIFT gv_string LEFT BY 1 PLACES.
ENDDO.
LOOP AT it_search INTO wa_search.
WRITE: /'Search for', wa_search-value, 50 'counted', wa_search-count.
ENDLOOP.
Like this then: 2 really big cakes.
‎2011 Dec 21 1:07 PM
Ah.. small mistake..
Added:
DATA: gv_string2 TYPE string.
Changed:
gv_string = gv_string2 = '345723345982343452343345'.
Changed:
FIND ALL OCCURRENCES OF gv_search IN gv_string2 MATCH COUNT mcnt.
@OP: you just need to be a bit creative to find a solution.
‎2011 Dec 21 3:41 PM
This one shows the power and beauty of regular expressions.
Here, we find repeated non-blank patterns of 3 or more characters in a string and list them along with how many times they are repeated.
In the string, 'Today, as never before, the fates of men are so intimately linked to one another that a disaster for one is a disaster for everybody', the program lists the following
Count Pattern
----------------
2 ever
3 for
2 the
2 ate
2 one
2 disaster
The key ingredient of the program is the regular expression
([^ ]{3,}).*(\1)
which matches a run of 3 or more non-blank characters that occurs again later in the string, one repeated pair at a time. The rest you can figure out!
DATA: ls_string TYPE string VALUE 'Today, as never before, the fates of men are so intimately linked to one another that a disaster for one is a disaster for everybody'.
DATA: regex TYPE c LENGTH 120,
offset TYPE i.
DATA: lt_result TYPE match_result_tab,
ls_result TYPE LINE OF match_result_tab,
ls_submatch TYPE LINE OF match_result-submatches,
ls_pattern TYPE string.
DATA: BEGIN OF lt_found OCCURS 0,
pattern TYPE string,
count TYPE i,
END OF lt_found.
FIELD-SYMBOLS: <ls_found> LIKE lt_found.
regex = '([^ ]{3,}).*(\1)'.
offset = 0.
TRY.
    DO.
      FIND ALL OCCURRENCES OF REGEX regex IN ls_string RESULTS lt_result.
      IF sy-subrc = 0.
        READ TABLE lt_result INDEX 1 INTO ls_result.
        IF sy-subrc = 0.
          READ TABLE ls_result-submatches INTO ls_submatch INDEX 1.
          IF sy-subrc = 0.
            ls_pattern = ls_string+ls_submatch-offset(ls_submatch-length).
            READ TABLE lt_found ASSIGNING <ls_found> WITH KEY pattern = ls_pattern.
            IF sy-subrc NE 0.
              lt_found-count = 2.
              lt_found-pattern = ls_string+ls_submatch-offset(ls_submatch-length).
              APPEND lt_found.
            ELSE.
              ADD 1 TO <ls_found>-count.
            ENDIF.
            offset = ls_submatch-offset + ls_submatch-length.
            ls_string = ls_string+offset.
          ENDIF.
        ENDIF.
      ELSE.
        EXIT.
      ENDIF.
    ENDDO.
  CATCH cx_sy_regex.
    MESSAGE 'Invalid regular expression' TYPE 'S' DISPLAY LIKE 'E'. "#EC NOTEXT
ENDTRY.
LOOP AT lt_found.
WRITE:/ lt_found-count, lt_found-pattern.
ENDLOOP.
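To see in isolation what the backreference does, here is a minimal sketch. The variable names and the sample text 'one two one' are made up for illustration and are not part of the program above. Group 1 captures a run of 3 or more non-blank characters, and \1 only matches if exactly the same text shows up again later in the string.

```abap
DATA: lv_text  TYPE string VALUE `one two one`,  "hypothetical sample text
      lv_token TYPE string,
      lv_again TYPE string.

* Group 1: 3 or more non-blank characters.
* \1: the same characters must occur a second time further right.
FIND REGEX '([^ ]{3,}).*(\1)' IN lv_text SUBMATCHES lv_token lv_again.

IF sy-subrc = 0.
  "Both submatches hold the same text, since \1 repeats group 1.
  WRITE: / 'Repeated token:', lv_token.
ELSE.
  WRITE: / 'No token of 3 or more characters is repeated.'.
ENDIF.
```

With this sample text the repeated token should come out as 'one'. Changing {3,} to {1,} would also allow single characters as patterns.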
‎2011 Dec 21 3:52 PM
> You still owe me 2 really big cakes!
Hahaha.. I will, once I come to your place some day!
I tested yours and it is working fine, though I would want to restrict it to tokens above a certain length, say 3.
By the way, did you execute and test the code snippet I gave?
Of course, it will only compile on an ABAP 7 or above kernel.
Also, just the lines below can check whether there are repeated patterns in a string. Now try to match the power :)
regex = '([^ ]{3,}).*(\1)'.
FIND FIRST OCCURRENCE OF REGEX regex IN ls_string. "one hit is enough to set sy-subrc
IF sy-subrc = 0.
"Repeated patterns exist!
ENDIF.
‎2011 Dec 21 3:59 PM
Yup, works absolutely perfect! Well done!
I only fiddled around with regex 1 or 2 years ago, and only in a very simple form. And to be honest, I didn't expect this to be that easy to achieve with regex, although I suspect it took you some amount of trial and error to get it done.
Still: well done!
‎2011 Dec 21 4:06 PM
And additionally: this is the reason why i keep visiting SDN/SCN. To discover gems like this between all the .......
‎2011 Dec 21 4:13 PM
Yes, that feeling is mutual Maen!
Glad that we can all meet and learn new things!
Thanks to the OP for bringing this topic
‎2011 Dec 21 4:34 PM
Yup! Thanks to Vishnu! This '([^ ]{1,}).*(\1)' will certainly be useful someday.
m.
‎2011 Dec 21 7:29 PM
I think we scared OP away....
Regular Expressions can scare away even the bravest
‎2011 Dec 22 3:05 AM
Hi Guys,
First of all, thanks for all the replies to my query..
@Vishnu: you showed a hidden secret of "Regular expressions".
@Maen: you cracked the hidden secret of "Regular expressions" in your own way.
@Brendan: your code is simple to understand for freshers like me....!!
@Rob: you are right..... But I know the experts will understand the problem for a solution right away!!.
Well done Guys.....
Venky.
‎2011 Dec 22 3:35 AM
> @Rob: you are right..... But I know the experts will understand the problem for a solution right away!!.
Nope - you can't understand if there are unanswered questions. I speak from experience.
Rob
‎2011 Dec 22 8:28 AM
Well...
> More importantly; what is a repeated pattern? For example, in the string '2222', there are three repetitions of '2', two of '22' and one of '222' including overlaps. So do overlapping strings count or not?
A repeated pattern is a repeated pattern. I mean, we can always find more questions to ask: does it contain only numbers? Only characters? A mix of them? Do we have to look for mirrored patterns as well? etc....
The thing is that the pattern given as the solution here covers all cases (from 1 to n in length, for any type of string, with overlapping or not), hence the beauty of the solution!...
> This should be done before coding starts - remember?
.... or we could also just read a bit between the lines...
> you can't understand if there are unanswered questions. I speak from experience.
Yes we can... the proof: there is no need to know the answer to your question about the length, for example, since the proposed solution will work for any searched pattern length ;)... same for your question about overlaps...
... And I speak from experience too
cheers,
m.
‎2011 Dec 21 12:12 PM
Hi
Dude, just use ALL OCCURRENCES and you will find the solution:
FIND ALL OCCURRENCES OF '345' in lv_string MATCH COUNT var.
Cheers
NZAB
‎2011 Dec 21 12:14 PM
Hello NZAB,
Can you please go through the problem properly....?
Thanks.
Venky.
‎2011 Dec 21 12:15 PM
He doesn't know if it is 345 or Vishnu or Maen or Keshav, that can repeat in the string.
He just wants to know if a pattern (which can be anything) is repeating in a string and, if so, which one(s). Not an easy problem.
Keshav, he doesn't know if it is "now" that can repeat in his string. It can be "now" or "then" or "here" or "there", can be anything
‎2011 Dec 21 8:44 PM
DATA: patt      TYPE string,
      text      TYPE string,
      lv_length TYPE i,
      lv_key    TYPE i,
      lv_times  TYPE i,
      lv_match  TYPE i.
text = `345723345982343452343345`.
lv_length = STRLEN( text ).
DO lv_length TIMES.
  lv_key = sy-index - 1.           "offset of the candidate pattern
  lv_times = lv_length - lv_key.   "longest pattern that still fits at this offset
  DO lv_times TIMES.
    patt = text+lv_key(sy-index).  "inside this loop sy-index is the pattern length
    FIND ALL OCCURRENCES OF patt IN text MATCH COUNT lv_match.
    IF lv_match > 1.
      WRITE: / patt, lv_match.
    ENDIF.
  ENDDO.
ENDDO.
You could store each PATT in a table once it has been checked and do a READ on that table so as not to check it again.
Guess I should have refreshed before posting.
Edited by: Brendan Reid on Dec 21, 2011 9:46 PM
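The suggestion to remember each pattern that has already been checked could look roughly like the sketch below. It is only an illustration: the type ty_seen, the table lt_seen and all lv_ names are made up, and the minimum length of 3 is an assumption borrowed from the wish expressed elsewhere in this thread. Each candidate substring is looked up in a hashed table first, so FIND only runs once per distinct pattern.

```abap
TYPES: BEGIN OF ty_seen,
         value TYPE string,
         count TYPE i,
       END OF ty_seen.

DATA: lt_seen   TYPE HASHED TABLE OF ty_seen WITH UNIQUE KEY value,
      ls_seen   TYPE ty_seen,
      lv_text   TYPE string VALUE `345723345982343452343345`,
      lv_patt   TYPE string,
      lv_length TYPE i,
      lv_offset TYPE i,
      lv_rest   TYPE i,
      lv_len    TYPE i,
      lv_min    TYPE i VALUE 3.   "assumed minimum pattern length

lv_length = STRLEN( lv_text ).

DO lv_length TIMES.
  lv_offset = sy-index - 1.
  lv_rest   = lv_length - lv_offset.   "characters left from this offset
  lv_len    = lv_min.
  WHILE lv_len <= lv_rest.
    lv_patt = lv_text+lv_offset(lv_len).
*   Only run FIND for patterns that have not been counted yet.
    READ TABLE lt_seen INTO ls_seen WITH TABLE KEY value = lv_patt.
    IF sy-subrc <> 0.
      ls_seen-value = lv_patt.
      FIND ALL OCCURRENCES OF lv_patt IN lv_text MATCH COUNT ls_seen-count.
      INSERT ls_seen INTO TABLE lt_seen.
    ENDIF.
    IF ls_seen-count < 2.
      EXIT.   "a longer pattern at this offset cannot repeat either
    ENDIF.
    lv_len = lv_len + 1.
  ENDWHILE.
ENDDO.

LOOP AT lt_seen INTO ls_seen WHERE count > 1.
  WRITE: / ls_seen-value, ls_seen-count.
ENDLOOP.
```

The table also keeps patterns that turned out not to repeat, which costs memory but saves repeated FIND calls. For very long strings you might prefer to store only the repeated ones.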
‎2011 Dec 21 10:58 PM
I think the first thing that should be done is to question and get clarification on the specs.
How long must the pattern be; one character, three, something else?
More importantly; what is a repeated pattern? For example, in the string '2222', there are three repetitions of '2', two of '22' and one of '222' including overlaps. So do overlapping strings count or not?
This should be done before coding starts - remember?
Rob
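The overlap question can be checked with a small experiment. The sketch below is illustrative only and all variable names are made up. It counts '22' in '2222' twice: once with FIND ALL OCCURRENCES, which as far as I know continues searching after the end of each hit and so does not report overlapping matches (please verify this on your own release), and once by testing every start offset, which does include overlaps.

```abap
DATA: lv_string  TYPE string VALUE `2222`,
      lv_patt    TYPE string VALUE `22`,
      lv_len     TYPE i,
      lv_plen    TYPE i,
      lv_last    TYPE i,
      lv_off     TYPE i,
      lv_find    TYPE i,
      lv_overlap TYPE i.

lv_len  = STRLEN( lv_string ).
lv_plen = STRLEN( lv_patt ).

* Count the way the FIND statement sees it.
FIND ALL OCCURRENCES OF lv_patt IN lv_string MATCH COUNT lv_find.

* Count every start offset at which the pattern matches, overlaps included.
lv_last = lv_len - lv_plen.
lv_off  = 0.
WHILE lv_off <= lv_last.
  IF lv_string+lv_off(lv_plen) = lv_patt.
    lv_overlap = lv_overlap + 1.
  ENDIF.
  lv_off = lv_off + 1.
ENDWHILE.

WRITE: / 'FIND count:', lv_find, 'overlapping count:', lv_overlap.
```

If the two numbers differ for your input, the answer to "do overlapping strings count?" changes the result list, so it is worth settling before picking one of the solutions in this thread.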