2012 Feb 24 9:40 AM
Dear all REGEX Gurus.
We have a comma separated string like this:
132, 143, "222, 144, abc", 227, 888, "222#55#ab"
As you might have guessed, this comes from Excel when we save a sheet as a CSV file and one of the columns has the value 222,144,abc. Excel itself puts a " at the beginning and end of such a value.
Now we want to split it based on ',' like this:
132
143
222,144,abc
227
888
222#55#ab
So our idea is to use a REGEX to find anything that begins with ", ends with ", and has a comma (,) in between. We would then replace that comma with a special character.
The system seems to perform a greedy search if I search like this: ",".
I tried this:
"([^"]+)" -> this works in that it gives me all values within quotes, but I want only those that contain a comma.
So in the example above, I do not want 222#55#ab to be returned.
I tried various combinations but could not get it to work.
Can someone please advise how to achieve this?
Thanks in advance.
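For readers following along, here is a minimal sketch of the approach described above: mask the commas inside quoted values, split at the remaining commas, then restore the masked ones. It assumes a release where FIND ... REGEX is available (7.0 or later), balanced quotes with no doubled quotes inside a value, and that the placeholder character | never occurs in the data. The report name and all lc_/lv_/lt_ names are made up for the example.

```abap
REPORT z_csv_placeholder.

" Placeholder for commas inside quotes (assumed absent from the data)
CONSTANTS lc_mask TYPE c LENGTH 1 VALUE '|'.

DATA: lv_line    TYPE string,
      lt_matches TYPE match_result_tab,
      ls_match   TYPE match_result,
      lt_fields  TYPE TABLE OF string,
      lv_field   TYPE string.

lv_line = '132, 143, "222, 144, abc", 227, 888, "222#55#ab"'.

" Sample only: this also deletes blanks inside real values
CONDENSE lv_line NO-GAPS.

" Every quoted value; [^"]* cannot run past the closing quote
FIND ALL OCCURRENCES OF REGEX '"[^"]*"' IN lv_line RESULTS lt_matches.

LOOP AT lt_matches INTO ls_match.
  " Mask commas in this quoted section only. One character replaces
  " one character, so the offsets of the later matches stay valid.
  REPLACE ALL OCCURRENCES OF ',' IN SECTION OFFSET ls_match-offset
          LENGTH ls_match-length OF lv_line WITH lc_mask.
ENDLOOP.

" Quotes are no longer needed once the inner commas are masked
REPLACE ALL OCCURRENCES OF '"' IN lv_line WITH ``.
SPLIT lv_line AT ',' INTO TABLE lt_fields.

LOOP AT lt_fields INTO lv_field.
  " Put the original commas back before output
  REPLACE ALL OCCURRENCES OF lc_mask IN lv_field WITH ','.
  WRITE / lv_field.
ENDLOOP.
```

The regex here only locates the quoted values; whether a value contains a comma does not need to be tested, because the masking step simply does nothing for values such as "222#55#ab".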
2012 Feb 24 10:05 AM
My first idea:
SPLIT line AT ',' INTO TABLE xyz.
LOOP AT xyz.
" When an entry has a " at the first position, concatenate it with the following entries until you find one with a " at the end.
ENDLOOP.
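The idea above could be sketched roughly as follows. Instead of testing the first and last character, this version counts the quotes collected so far: an odd count means a quoted value is still open. The report name and variable names are invented for the example, and it assumes quotes are only used to wrap whole values.

```abap
REPORT z_csv_split_rejoin.

DATA: lv_line   TYPE string,
      lt_parts  TYPE TABLE OF string,
      lt_fields TYPE TABLE OF string,
      lv_part   TYPE string,
      lv_field  TYPE string,
      lv_quotes TYPE i,
      lv_open   TYPE c LENGTH 1.

lv_line = '132, 143, "222, 144, abc", 227, 888, "222#55#ab"'.

SPLIT lv_line AT ',' INTO TABLE lt_parts.

LOOP AT lt_parts INTO lv_part.
  CONDENSE lv_part.
  IF lv_open = 'X'.
    " Still inside a quoted value: glue the piece back on with its comma
    CONCATENATE lv_field ',' lv_part INTO lv_field.
  ELSE.
    lv_field = lv_part.
  ENDIF.

  " An odd number of quotes means the closing quote is still missing
  FIND ALL OCCURRENCES OF '"' IN lv_field MATCH COUNT lv_quotes.
  IF lv_quotes MOD 2 = 1.
    lv_open = 'X'.
  ELSE.
    lv_open = space.
    REPLACE ALL OCCURRENCES OF '"' IN lv_field WITH ``.
    APPEND lv_field TO lt_fields.
  ENDIF.
ENDLOOP.

" A quote that was never closed: keep what was collected, quote included
IF lv_open = 'X'.
  APPEND lv_field TO lt_fields.
ENDIF.

LOOP AT lt_fields INTO lv_field.
  WRITE / lv_field.
ENDLOOP.
```

Because each piece is condensed before it is glued back, the quoted value comes out as 222,144,abc without the blanks of the sample line.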
2012 Feb 24 10:19 AM
Yes, we already tried that.
The problem is that if there is an opening " but no closing ", the search becomes useless.
Moreover, it slows down performance, as our data can have many rows.
Hence I thought a REGEX would be a faster and simpler way.
2012 Feb 24 10:42 AM
Hi,
If the part between " and " always contains at least one comma, then try a regex with +.
I have not tried it myself, so just give it a try.
Please check the use of + (page 4) in [this link|http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/03a52be5-0901-0010-9da4-e9d5f8c5ce1c?QuickLink=index&overridelayout=true].
2012 Feb 24 11:48 AM
Hi,
I am on release 4.6 and do not have regex available to try it for you.
I just wrote some sample code; use it if it is useful.
DATA: lv_string(100) TYPE c,
      lv_inside      TYPE i.
TYPES: BEGIN OF ty,
         field(100) TYPE c,
       END OF ty.
DATA: it     TYPE TABLE OF ty,
      it_out TYPE TABLE OF ty.
FIELD-SYMBOLS: <fs>  TYPE ty,
               <out> TYPE ty.

lv_string = '132, 143, "222, 144, abc", 227, 888, "222#55#ab"'.
CONDENSE lv_string NO-GAPS.
" Splitting at the quote gives alternating chunks:
" odd rows lie outside quotes, even rows inside
SPLIT lv_string AT '"' INTO TABLE it.

LOOP AT it ASSIGNING <fs>.
  lv_inside = sy-tabix MOD 2.
  IF lv_inside = 0.
    " Quoted chunk: keep it whole, commas included
    WRITE: / <fs>-field.
  ELSE.
    " Unquoted chunk: split at the commas and skip the empty leftovers
    SPLIT <fs>-field AT ',' INTO TABLE it_out.
    LOOP AT it_out ASSIGNING <out> WHERE NOT field IS INITIAL.
      WRITE: / <out>-field.
    ENDLOOP.
  ENDIF.
ENDLOOP.
2012 Mar 01 1:37 AM
Thanks Keshav and everyone else.
But I was really looking for a REGEX to achieve the same task, as I feel a good REGEX will be much faster than all this logic.
I have also written my own logic to work around this problem for now, but I have used REGEX before to find file names and extensions, and I find it very powerful.
Any advice on what regex could make this work for me? Thanks again in advance.
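One possible direction, offered as a sketch rather than a tested solution: let a single pattern pick out the fields directly, trying a quoted value first and falling back to a run of non-comma characters. It assumes FIND ... REGEX is available (7.0 or later), balanced quotes, no doubled quotes inside a value, and no empty fields, which this pattern would silently skip. The report name and variable names are invented for the example.

```abap
REPORT z_csv_regex_tokens.

DATA: lv_line    TYPE string,
      lt_matches TYPE match_result_tab,
      ls_match   TYPE match_result,
      lv_field   TYPE string.

lv_line = '132, 143, "222, 144, abc", 227, 888, "222#55#ab"'.

" First alternative: optional blanks, then a complete quoted value.
" Second alternative: any run of characters up to the next comma.
FIND ALL OCCURRENCES OF REGEX '\s*"[^"]*"|[^,]+'
     IN lv_line RESULTS lt_matches.

LOOP AT lt_matches INTO ls_match.
  " Cut the matched text out of the line by offset and length
  lv_field = lv_line+ls_match-offset(ls_match-length).
  " Remove the wrapping quotes and the surrounding blanks
  REPLACE ALL OCCURRENCES OF '"' IN lv_field WITH ``.
  CONDENSE lv_field.
  WRITE / lv_field.
ENDLOOP.
```

With the sample line, the quoted value keeps its inner blanks (222, 144, abc), since only the leading and trailing blanks of each field are removed.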