on 2014 Mar 21 10:47 AM
I have one column that have the following values.
a,b,c,d,e,f,g,h,i,j 1,2,3,4,5,6,7,8,9,10 r4,32,13,49,k1,67,10,56,12,11
I need to return 7th element for those rows
the result should be:
1st row: g 2nd row: 7 3rd row: 10 ...
but I found it hard to do in sybase IQ/Sybase ASE. I don't think sybase supports regular expression. What is the best way to handle the situation?
thanks
Request clarification before answering.
You could also use OPENSTRING to advantage here. In my testing with ~10K rows, it executed in 0.145s compared to 3.313s for the regexp_substr approach.
drop table if exists T_S; create table T_S(x int, str long varchar); insert into T_S(x,str) select row_num / 10 x, list(row_num) as str from sa_rowgenerator(0,100000) group by x order by 1; commit;
Then to use the OPENSTRING:
select T_S.*, c7 from T_S cross apply openstring( value T_S.str ) with ( c1 varchar(255) , c2 varchar(255) , c3 varchar(255) , c4 varchar(255) , c5 varchar(255) , c6 varchar(255) , c7 varchar(255) ) D order by x
The above formulation relies on us getting exactly the 7th column (known at query time) and if there are additional columns beyond 7 they are just ignored. If you don't know ahead of time how many columns there are or which column you want to retrieve, you could instruct OPENSTRING that rows are delimited by ',' (and that columns are delimited by some out of band character), then use ROWID(D) to select which element from the list you want.
I find that OPENSTRING generally performs better than sa_split_list but it is not a drop in replacement in all cases.
You could still wrap the above in a UDF if you like but you would lose some of the performance benefits -- even if inlined, it would be a subquery expression instead of a join. If you write a procedure carefully you could use it in the FROM clause with CROSS APPLY or OUTER APPLY and expect it to be inlined.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Try using sa_split_list(). E.g.
select row_value from sa_split_list( 'a,b,c,d,e,f,g,h,i,j' ) where line_num = 7
returns 'g'.
You can wrap this in a UDF like this:
create function nth_item_in_list( in list long varchar, in num int ) returns long varchar begin declare @selected long varchar; select row_value into @selected from sa_split_list( list ) where line_num = num; return @selected; end;
and then you can use:
select nth_item_in_list( mytablecol, 7 ) from mytable;
HTH
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
While I would prefer Mark's solution with sa_split_list, in case you want to use a regex, the following should do:
select val_list, regexp_substr(value_list, '(?<=([^,]*,){6})[^,]*') as seventh_element from (select list(row_num) as value_list from sa_rowgenerator(-10, 30, 4)) as test
The sa_rowgenerator will build a list of each 4th integer from -10 to 30, and the (rather funky) pattern will search for a string that is preceded with a sixfold group of text delimited by a comma, where the pattern is limited by the next comma.
The result set is:
value_list, seventh_element
'-10,-6,-2,2,6,10,14,18,22,26,30','14'
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Thanks Volker for posting the regexp solution - I figured there was a way of doing it using regexp_substr!
I was curious to know how regexp_substr performed as compared to my sa_split_list solution and it would appear that using regexp_substr is a winner by a long shot.
I tested the performance of each using the following sequence on a recent Linux SQLA v16 production build:
create table foo( i int default autoincrement, s long varchar ); create function random_list( in num_el int ) returns long varchar not deterministic begin declare @list long varchar; select list( substr( 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', cast( rand() * 62 as int ), 1 ) ) into @list from sa_rowgenerator( 1, num_el, 1 ) ; return @list; end; insert into foo( s ) select random_list( 10 ) from sa_rowgenerator( 1, 10000, 1 ); commit; create function nth_item_in_list( in list long varchar, in num int ) returns long varchar begin declare @selected long varchar; select row_value into @selected from sa_split_list( list ) where line_num = num; return @selected; end; create function nth_item_in_list_re( in list long varchar, in num int ) returns long varchar begin declare @selected long varchar; set @selected = regexp_substr( list, '(?<=([^,]*,){' || (num-1) || '})[^,]*' ); return @selected; end; create function nth_item_in_list_re2( in list long varchar, in num int ) returns long varchar begin declare @selected long varchar; set @selected = regexp_substr( list, '(?<=([^,]*,){' || (num-1) || '})[^,]*' ); set @selected = '' || @selected || ''; -- stop inlining of this function return @selected; end; commit; begin declare @tlen int; declare @t1 timestamp; declare @t2 timestamp; set @t1 = now(*); select sum( length( -- pick one of the following: nth_item_in_list( foo.s, 7 ) -- regexp_substr( foo.s, '(?<=([^,]*,){6})[^,]*' ) -- nth_item_in_list_re( foo.s, 7 ) -- nth_item_in_list_re2( foo.s, 7 ) )) into @tlen from foo; set @t2 = now(*); select @t1, @t2, datediff( millisecond, @t1, @t2 ) as millisec, @tlen end;
Using regexp_substr() or nth_item_in_list_re() gave roughly the same performance - which is what I expected since the UDF will be inlined - and in my testing is 100 times faster!! Nice!
Using the nth_item_in_list_re2() function (which uses regexp_substr but will not be inlined) still performed roughly 3 times faster than the sa_split_list solution.
The difference in performance between the '_re' and '_re2' test runs gives an indication of the cost of calling a UDF 10000 times and the benefits of inlining whereever possible.
Thanks guys!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
52 | |
6 | |
5 | |
5 | |
5 | |
4 | |
3 | |
3 | |
3 | |
3 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.