cancel
Showing results for 
Search instead for 
Did you mean: 

using rand over a subset of rows

Former Member
0 Kudos
2,086

I'm attempting to get a random, based on a range that spans 1 to the maximum number of rows that for a subset. I run the query in Oracle successfully and get a different randomly selected number each time and only a single number, which is what I am expecting,

but when I run the same query, albeit the random function is different, I either observe no result, a single row or two rows, for example:


$ dbisql -c "uid=sadb;pwd=sadb;eng=ftnode_sa;links=tcpip(host=10.11.12.17;port=49153);LOG=conn.log;dbn=ftnode_sa" "select c_id from (
 select c_id,row_number() over (order by c_d_id) as rn, count(*) over() max_rn
 from customer where c_d_id=5
) t
where rn = (select floor(rand()*(max_rn))+1)"
c_id 
----
1308

(1 rows)

$ dbisql -c "uid=sadb;pwd=sadb;eng=ftnode_sa;links=tcpip(host=10.11.12.17;port=49153);LOG=conn.log;dbn=ftnode_sa" "select c_id from (
 select c_id,row_number() over (order by c_d_id) as rn, count(*) over() max_rn
 from customer where c_d_id=5
) t
where rn = (select floor(rand()*(max_rn))+1)"
c_id 
----

(0 rows)

$ dbisql -c "uid=sadb;pwd=sadb;eng=ftnode_sa;links=tcpip(host=10.11.12.17;port=49153);LOG=conn.log;dbn=ftnode_sa" "select c_id from (
 select c_id,row_number() over (order by c_d_id) as rn, count(*) over() max_rn
 from customer where c_d_id=5
) t
where rn = (select floor(rand()*(max_rn))+1)"
c_id 
----
1854 
2600 

(2 rows)

and this is the Oracle behaviour:


SQL> select c_id
from (
  select c_id, row_number() over (order by c_d_id) as rn, count(*) over() max_rn
  from customer
  where c_d_id=:d_id
)
where
rn = (select floor(dbms_random.value(1,max_rn+1)) from dual)  2    3    4    5    6    7    8  ;
C_ID


2938

SQL> select c_id from ( select c_id, row_number() over (order by c_d_id) as rn, count(*) over() max_rn from customer where c_d_id=:d_id ) where rn = (select floor(dbms_random.value(1,max_rn+1)) from dual) 2 3 4 5 6 7 8
9 ; C_ID


2204 SQL> select c_id from ( select c_id, row_number() over (order by c_d_id) as rn, count(*) over() max_rn from customer where c_d_id=:d_id ) where rn = (select floor(dbms_random.value(1,max_rn+1)) from dual) 2 3 4 5 6 7 8
9 ; C_ID


2265

Can somebody help with formulating a SQL statement that would behave as how the existing SQL statement does in Oracle?

Former Member
0 Kudos

Just thought I'd post a quick update. The query in my first post does not work in PostgreSQL either, works fine in Oracle however.

The following query works in PostgreSQL and Oracle however:


ft_node=# with rand as (select random() as r)
ft_node-# select c_id 
ft_node-# from    ( 
ft_node(#          select c_id, row_number() over (order by c_d_id) as rn 
ft_node(#               ,  count() over() max_rn 
ft_node(#          from customer where c_d_id=5 
ft_node(#         ) t 
ft_node-# where rn = (select floor(r(max_rn))+1 from rand); 
 c_id 


3016 (1 row) albeit with different function calls. The difference being the inclusion of a WITH subquery. The same method however still does not work in SQL Anywhere.


with rand as (select rand() as r) 
select c_id 
from    ( 
         select c_id, row_number() over (order by c_d_id) as rn 
              ,  count() ovaer() max_rn 
         from customer where c_d_id=5 
        ) t 
where rn = (select floor(r(max_rn))+1 from rand); 

Accepted Solutions (1)

Accepted Solutions (1)

VolkerBarth
Contributor

I'm not sure whether RAND() is reliable in subqueries or CTEs, cf. that unanswered FAQ...

Answers (1)

Answers (1)

Breck_Carter
Participant
0 Kudos

Please tell us, in English, what you are expecting this query to return.

Also, please show us the c_id and c_d_id values in the customer table.

As far as I can tell, the query correctly returns a varying number of rows depending on the values in the customer table

Note that the word "correctly" implies "as the code is written", which is not necessarily the same as "what you want" :)...

SELECT rand ( 1 );

CREATE TABLE customer ( 
   c_id   INTEGER,
   c_d_id INTEGER );

INSERT customer VALUES ( 1, 3 );
INSERT customer VALUES ( 2, 3 );
INSERT customer VALUES ( 3, 3 );
INSERT customer VALUES ( 4, 5 );
INSERT customer VALUES ( 5, 5 );
INSERT customer VALUES ( 6, 3 );
COMMIT;

This subquery returns a fixed result set consisting of 2 rows, and max_rn is always 2...

select customer.c_id,
                row_number() over ( order by customer.c_d_id ) AS rn, 
                count(*) over()                                AS max_rn
           from customer 
           where customer.c_d_id = 5

c_id,rn,max_rn
4,1,2
5,2,2

This subquery returns a single row, either 1 or 2...

select floor ( rand() * 2 ) + 1

floor(rand()*2)+1
1.0

floor(rand()*2)+1
2.0

The main query returns 2, 1 or zero rows depending on whether the rn values (1,2) match the random subquery values (1,1), (1,2), (2,1) or (2,2).

select t.c_id, t.rn, t.max_rn 
  from ( select customer.c_id,
                row_number() over ( order by customer.c_d_id ) AS rn, 
                count(*) over()                                AS max_rn
           from customer 
           where customer.c_d_id = 5
       ) AS t
where t.rn = ( select floor ( rand() * t.max_rn ) + 1 );

c_id
4
5

c_id
5

c_id

Perhaps the other databases return different results because the customer data is different.

Former Member
0 Kudos

Each client id is unqiue, for example:


select c_id,c_d_id,c_w_id,c_first from customer where c_d_id=2
c_id,c_d_id,c_w_id,c_first
1301,2,1,'Parkway Company'
1306,2,1,'Beutelschies & C'
1339,2,1,'Sigma Corp Of Am'
1356,2,1,'Family Life Radi'
1364,2,1,'Russell G Smith '
1366,2,1,'Thornton Ferrara'
1374,3,1,'Kinkos Copies'
1391,3,1,'Natl Cncl Stat B'
1409,3,1,'Calvalier Intrnt'

I understand the problem. The rand() is volatile, and is executed for each row. So if the subselect (from my first post) has 100 rows, you get 100 random values. So sometimes one row matches, sometime none, sometime more than one. So the the query in my 2nd post, which uses a subquery WITH, should only ever return a single row, which it does for Oracle and PostgreSQL.


with rand as (select rand() as r)
select c_id
from    (
 select c_id, c_d_id,row_number() over (order by c_d_id) as rn,count() over() max_rn
 from cust_test where c_d_id=3
) t
where rn = (select floor(r(max_rn))+1 from rand)

Breck_Carter
Participant
0 Kudos

> the query in my 2nd post...

...is syntactically incorrect, even after "ovaer()" is fixed to "over()":

with rand as (select rand() as r) 
select c_id 
from    ( 
         select c_id, row_number() over (order by c_d_id) as rn 
              ,  count() ovaer() max_rn 
         from customer where c_d_id=5 
        ) t 
where rn = (select floor(r(max_rn))+1 from rand);

Syntax error near '(' on line 5
SQLCODE=-131, ODBC 3 State="42000"
Line 1, column 1

with rand as (select rand() as r) 
select c_id 
from    ( 
         select c_id, row_number() over (order by c_d_id) as rn 
              ,  count() over() max_rn 
         from customer where c_d_id=5 
        ) t 
where rn = (select floor(r(max_rn))+1 from rand); 

Procedure 'r' not found
SQLCODE=-265, ODBC 3 State="42S02"
Line 1, column 1
Breck_Carter
Participant
0 Kudos

> floor(r(max_rn))+1

What is the intent of that expression?

Do you have a "CREATE FUNCTION r" in Oracle?

with rand as (select rand() as r)
select c_id
from    (
 select c_id, c_d_id,row_number() over (order by c_d_id) as rn,count() over() max_rn
 from cust_test where c_d_id=3
) t
where rn = (select floor(r(max_rn))+1 from rand)

Procedure 'r' not found
SQLCODE=-265, ODBC 3 State="42S02"
Line 12, column 1
(Continuing after error)