SQL puzzler : randoms do not exist (Oracle)

Last week I've posted a first puzzler related to generating random numbers or rather selecting random rows in PostgreSQL, believing that it's first and maybe the last puzzler for a long time. Well, today I've found a similar feature in Oracle - so lets begin!

The task is quite similar to the PostgreSQL one - given a single-column table, containing a continuous sequence of integers between 1 and N (in our case N = 10). So we may create and fill the table like this, for example:

-- create the table
CREATE TABLE my_table (id INTEGER);

-- fill it with 10 rows
BEGIN
    FOR i IN 1..10 LOOP
        INSERT INTO my_table (id) VALUES (i);
    END LOOP;
END;

Consider the following - quite simple query:

SELECT (CASE WHEN (id = random_value) THEN 'EQUAL'
             ELSE 'NOT EQUAL' END) AS result FROM (

     -- inline view
     SELECT id, ROUND(DBMS_RANDOM.value(1,10)) AS random_value
     FROM my_table

) foo WHERE id = random_value;

Question: What result does this query produce? (when run on Oracle 10g)

I'll give you some possible choices again:

  1. Random number of rows, all with the string "EQUAL"..
  2. Random number of rows, all with the string "NOT EQUAL"..
  3. Random number of rows, but it's not possible to say what they contain.

As usual the answer is hidden - just click on the "Answer" and it will be shown.

Answer

As we all know, the correct answer is (c). The number of rows is random (random value with binomial distribution), but the problem is with the outer part of the query - particularly with evaluation of "id = random_value" comparison, which is performed within the CASE construct.

The way the query is written is a little bit misleading (it wouldn't be a puzzler) as the "common sense" says that the inline view returns only rows with "id = random_value", so the condition in CASE has to be evaluated as true.

Truth is Oracle (10g) does not work like that - random value generated within the inline view is not used when evaluating CASE condition, but the expression "ROUND(DBMS_RANDOM.value(1,10))" is evaluated again. Due to this a new random value will be returned - probably not equal to the previous one. Thus the CASE will randomly return "EQUAL" and "NOT EQUAL".

I did not have a chance to test this on Oracle 11g, but according to the information I've received this puzzler will be processed differently - according to the common sense. So the number of rows will be random, but the expression won't be evaluated again - all the rows will contain "EQUAL".

How to "fix" this on 10g? You may use "materialize" hint, so that the inline view will be "materialized" and won't be re-evaluated in the next step - this is virtually identical to the solution 11g does automatically:

WITH t AS (SELECT /*+ materialize */ my_table.*,
                  ROUND(DBMS_RANDOM.value(1, 10)) as random_value
           FROM my_table)
SELECT (CASE WHEN (id = random_value) THEN 'ROVNOST'
             ELSE 'NEROVNOST' END) AS vysledek FROM t
WHERE id = random_value

And now some fun with probabilities ...

Matematical fun

As alteady mentioned in the previous puzzler, the number of returned rows is govened by the binomial distribution, i.e. the probability that given table of n rows (10 in our case), k rows will be returned, is

Pn,k = n! / ((n-k)! * k!) * pk * (1-p)n-k

where p is the probability that a particular row will be returned independently from the others (in our case p = 10-1).

But what values will be in those returned rows? As already said, the values will be random - Oracle re-evaluates the expression and thus generates a new random values. Given a fixed number of returned rows k and the number of rows containing value "EQUAL" denote as l - logically the inequality k >= l holds. Probability that a given row contains "EQUAL" value is 1/n = 10-1 again, so let's use the name p.

Then the number of rows with "EQUAL" value is governed by the binomial distribution - just replace n and k, resp. k and l, i.e.

Pk,l = k! / ((k-l)! * l!) * pl * (1-p)k-l

So the probability that from a table of n rows exactly k rows will be returned, and that l of them will contain "EQUAL" value is a product of the two probabilities

Pn,k,l = Pn,k * Pk,l = n! / ((n-k)! * (k-l)! * l!) * pk+l * (1-p)n-l

This for example for n = 10 gives the following table (whereas 0.00% denotes very small although not zero probabilities):

k / l 0 1 2 3 4 5 6 7 8 9 10
0 34,87%
                   
1 34.87% 3.87%                  
2 15.69% 3.49% 0.19%                
3 4.18% 1.39% 0.15% 0.01%              
4 0.73% 0.33% 0.05% 0.00% 0.00%            
5 0.09% 0.05% 0.01% 0.00% 0.00% 0.00%          
6 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%        
7 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%      
8 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%    
9 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%  
10 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%

 

Comments

There are no comments for this article (or are awaiting acceptance).

New comment

All the comments have to be accepted, so there may be some delay between submitting and accepting (or rejecting) the comment. If you enter the e-mail address, you will be informed about acceptance or rejection.

Subject or body may not contain HTML tags - they will be automatically removed. Paragraphs may be separated using a newline (ENTER).

(optional)