I am trying to generate random numbers in a range of 1 – 25 for 7 columns and the last column needs to have a range of 1 – 10
This is how I’m generating my random numbers.
SELECT ABS(CHECKSUM(NEWID()))%25 + 1 AS Number1
,ABS(CHECKSUM(NEWID()))%25 + 1 AS Number2
,ABS(CHECKSUM(NEWID()))%25 + 1 AS Number3
,ABS(CHECKSUM(NEWID()))%25 + 1 AS Number4
,ABS(CHECKSUM(NEWID()))%25 + 1 AS Number5
,ABS(CHECKSUM(NEWID()))%25 + 1 AS Number6
,ABS(CHECKSUM(NEWID()))%25 + 1 AS Number7
,ABS(CHECKSUM(NEWID()))%10 + 1 AS Number8 FROM GENERATE_SERIES(1, 1000000);
I was thinking to put the generation in a cursor so that after it generates 1 number set it will go through checks like
eg.
are any of the numbers repeating?
are any number sets the same?
and if it passes the check it will insert it into the table
also column 8 can have the same number that appears in columns 1 – 7 but columns 1 – 7 can’t have the same number in that same number set.
The only issue I am having is how to incorporate the checks into the generation of the number sets, I assumed a cursor will be the best as it will run through the cursor each time for each number set that is generated and thus when it has generated all possible number sets it will just end.
10
As I understand your question, you need to randomly generate distinct Number1
.. Number7
values from the range 1..25, while the Number8
selection is independent.
You can do this by using a CROSS APPLY
to:
- Generate the values 1..25.
- Associate random values to each.
- Use the
ROW_NUMBER()
window function to define the permutated order. - Use conditional aggregation to select the final values in your result.
Something like:
SELECT
S.Value AS Id,
-- Number1..Number7 have mutually exclusive values
MAX(CASE WHEN N.RowNum = 1 THEN N.Number END) AS Number1,
MAX(CASE WHEN N.RowNum = 2 THEN N.Number END) AS Number2,
MAX(CASE WHEN N.RowNum = 3 THEN N.Number END) AS Number3,
MAX(CASE WHEN N.RowNum = 4 THEN N.Number END) AS Number4,
MAX(CASE WHEN N.RowNum = 5 THEN N.Number END) AS Number5,
MAX(CASE WHEN N.RowNum = 6 THEN N.Number END) AS Number6,
MAX(CASE WHEN N.RowNum = 7 THEN N.Number END) AS Number7,
-- Number8 is independent
ABS(CHECKSUM(NEWID()) % 10) + 1 AS Number8
FROM GENERATE_SERIES(1, 100) S
CROSS APPLY (
-- Use a randomizer to effectively permutate the numbers 1..25
SELECT
S2.value AS Number,
ROW_NUMBER() OVER(ORDER BY R.Randomizer) AS RowNum
FROM GENERATE_SERIES(1, 25) S2
CROSS APPLY(
--SELECT CHECKSUM(NEWID(), S.value, S2.value) AS Randomizer
SELECT HASHBYTES('MD5', CONCAT(NEWID(), S.value, S2.value)) AS Randomizer
) R
) N
GROUP BY S.Value
ORDER BY S.Value
I use the HASHBYTES()
function above to generate a randomized based on NEWID()
. The S.value
and S2.value
values have been included to avoid duplicate hash values for cases where SQL server sometimes applies the same NEWID()
value multiple times in a query. (I have yet to find a clear definition of when this does or does not occur.)
I am not sure how much computational overhead HASHBYTES()
adds, but the CHECKSUM()
might also be a less-costly option at the cost of possible “less random” results. Although we really don’t care about cryptographically secure randomness here, the 'MD5'
hash algorithm selector could also be changed if needed.
Sample results (first 10 rows):
Id | Number1 | Number2 | Number3 | Number4 | Number5 | Number6 | Number7 | Number8 |
---|---|---|---|---|---|---|---|---|
1 | 1 | 24 | 15 | 6 | 3 | 19 | 23 | 5 |
2 | 13 | 10 | 15 | 11 | 22 | 21 | 9 | 1 |
3 | 22 | 4 | 13 | 15 | 23 | 20 | 9 | 1 |
4 | 5 | 14 | 23 | 25 | 13 | 10 | 12 | 10 |
5 | 6 | 5 | 1 | 9 | 17 | 14 | 7 | 6 |
6 | 5 | 25 | 10 | 3 | 13 | 12 | 7 | 5 |
7 | 2 | 25 | 19 | 20 | 1 | 14 | 8 | 9 |
8 | 1 | 9 | 13 | 22 | 24 | 17 | 7 | 10 |
9 | 1 | 18 | 23 | 19 | 10 | 15 | 7 | 2 |
10 | 2 | 9 | 12 | 1 | 5 | 14 | 19 | 7 |
Note that the Number8
column in rows 4, 5 and 6 duplicates an earlier value in the same row, but the values in columns Number1
through Number7
are distinct within each row.
See this db<>fiddle for a demo.
This will get you 7 distinct numbers between 1 and 25, and an 8th number between 1 and 10 not equal to the first 7 numbers.
SELECT TOP(100) x3.*, N8 = (
SELECT TOP(1) r.N
FROM core.rangeX(1,10,1) AS r
WHERE r.N NOT IN (N1,N2,N3,N4,N5,N6,N7)
ORDER BY NEWID()
)
FROM
(
SELECT
N1 = MAX(IIF(x2.RN=1,x2.N,NULL)),
N2 = MAX(IIF(x2.RN=2,x2.N,NULL)),
N3 = MAX(IIF(x2.RN=3,x2.N,NULL)),
N4 = MAX(IIF(x2.RN=4,x2.N,NULL)),
N5 = MAX(IIF(x2.RN=5,x2.N,NULL)),
N6 = MAX(IIF(x2.RN=6,x2.N,NULL)),
N7 = MAX(IIF(x2.RN=7,x2.N,NULL))
FROM
(
SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), x1.N
FROM
(
SELECT TOP(7) r.N
FROM core.rangeX(1,25,1) AS r
ORDER BY NEWID()
) AS x1
) AS x2
) x3;
This returns:
You can use a loop run this for each required row. Here’s the final solution:
DECLARE @t TABLE (N1 TINYINT,N2 TINYINT,N3 TINYINT,N4 TINYINT,
N5 TINYINT,N6 TINYINT,N7 TINYINT,N8 TINYINT)
DECLARE @i INT = 1;
WHILE @i <= 5
BEGIN
INSERT @t
SELECT TOP(100) x3.*, N8 = (
SELECT TOP(1) r.N
FROM core.rangeX(1,10,1) AS r
WHERE r.N NOT IN (N1,N2,N3,N4,N5,N6,N7)
ORDER BY NEWID()
)
FROM
(
SELECT
N1 = MAX(IIF(x2.RN=1,x2.N,NULL)),
N2 = MAX(IIF(x2.RN=2,x2.N,NULL)),
N3 = MAX(IIF(x2.RN=3,x2.N,NULL)),
N4 = MAX(IIF(x2.RN=4,x2.N,NULL)),
N5 = MAX(IIF(x2.RN=5,x2.N,NULL)),
N6 = MAX(IIF(x2.RN=6,x2.N,NULL)),
N7 = MAX(IIF(x2.RN=7,x2.N,NULL))
FROM
(
SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), x1.N
FROM
(
SELECT TOP(7) r.N
FROM core.rangeX(1,25,1) AS r
ORDER BY NEWID()
) AS x1
) AS x2
) x3;
SET @i+=1;
END
SELECT * FROM @t;
Returns:
NOTE: I am on a 2017 and don’t have GENERATE_SERIES so I used my own function called RangeX (below).
CREATE FUNCTION core.rangeX
(
@Low BIGINT,
@High BIGINT,
@Gap BIGINT
)
/****************************************************************************************
[Purpose]:
Creates a lazy, in-memory, forward-ordered sequence of up to 1,073,741,824 integers
beginning with @Low and ending with @High (inclusive). RangeX is a pure, 100% set-based
alternative to solving SQL problems using iterative methods such as loops, cursors and
recursive CTEs. RangeX is based on Itzik Ben-Gan's getnums function for producing a
sequence of integers and uses logic from Jeff Moden's fnTally function which includes a
parameter for determining if the "row-number" (RN) should begin with 0 or 1.
I used the name "RangeX" because it functions and performs almost identically to
the Range function built into Python and Clojure. RANGE is a reserved SQL keyword so I
went with "RangeX". Functions/Algorithms developed using RangeX can be easilty ported
over to Python, Clojure or any other programming language that leverages a lazy sequence.
The two major differences between RangeX and the Python/Clojure versions are:
1. RangeX is *Inclusive* where the other two are *Exclusive". range(0,3) in Python and
Clojure return [0 1 2], core.RangeX(0,3) returns [0 1 2 3].
2. RangeX has a fourth Parameter (@Row1) to determine if RN should begin with 0 or 1.
[Author]:
Alan Burstein
[Compatibility]:
SQL Server 2008+
[Syntax]:
SELECT r.RN, r.OP, r.DM, r.N, r.NOP
FROM core.RangeX(@Low,@High,@Gap) AS r;
[Parameters]:
@Low = BIGINT; The lowest value for N.
@High = BIGINT; The highest value for N.
@Gap = BIGINT; The distance between each N value.
-----------------------------------------------------------------------------------------
[Revision History]:
Rev 00.0 - 20140518 - Initial Development - AJB
Rev 06.0 - 20240625 - Final redesign - AJB
*****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
WITH T(N) AS ( -- 90*90*90=729000, 729000*729000 = 531,441,000,000 Rows Total
SELECT 1 FROM (VALUES
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0)) T(N)), -- 90 values
L(N) AS (SELECT 1 FROM T CROSS JOIN T AS T2 CROSS JOIN T AS T3),
iTally AS (SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM L CROSS JOIN L AS L2)
SELECT TOP((@high-@low+1)/@gap+SIGN((@high-@low+1)%@gap))
RN = i.RN,
OP = o.OP,
DM = (i.RN-o.OP)/2,
N = @low+(i.RN-1)*@gap,
NOP = @low+(o.OP-1)*@gap
FROM iTally AS i
CROSS APPLY (VALUES((@high-@low+1)/@gap+SIGN((@high-@low+1)%@gap))) AS r(T)
CROSS APPLY (VALUES(r.T-i.RN+1)) AS o(OP)
ORDER BY RN;
GO