I want to write a function that return random unique pair numbers every time call it from a range until resetting it.
Something like this:
function randomUniquePairs($ranges, $reset = false){
if ($reset === false){
// some code for reseting
}
/*
some code for creating random unique pair numbers
*/
return $result;
}
randomUniquePairs(range(1,10), range(1,20));
/*
this function returns for example:
array(2,9)
*/
randomUniquePairs(range(1,10), range(1,20));
/*
this function returns for example:
array(5,19)
*/
randomUniquePairs(range(1,10), range(1,20));
/*
this function returns for example:
array(5,9)
*/
//this function returns random unique pairs until we pass reset paramer true
I tried two approaches:
Approach 1
My first attempt was to make an array of all of possible pairs then select from them randomly, but it is very inefficient, because if ranges are so wide, it consumes a lot of memory.
the code:
class a {
private $asqar;
function __construct() ($range) {
// cycle for ranges
foreach ($range[0] as $v1) {
foreach ($range[1] as $v2) {
$asqar[] = array(
$v1,
$v2,
);
}
}
}
function randomUniquePairs($reset = false){
if ($reset === true){
$this->asgar = array();
}
$rndKey = array_rand($this->asgar);
$result = $this->asqar[$rndkey];
unset($this->asqar[$rndkey]);
return $result;
}
}
$c = new a(range(1,10), range(1,20));
$c->randomUniquePairs();
Approach 2
My second attempt was to write a function that produce a pair from these ranges, then store it in a variable, every time that function calls after producing a pair , it checks if this pair produced before call function recursively, it continues until it produce a unique pair.
Here’s the code:
class a{
private $__uniqueVariables = array();
public function randomUniquePairs($range, $reset = false) {
if ($reset === true){
$this->__uniqueVariables = array();
}
// cycle for each value
foreach ($range as $value) {
// shuffle
shuffle($value);
// selected id
$selectedId[] = $value[0];
}
// check for selected variable
if (in_array($rndUnqNum, $this->__uniqueVariables)) {
// return for try again
return $this->uniqueVariable($range);
}
// added to current unique variables
$this->__uniqueVariables[] = $rndUnqNum;
// return
return $rndUnqNum;
}
}
but this second approach has an issue that sometimes throw Fatal error: Maximum function nesting level of '100' reached
.
I’m looking for a better approach to my algorithm.
7
Something you need to determine is how ‘fair’ you want this to be and the performance as the unused space becomes exhausted.
In essence, you want a random box from a N dimensional array. Its not the box itself that is important, but the location of the box that is important.
Under the covers, a 10×20 array is often represented as a 1×200 array with some math behind it to access the right spot. If you were to access [5][13]
, you are actually accessing location [5*20 + 13]
. You can use the same approach to go from a number back to the position. Location 113 goes to the integer devision by 20 and remainder giving 5 r13
.
So now, you don’t need to actually store 200 pairs (though that isn’t a lot), you just need a bitfield of 200 bits long. Generate a random number within the proper range and mark it as used in the bitfield.
Now, the question of how do you handle it when you’ve got a collision? This goes to the various hash collision techniques used in hash tables. Some won’t work for this application, but its a good read nonetheless.
A simple approach would be once you have one collision, just start incrementing where you are looking at until you find one that wasn’t used. The increment could be 1, or any number that is relativity prime to the size of the space.
Ok, I’m at a place I can sit down and write something. Its perl. Shouldn’t be too far off of php and is rather straight forward.
#!/usr/bin/perl
my $x = shift @ARGV;
my $y = shift @ARGV;
my $f = shift @ARGV; # fill factor
my $t = $x * $y; # total space
my $v = '';
foreach (1 .. int($t * $f)) {
my $r = int(rand($t)); # random number from 0 .. $t-1
my $yp = int($r / $x); # y' (y prime)
my $xp = int($r % $x); # x' (x prime)
print "Trying $r: $xp $yp...n";
while(vec($vec, $r, 1)) {
$yp = int($r / $x);
$xp = int($r % $x);
print "tcollision at $r: $xp $ypn";
$r += 1;
$r %= $t; # scale $r to within $t
}
$yp = int($r / $x);
$xp = int($r % $x);
vec($vec, $r, 1) = 1; # set the $r th bit
print "tsettled at $r: $xp $ypn";
}
Ultimately, the ‘settled’ values are the ones that you want.
You read two numbers from the command line and assign them to x
and y
. The total search space is t
– this is how big all the possible numbers are. Additionally, read a fill factor as $f
, this should be a value less than 1 and is used to limit the list iteration. Setting a value greater than 1 will present an infinite loop.
I’m filling up the space to the specified value (the foreach (1 .. ($t * $f))
). No, there isn’t any error checking on $f
to make sure it is less than or equal to 1, but there should be.
So pick a random number from 0 to $t
– 1. The spot that this represents is $yp
and $xp
– y prime and x prime.
There is some perlish things here, vec works with a bit vector of arbitrary size. There are a number of ways of doing this in a given language, its just rather easy with perl. With Java, one could use a BitSet (this is big enough to hold maxint bits, which could represent a pair of 46340 numbers).
You then test the bit at the $r
th location to see if it has been used. If it has, increment $r
and roll it over if it becomes larger than the total space (so if you have a 10 and 20 (0 .. 199) and you hit 200, it becomes 0).
Once you find an unused bit, set it and output your values.
This is what the output looks like though:
Trying 96: 6 9...
settled at 96: 6 9
Trying 117: 7 11...
collision at 117: 7 11
settled at 118: 8 11
Trying 115: 5 11...
settled at 115: 5 11
Trying 153: 3 15...
settled at 153: 3 15
Trying 90: 0 9...
settled at 90: 0 9
Trying 140: 0 14...
collision at 140: 0 14
collision at 141: 1 14
collision at 142: 2 14
settled at 143: 3 14
Trying 73: 3 7...
settled at 73: 3 7
This is tested on a unix system thus:
% rndpair.pl 10 20 0.5 | grep settled | sort | uniq | wc -l
100
This shows that with 100 numbers there are 100 unique pairs printed out (just look at the ‘settled’ lines).
There’s a fair bit of debugging information in there (you don’t really need to keep resetting the value of $yp
and $xp
until the end.
And then there’s the question of how fast is it? This is to generate 50% of the available pairs for the available space. Realize that there is some time tied up in the sort and uniq (of possibly some not small bits of text):
% time ./rndpair.pl 500 500 0.5 | grep settled | sort| uniq | wc -l
125000
real 0m3.528s
user 0m3.901s
sys 0m0.045s
Lets kick it up a notch and remove the other applications from the chain.
% time ./rndpair.pl 5000 5000 0.5 > /dev/null
real 0m34.668s
user 0m34.482s
sys 0m0.180s
How much room did that 5k x 5k storage space take? 25,000,000 bits, or about 3 megabytes.
Note that the performance of this drops as the fill factor goes up. For a search space of 6, 80610 (a previous comment if I read it right) this runs quite quickly (note the increasing times as the fill factor goes up):
% time ./rndpair.pl 6 80610 0.5 | grep settled | wc -l
241830
real 0m0.827s
user 0m1.562s
sys 0m0.029s
% time ./rndpair.pl 6 80610 0.75 | grep settled | wc -l
362745
real 0m1.721s
user 0m3.151s
sys 0m0.043s
% time ./rndpair.pl 6 80610 0.9 | grep settled | wc -l
435294
real 0m3.993s
user 0m7.031s
sys 0m0.079s
1
The following is a re-write to use for instead of foreach.
<?php
function unique_arrays($first, $second) {
$arrays = array();
for ($i = $first[0]; $i <= $second[0]; $i++)
for ($j = $first[1]; $j <= $second[1]; $j++)
$arrays[] = array($i, $j);
return $arrays;
}
print_r(unique_arrays(array(1,6), array(3,8)));
?>
Here is an iterator-inspired class that should keep the memory down.
<?php
class XYStepper {
protected $first;
protected $second;
protected $current = null;
public function __construct($first, $second) {
$this->first = $first;
$this->second = $second;
}
public function next() {
if ($this->current == null)
$this->current = $this->first;
else {
if ($this->current[0] < $this->second[0]) {
if ($this->current[1] < $this->second[1])
$this->current[1]++;
else {
$this->current[0]++;
$this->current[1] = $this->first[1];
}
}
else {
if ($this->current[1] < $this->second[1])
$this->current[1]++;
else
return null;
}
}
return $this->current;
}
}
$xy = new XYStepper(array(1,6), array(3,8));
while (($next = $xy->next()) != null)
print_r($next);
?>
Adding the ability to step down, or have negative increments is left as an exercise to the reader.
2
This is based on @MichaelT comments and code. I started with this before he posted it but i have been peeking after he posted his code so it’s not entirely original work.
I reserve where the numbers in both ranges are the same and i add the offset so that it’s more aligned with the original spec.
<?php
class Gen
{
private $mask; // bitmask all combinations
private $firstOffset; // what first range starts from
private $secondOffset; // what second range starts from
private $firstWidth; // the width of first range
private $secondWidth; // the width of second range
private $permutations; // number of permutations
private $used = 0; // number of permutations used
public function __construct(array $range1, array $range2)
{
if( ! count($range1) || ! count($range2) )
throw new Exception("Both ranges must contain at least one element");
// I support both array(1, 2, 3, 4) and array(1, 4) as a range
$this->firstOffset = $range1[0];
$this->secondOffset = $range2[0];
$firstEnd = array_pop($range1);
$secondEnd = array_pop($range2);
if( $firstEnd < $this->firstOffset ||
$secondEnd < $this->secondOffset )
throw new Exception('End lower than start in at least one range.' .
"Given [{$this->firstOffset}, $firstEnd] " .
"and [{this->secondOffset}, $secondEnd]");
$this->firstWidth = 1 + $firstEnd - $this->firstOffset;
$this->secondWidth = 1 + $secondEnd - $this->secondOffset;
$this->permutations = $this->firstWidth*$this->secondWidth; // with no overlap
$this->reset();
}
public function next()
{
if( $this->used != $this->permutations ) {
$random_permutation = rand(0,$this->permutations);
if( ($found = gmp_scan0($this->mask, $random_permutation)) == $this->permutations )
$found = gmp_scan0($this->mask, 0);
$this->used++;
gmp_setbit($this->mask, $found);
$first = $found % $this->firstWidth + $this->firstOffset;
$second = floor($found / $this->firstWidth) + $this->secondOffset;
return array( $first, $second);
}
throw new Exception("No more pairs available");
}
public function reset()
{
$this->mask = gmp_init(0);
$this->used = 0;
// Make simple variables of
// so that the original ranges
// are [$s1, $e1] and [$s2, $e2]
$s1 = $this->firstOffset;
$s2 = $this->secondOffset;
$e1 = $s1 + $this->firstWidth - 1;
$e2 = $s2 + $this->secondWidth - 1;
if( $s1 <= $e2 && $s2 <= $e1 ) { // if the rhe ranges overlap
$is = max($s1, $s2); // get the start of intersect
$ie = min($e1, $e2); // get the end of intersect
$this->used += 1 + $ie - $is; // the count of emelemts in common
$i1end = $ie - $s1 + 1; // calculate the end element for first range
for( $i1 = $is - $s1, // initialize range1 by reducing by s1
$i2 = $is - $s2; // initialize range2 by reducing by s2
$i1 < $i1end; // as long as $i1 is lower than the calculates stop
$i1++, $i2++) { // increment both indexes by one for next iteration
// body of for loop sets bitmask where both
// would get the same value when offset is added
gmp_setbit( $this->mask, $i1 + $i2 * $this->firstWidth );
}
}
}
}
//$g = new Gen(range(1,6), range(1,10));
//$g = new Gen(array(3,4), array(5));
$g = new Gen(array(1,6), array(3291,83901));
try {
while(1) {
print implode(", ", $g->next()) . "n";
}
} catch ( Exception $e ) {
print "Done!n";
}
2