I have read in the documentation of Python's random module that the number of permutations of a large list can exceed the period of the random number generator:
random.shuffle(x)
Shuffle the sequence x in place.
To shuffle an immutable sequence and return a new shuffled list, use sample(x, k=len(x)) instead.
Note that even for small len(x), the total number of permutations of x can quickly
grow larger than the period of most random number generators. This implies that most
permutations of a long sequence can never be generated. For example, a sequence of
length 2080 is the largest that can fit within the period of the Mersenne Twister
random number generator.
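As a sanity check on the figure quoted above, here is a minimal sketch that finds the largest n for which n! still fits within the Mersenne Twister period of 2**19937 - 1. The names MT_PERIOD and largest_shufflable_length are my own, not part of the random module, and the check only compares counts; it says nothing about how evenly the reachable permutations are distributed.

```python
# Period of the Mersenne Twister (MT19937) generator used by the random module.
MT_PERIOD = 2**19937 - 1


def largest_shufflable_length(period):
    """Return the largest n such that n! <= period (hypothetical helper)."""
    n = 1
    perms = 1  # running value of n!
    # Grow n while the next factorial still fits within the period.
    while perms * (n + 1) <= period:
        n += 1
        perms *= n
    return n


print(largest_shufflable_length(MT_PERIOD))  # prints 2080
```

Python integers have arbitrary precision, so the exact factorials can be compared directly without logarithms.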
As I understand it, this means that a program whose randomness comes from shuffling a large list may be biased, since some permutations can never be generated. This is the reason I am asking in the first place: I am seeing large deviations in the measurements from a number of experiments.
I wonder whether this problem is inherent in sampling from large populations in general. Does the limited period of the random number generator also affect the sampling functions, such as random.choices (weight-based) or random.sample, when the population is large?