Lists

Open Menu Close Menu

Terms and Conditions
Lists hosted on this site
Email the Postmaster
Tips for posting to public mailing lists

Re: Random number generator without duplicates?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Random number generator without duplicates?

Subject: Re: Random number generator without duplicates?
From: Bill Briggs <email@hidden>
Date: Wed, 18 Apr 2001 09:41:10 -0300

... late to the party, but I can't listen to this nonsense any more

At 9:54 AM -0400 17/04/01, email@hidden wrote:

How can I get a list of random numbers that does not include any duplicates? I would want to use these parameters: six random numbers from 1 to 56, as a list or as six variables, no duplicate numbers.

That's a singularly peculiar request. You cannot have a randomly generated list of numbers if you impose such a condition. Random lists of length greater than 1 element will at times produce duplicates (or other multiple instances) in the list. It's a fact. Once you impose your condition of uniqueness of elements, the list you want is *not* a randomly generated list.

And later on, Bryan Harris chimed in:

I notice after a few trials that the random number function is generating
duplicates after only a few iterations.

So just out of curiosity I generated 56 random numbers "from 1 to 56". Here are the results:

* 39% (22) of the values were never generated
* 28% of the values were generated once
* 1 of the values was generated 4 times

Isn't the integer random number generator supposed to follow the same rules as the real, i.e. generate every value once before repeating?

Absolutely not. If that were the case, it would not be a random number generator. As that late professor Balusubramanian used to say, "follow me closely".

In a randomly generated sequence (like the toss of a fair die, the flip of a coin, the spin of a fair roulette wheel) each event is an independent event. It is not related to any past events (previous tosses of the die), nor does it affect any future events. The whole idea of whether events are independent or not is part of the core of probability theory. And the goal of random number generators is always to produce a result that is independent of all previous results. Once you impose a condition on the next event (e.g., that it can't be one of the previous events), then it is manifestly no longer random.

A random number generator that generates reals will behave in the same way; viz. all values are possible with each new number generated. Duplicates are possible, though on the real number line the density of possible values is so high that duplicates have a low likelihood of occurrence except as artifacts of machine generation by pseudo-random processes.

After which Chris Espinosa wrote:

> No, it's not, because that wouldn't be random. Randomness is "every
time I ask for a number from 1 to 10, I have P=0.1 chance of getting any
number from 1 to 10". If a random number generator were constructed as
you propose, if it generated, say, 5 on the first iteration, then on the
second iteration the probability of generating any number would be
P=0.1111111... except for 5, where it would be P=0. Then if it
generated the number 3, the probability on the third pass would be P=0
for 3 and 5 and P=0.2 for all others, until it got to the tenth cycle,
where it would be P=1 for a certain number and P=0 for all others.

> That's not random.

Which is completely correct.

Then M.S.R.F. Schonewille opined:

Sorry, but that's not true. You confuse the original sample (which is not
random but stochastic) and the random selection from this sample. Compare
choosing numbered pieces of papers from a hat. The last number will have a
probability P=1 of being chosen, the sequence still being perfectly random.

Sorry, but you're the one in error and Chris is perfectly correct. Selecting randomly from a fixed pool of items without replacement is NOT the same process as generating random numbers from within a range. Selecting numbers from a hat exhibits a *conditional probability* because the probability of each subsequent event is affected by all of the previous events. Random processes produce results where each result is from an *independent* event. As Chris pointed out in a completely clear fashion, the selection process that you are focused upon is not random but conditional. All strenuous exhortations to refute this are destined to fail. I urge you to consult some decent reference book if you are in doubt. (two excellent references given below)

It is absolutely NOT necessary to invoke a discussion of stochastic processes to clear up this issue. Stochastic analysis is necessary in the analysis of ensembles of functions, x(t) in which instances one needs to look at joint statistics in both the time domain and the ensemble domain (statistics at fixed values of t across the ensemble). Stochastic processes is beyond the scope of the current discussion and an unnecessary obfuscation.

Then Bryan Harris wrote (also in response to CE):

The problem is not this at all. The problem is: How do you get the
computer to generate a value x with probability p? The computer has no
concept of random (except for maybe the user's train of thought =), so the
algorithm must generate pseudo-random values that match the desired
distribution as closely as possible.

The error here is that the distribution of outcomes only levels out when the number of experiments in the space approaches infinity. If you want to see the effect, look at the distribution stats for numbers drawn in the 6/49 lottery that has been running in Canada for many years. This is not a bad looking distribution (that is, not severely uneven in terms of draws for each of the numbers), but is far from level yet, though much more *even* looking than it was several years ago when they first started publishing the stats.
http://lottery.sympatico.ca/cgi-bin/lottery.cgi

If you really want to see what the random number generator in AppleScript looks like, then pick a range of values and create a histogram over a huge number of experiments and see what you get. Check it at intervals to see what is happening. This is the way to see if there is a bias in the generator (introduced by it's pseudo-random process).

When done correctly with multipliers and the mod function, the generated
numbers have no obvious sequence, and have a full period, meaning that every value is generated once before repeating.

But that is not a random number generator (ignoring any machine generation problems)!!! The sequence you are talking about is based on conditional probabilities, and the random number generator should NOT operate that way. Random numbers are, as I've said, generated as the result of *independent* events. When you toss dice, they don't know what they were the last time they were tossed. They produce random results, and duplicates are certainly possible.

And he went on to opine (also in response to CE):

Just to clear a few things up...

Which it didn't

> There is no such rule.

Oh, no? The measure of the quality of a random variate generator is based
on its performance with regard to exactness (how closely the generated
values follow the specified distribution)

That only matters over VERY LARGE numbers of experiments. We're talking billions of data points. Think infinity. It does not apply to 56 values generated once in the range of 1 through 56. You are fixed on a SELECTION process, not a RANDOM GENERATION process.

If your uniform random number generator is not generating every number in
the range before repeating, then it does not have a full period, and it is
not *exact* in producing values from a true uniform distribution. Good
algorithms do this. I have one that's not terribly difficult if you need
it.

This is manifeltly false. Your algorithm is *not* a random number generator. Each new event (number generated) by your algorithm is tied to past events. Check your own code. I don't even have to see it. There are tests performed in there to see what has gone before and eliminate them as possibilities. This does not constitute a RANDOM generator, it's a random selector. It's random selection from a fixed pool, and the subject is combinations and permutations, NOT random variables.

In your defense, though, you may not have a choice in your implementation.
I'm guessing that your algorithm is simply scaling and rounding numbers from a U(0,1) distribution. Random variate algorithms generate streams of random numbers, and thus either AS or the OS is probably already generating a U(0,1) stream. There's not much point setting up a new stream every time a script asks for a random number between 1 and 56 since AS cannot expect that the next request will be identical.

If the generator reliably spat out every number from 1
to
n in just n iterations, it wouldn't really be random.

Sure it would. Random order, not random distribution as it's giving us now.

No, it wouldn't. For the Nth time, random means that it's an independent event, not tied by conditional probability to previous events.

Then Ed Stockly wrote:

It is a
command that will simply generate a single random number every time it's
called and one call has no relation to the next.

And Ed is right. This is the essence of the thing. Independent events.

It is not a "uniform random number generator"

Whatever that might be. It's an oxymoron as far as I'm concerned.

>> Random order, not random distribution as it's giving us now.

Currently it is now giving us random numbers. Both the distribution and the order of those numbers are, theoretically, random. Each number having no relation to the number(s) that preceed or follow. That's all it's designed to do.

Truly random numbers do repeat because every time any number is generated
every number within the range should have an equal probability of being
generated each time a number is generated.

And Ed is absolutely correct.

And then Michelle Steiner wrote:

in response to Bryan Harris <email@hidden> who wrote:

>Isn't the integer random number generator supposed to follow the same rules
as the real, i.e. generate every value once before repeating?

If it did that, it wouldn't be random. True randomness means that every
iteration is distinct from all other iterations. A true random-number
generator is capable of generating the same number 56 times out of 56,
for instance. It would happen once in a zillion times, give or take a
few dozen.

And Michelle is telling you the same thing that I'm saying and that Chris is saying and the Ed is saying. And she's right.

Assertions made by some on this list that random numbers should cover a range before repeating are absolutely incorrect. A random number generator's job is to generate a number within the range each time it is called to do so. And each time it runs it should generate any number within the range with the exact same probability as any other number, irrespective of the history of past numbers generated. This is the essence of randomness. Independent, not conditional events.

If you doubt any of the above, I suggest that you get yourself to the Science library of your nearest university and find a copy of "Probability, Random Variables, and Stochastic Processes" by Athanasios Papoulis, or the classic "Stochastic Processes" by Doob and start reading. The former is more accessible to the novice than the latter but has a very high information density and it not bathroom reading, however, perseverance is rewarded and you will emerge from the reading much enlightened about the concepts or conditional probabilities, joint probabilities, random processes, stochastic processes, etc. We use these texts in the graduate school as foundation courses for the statistical theory of communication.

- web

Follow-Ups:
- Re: Random number generator without duplicates?
  - From: Paul Skinner <email@hidden>

References:
	>Random number generator without duplicates? (From: email@hidden)

Prev by Date: Can't set Palm Desktop custom fields
Next by Date: Re: more info for person in Now Up-to-Date event
Previous by thread: Re: Random number generator without duplicates?
Next by thread: Re: Random number generator without duplicates?
Index(es):
- Date
- Thread