gjrand boasting page.
Gjrand is free software (GPL v2 or v3) hosted at SourceForge.
Pseudo-random number generator.
Quality.
- For any seed the period is guaranteed to be a multiple of 2^64.
- The sequence for one seed will not "run into" the sequence for another
for at least 2^64 steps.
- State size is 256 bits. This is small enough to not be a hassle in
most environments, but big enough to not compromise quality. (For instance,
certain non-uniformities tend to show up after using about the cube root
of all possible states of the generator. The cube root of 2^64 is only
about 2.6 million outputs, which any serious run will exceed; the cube root
of 2^256 is about 2^85, far beyond any feasible run. So this can be an
actual practical problem with 64 bits, but not with 256.)
- Uniform double precision floating point generator gives 52 pseudo-random
bits, not 32 or 31 or even 24 as some other generators do. (What is the
point of using double precision if you're going to do that?) A sketch of
the general technique appears after this list.
- Not linear with respect to bitwise XOR. Does not have any other simple
algebraic structure. (Mostly you don't care about this, but if your
simulation has exactly the same algebraic structure as the generator,
then you care a lot.)
- Uniform generator passes the "BigCrush" random number test battery in the
TestU01 (v 1.2.3) package of L'Ecuyer and Simard. This is probably the
toughest of the well known test suites out there.
- Also passes diehard, tuftests, ent (1e10 bytes), FIPS 140-2 (800 failures
in 1e6 blocks, which as far as i can tell is about right).
- Passes my own test suites in recent runs: mcp --huge for uniform bits,
mcpf --huge for uniform floating point, and bigpdb --huge for normally
distributed numbers.
- Both design and testing aim for different seeds getting different
sequences with an extremely good degree of statistical independence.
- Design and testing have aimed at achieving extremely good statistical
properties for simulation, Monte-Carlo algorithms and so on.
- Some people put a large emphasis on pseudo-random generators having
a sound theoretical basis. Me, less so, because i've seen too many
"theoretically sound" generators that were hopelessly bad. For gjrand,
i've concentrated on empirical methods with just a tiny amount of theory
as a guide in the hope that there won't be a disaster lurking just beyond
the longest test run i've done.
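To make the 52-bit point concrete, here is a minimal sketch, in C, of the
general technique for turning one 64-bit pseudo-random word into a double
strictly inside (0, 1) that keeps all 52 random bits. It illustrates the
idea only, it is not gjrand's actual code, and u64_to_double is a name made
up for the example.

    #include <stdint.h>
    #include <stdio.h>

    /* Map one 64-bit pseudo-random word to a double strictly inside (0, 1)
       using its top 52 bits.  The result is (2k + 1) / 2^53, which is exactly
       representable in double precision, so all 52 bits survive. */
    static double u64_to_double(uint64_t x)
    {
        uint64_t k = x >> 12;                                  /* 52 random bits */
        return ((double)k + 0.5) * (1.0 / 4503599627370496.0); /* divide by 2^52 */
    }

    int main(void)
    {
        /* Any 64-bit words will do to demonstrate the mapping. */
        printf("%.17g\n", u64_to_double(0x0123456789abcdefULL));
        printf("%.17g\n", u64_to_double(0xffffffffffffffffULL));
        return 0;
    }

By contrast, a generator that just scales a 32-bit integer can put at most
32 random bits into the mantissa.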
Features.
- Threadsafe and re-entrant.
- Several seeding functions available for repeatable sequences.
- 4-parameter seeding function. This could be useful in distributed
processing, for instance: one parameter is the seed, another is the
work unit number, and two are spares in case you think of something else.
Changing any parameter even by a small amount should result in a different,
highly independent sequence. (A usage sketch appears after this list.)
- Designed so that sequences should be mostly repeatable across different
types of CPU including big-endian vs little-endian. Even different floating
point implementations should only result in small differences. However
i now only have one machine for testing, so let me know if you see trouble.
- Unpredictable seeding function using operating system data.
(But not unpredictable enough for security related uses.)
- Uniform integer variates, 8, 31, 32, or 64 bit.
- Uniform integer variates in a range [ 0 , n ) (one common bias-free way
of doing this is sketched after this list).
- Other integer: Poisson, binomial, geometric, hypergeometric, multiple dice.
- Array shuffle of the numbers [ 0 , n ).
- Large shuffle returning one number at a time, without having to store
all the numbers in memory at once.
- Uniform floating point in (0, 1), single and double precision.
(Also C type "long double" which is extended precision on some machines but
only double on others.)
- Other floating point: normal, exponential, chi-squared, Student t, gamma,
Pareto, Weibull, spherical.
- Version 2.3.0.0 offers a completely different generator with the same
API in case you want to rerun a simulation to be sure poor random number
quality is not an issue.
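As a usage sketch for the 4-parameter seeding in a distributed job: the code
below shows the pattern, but the type prng_state and the function prng_seed_4
are placeholders invented for the illustration (with a trivial stand-in body),
not the real gjrand names, and the real seeding presumably does far more
mixing than this. Check the headers in the package for the actual API.

    #include <stdint.h>
    #include <stdio.h>

    /* Placeholder state and 4-parameter seeder, for illustration only. */
    typedef struct { uint64_t s[4]; } prng_state;

    static void prng_seed_4(prng_state *g, uint64_t seed, uint64_t work_unit,
                            uint64_t spare1, uint64_t spare2)
    {
        g->s[0] = seed;   g->s[1] = work_unit;   /* stand-in body, not gjrand */
        g->s[2] = spare1; g->s[3] = spare2;
    }

    int main(void)
    {
        prng_state g;
        uint64_t run_seed  = 12345;   /* the same for the whole run */
        uint64_t work_unit = 7;       /* different for every job    */

        /* One parameter carries the run seed, one the work unit number, and
           the two spares stay zero until you think of a use for them.  Each
           distinct parameter combination should give a highly independent
           sequence. */
        prng_seed_4(&g, run_seed, work_unit, 0, 0);
        printf("seeded work unit %llu\n", (unsigned long long)work_unit);
        return 0;
    }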
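The [ 0 , n ) bullet hides a small subtlety: taking a raw word modulo n makes
some results slightly more likely than others unless the offending values are
rejected. The sketch below shows one common way to do that. It is a generic
illustration, not necessarily how gjrand does it, and next_u32 is a stand-in
for any uniform 32-bit source.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in 32-bit source so the sketch compiles on its own.  rand() is a
       poor generator and is not part of gjrand. */
    static uint32_t next_u32(void)
    {
        return ((uint32_t)rand() << 16) ^ (uint32_t)rand();
    }

    /* Unbiased uniform integer in [0, n), n >= 1, by rejecting the values at
       the top of the 32-bit range that would otherwise favour small results. */
    static uint32_t uniform_below(uint32_t n)
    {
        uint32_t limit = UINT32_MAX - (UINT32_MAX % n);
        uint32_t x;
        do {
            x = next_u32();
        } while (x >= limit);
        return x % n;
    }

    int main(void)
    {
        for (int i = 0; i < 5; i++)
            printf("%u ", uniform_below(6));   /* like a die showing 0..5 */
        printf("\n");
        return 0;
    }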
Speed.
Version 3.3.0 on a 64 bit AMD Athlon:
- uniform random bytes
- 2.2 clock cycles per byte. (If you don't care about consistent byte
ordering across platforms you can get to about 1 clock cycle per byte by
using C's weak typing to put integers straight into a byte array; a sketch
appears at the end of this section.)
- uniform 32 bit integers
- 16 clock cycles per number one at a time.
- 4 clock cycles per number in bulk.
- uniform 64 bit integers
- 16 clock cycles per number one at a time.
- 7 clock cycles per number in bulk.
- uniform single precision reals in (0, 1)
- 8 clock cycles per number in bulk.
- uniform double precision reals in (0, 1)
- 18 clock cycles per number one at a time.
- 12 clock cycles per number in bulk.
- uniform 80 bit extended precision reals in (0, 1)
- 61 clock cycles per number one at a time.
- standard normal variates double precision real
- 66 clock cycles per variate one at a time.
- 14 clock cycles per variate in bulk.
This is probably not the fastest pseudo-random number library in existence,
but i claim it's faster than many, and fast enough for most purposes.
It's slower on 32 bit computers. I don't have measurements for recent versions,
but 1.5 to 5 times slower for various calls seems plausible.
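For the curious, the byte-array trick mentioned above looks roughly like this.
The sketch is generic: xorshift64 is used only as a self-contained stand-in
for a bulk 64-bit source and is not gjrand.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Stand-in 64-bit source (a plain xorshift64 step), not gjrand. */
    static uint64_t state = 0x123456789abcdef1ULL;
    static uint64_t next_u64(void)
    {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        return state;
    }

    /* Fill a byte buffer by storing whole 64-bit words into it.  The byte
       order inside each word is whatever the CPU uses, so the stream differs
       between big- and little-endian machines, but the per-byte copy loop
       disappears. */
    static void fill_bytes(unsigned char *buf, size_t len)
    {
        size_t i = 0;
        while (i + 8 <= len) {
            uint64_t x = next_u64();
            memcpy(buf + i, &x, 8);   /* compiles to one 8-byte store */
            i += 8;
        }
        if (i < len) {                /* tail of fewer than 8 bytes */
            uint64_t x = next_u64();
            memcpy(buf + i, &x, len - i);
        }
    }

    int main(void)
    {
        unsigned char buf[1000];
        fill_bytes(buf, sizeof buf);
        return buf[0] & 1;            /* use the result so it is not optimised away */
    }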
Limitations.
Don't use gjrand for high security applications such as cryptography,
or as the random generator for gambling for actual money. The generator
algorithm, seeding methods, API, and coding style are all wrong for
such use.
Random number test suites.
The gjrand package also contains statistical testing programs that can be used
to test the gjrand library, or other supposedly random data:
- A test suite for uniform bits
- A test suite for uniform floating point numbers
- A test suite for normally distributed numbers
- Minor tests for other gjrand features
I like to think the first three are quite tough and reasonably fast for the
amount of statistical power they have. In particular, the first two reject
some well known pseudo-random number generators that pass some other
well known test suites and retain good reputations.
These suites read raw binary numbers from standard input. (On most systems
the best way is to write a program that writes binary numbers from your
chosen generator to standard output and then use a pipe. Examples are
provided.) I like this because it can test generators written in
different languages or programming paradigms without needing any common
ground other than simple binary I/O.
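For example, a generator program of the kind described above can be as small
as this. It writes raw 64-bit outputs of splitmix64 (a well-known generator,
standing in here for whatever generator you actually want to test) to standard
output as binary:

    #include <stdint.h>
    #include <stdio.h>

    /* splitmix64, used purely as an example generator under test. */
    static uint64_t state = 42;
    static uint64_t splitmix64(void)
    {
        uint64_t z = (state += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        return z ^ (z >> 31);
    }

    int main(void)
    {
        uint64_t buf[1024];
        for (;;) {   /* keep writing until the reading test suite has enough */
            for (int i = 0; i < 1024; i++)
                buf[i] = splitmix64();
            if (fwrite(buf, sizeof buf[0], 1024, stdout) != 1024)
                return 0;
        }
    }

It would then be piped into one of the suites, for example
./mygen | ./mcp --huge (the exact paths depend on where the suite binaries
end up on your system).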