CS253 Random Numbers
Philosophy
“Computers can’t do anything truly random. Only a person can do that.”
- Stop trying to prove your superiority.
- If you believe that you have something special that distinguishes you
from machines, you’re talking religion, not CS.
- My dog is pretty random.
- You’re somewhat predictable.
- An online rock-paper-scissors
program beats people 60% of the time over more than a million games,
because people are lousy at being random.
Old Stuff
There are several C random number generators,
of varying degrees of standardization,
such as rand() (ISO C) and random() and drand48() (POSIX).
They still work ok, but avoid them for new C++ code.
They mix up generation and distribution something terrible.
Traditional Method
Traditional random number generators work like this:
unsigned long n = 1;
for (int i=0; i<5; i++) {
n = n * 16807 % 2147483647;
cout << n << '\n';
}
16807
282475249
1622650073
984943658
1144108930
- It’s fast, simple, and good enough for many tasks. However …
- What happens if n is zero?
- What number always follows 16807?
- How many possible states does this RNG (Random Number Generator) have?
Overview
- In C++, random numbers have:
- Generators
Generate uniformly-distributed random integers,
typically zero or one to a big number.
- Distributions
Take uniformly-distributed random integers, and transform them into
other distributions with different ranges.
- Examples:
- Picking a card (uniform, but discrete)
- Rolling 3d6 (bell-shaped, but discrete)
- Human height (bell-shaped, continuous)
Generators
Default Engine
Define a random-number generator, and use () to generate a number.
This is not a function call, because gen is an object, not a function. It’s operator().
That sequence looks familiar …
#include <random>
#include <iostream>
using namespace std;
int main() {
default_random_engine gen;
for (int i=0; i<5; i++)
cout << gen() << '\n';
}
16807
282475249
1622650073
984943658
1144108930
I won’t bother with the #includes in subsequent examples.
Mersenne Twister
- Here’s a different, 64-bit generator.
- Use .min() and .max() to find out the range of a given generator.
mt19937_64 gen;
cout << "min=" << gen.min() << '\n'
<< "max=" << gen.max() << "\n\n";
for (int i=0; i<5; i++)
cout << gen() << '\n';
min=0
max=18446744073709551615
14514284786278117030
4620546740167642908
13109570281517897720
17462938647148434322
355488278567739596
Ranges
Not all generators have the same range:
mt19937_64 mt;
minstd_rand mr;
cout << "mt19937_64: " << mt.min() << "…" << mt.max() << '\n'
<< "minstd_rand " << mr.min() << "…" << mr.max() << '\n';
mt19937_64: 0…18446744073709551615
minstd_rand 1…2147483646
Hey, look! Zero is not a possible return value for minstd_rand.
Save/Restore
A generator can save & restore state to an I/O stream:
ranlux24 gen;
cout << gen() << ' ';
cout << gen() << endl;
ofstream("state") << gen;
system("wc -c state");
cout << gen() << ' ';
cout << gen() << '\n';
ifstream("state") >> gen;
cout << gen() << ' ';
cout << gen() << '\n';
15039276 16323925
209 state
14283486 7150092
14283486 7150092
endl! Isn’t that a sin? 😈 🔥
Not here: it flushes cout before system() runs wc, so the output appears in the right order.
True randomness
random_device a, b, c;
cout << a() << '\n'
<< b() << '\n'
<< c() << '\n';
3417977716
120288297
66348059
random_device is, ideally, truly random, and not pseudo-random.
- Intel computers have an RDRAND instruction.
- It might depend on random things like human typing intervals,
network packets arrival times, or radioactive decay.
- If true randomness isn’t available, it resorts to pseudo-random numbers.
- It could pause waiting for randomness to become available.
- Use it sparingly.
Cloudflare
The internet infrastructure company Cloudflare uses a unique source of randomness: a camera pointed at a wall of lava lamps.
Seeding
minstd_rand a, b, c(123);
cout << a() << ' ' << a() << '\n';
cout << b() << ' ' << b() << '\n';
cout << c() << ' ' << c() << '\n';
48271 182605794
48271 182605794
5937333 985676192
- Great—we can “seed” the random number generator with a value.
- This way, we can reproduce our pseudo-random sequences.
- Consider random testing: we want to be able to reproduce the sequence
if we find an error.
- How to choose the random seed?
- It should probably be … random.
Seed with process ID
auto seed = getpid();
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
489033501
995878947
651212542
1962473743
771411889
- You can seed with your process id.
- OK for casual use, but the seed is easily guessed.
- Process IDs are often 15- or 16-bit quantities (some systems allow more), so there are
often only 32768 or 65536 of them.
Somebody could easily try them all.
Seed with time
// seconds since start of 1970
auto seed = time(nullptr);
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1246015865
1857317886
1444380150
1470137148
1393155993
- You can seed with a time-related value.
- Two runs may occur within the same second,
and so produce identical random sequences.
- OK for casual use, but the seed is easily guessed.
- There are only 86,400 seconds in a day.
Somebody could easily try them all.
Y2038
int biggest = 0x7fffffff;
time_t epoch = 0,
now = time(nullptr),
end = biggest,
endp1 = biggest + 1;
cout << "epoch:" << setw(12) << epoch << ' ' << ctime(&epoch);
cout << "now: " << setw(12) << now << ' ' << ctime(&now);
cout << "end: " << setw(12) << end << ' ' << ctime(&end);
cout << "end+1:" << setw(12) << endp1 << ' ' << ctime(&endp1);
epoch: 0 Wed Dec 31 17:00:00 1969
now: 1727408611 Thu Sep 26 21:43:31 2024
end: 2147483647 Mon Jan 18 20:14:07 2038
end+1: -2147483648 Fri Dec 13 13:45:52 1901
I hope that nobody’s still using 32-bit signed time representations by then!
Seed with more accurate time
Nanoseconds make more possibilities:
auto seed = chrono::high_resolution_clock::now()
.time_since_epoch().count();
cout << "Seed: " << seed << '\n';
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
Seed: 1727408611157651788
1414666483
1680793587
1655054417
445127313
1166637588
- There are 86,400,000,000,000 nanoseconds in a day.
Better Seeding
- Many generators have more than 32 or 64 bits of state.
- Therefore, you can seed them with more than 32 or 64 bits.
- If you’re doing something very important, and somebody guessing
your seed, and hence predicting your sequence, would be catastrophic:
- on-line poker
🂺 🂻 🂽 🂾 🂱
- encryption of military communications
⚔ 🔫 💣 🥆 ☢
- encrypted email re: extra-marital affairs 💔
- That’s beyond the scope of this discussion.
Seed with random_device
random_device gen;
auto seed = gen();
minstd_rand0 a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
220488652
1343483089
1277212265
2007486090
703136613
You can seed with random_device, if you know that it’s truly random.
Not good enough.
- Great, so we know how to generate a number 1…2,147,483,646
or perhaps 0…18,446,744,073,709,551,615
- How often do we want to do that?
- Sometimes, we want integers with different ranges.
- Or, perhaps we want floating-point numbers.
- Maybe spread out linearly, or a bell-shaped curve, Poisson, etc.
- This is a job for a distribution.
Distributions
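- Uniform:
uniform_int_distribution, uniform_real_distribution
- Bernoulli (yes/no) trials:
bernoulli_distribution, binomial_distribution, geometric_distribution, negative_binomial_distribution
- Piecewise distributions:
discrete_distribution, piecewise_constant_distribution, piecewise_linear_distribution
- Related to Normal distribution:
normal_distribution, lognormal_distribution, chi_squared_distribution, cauchy_distribution, fisher_f_distribution, student_t_distribution
- Rate-based distributions:
poisson_distribution, exponential_distribution, gamma_distribution, weibull_distribution, extreme_value_distribution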
uniform_int_distribution
auto seed = random_device()(); //❓❓❓
mt19937 gen(seed);
uniform_int_distribution<int> dist(1,6);
for (int y=0; y<10; y++) {
for (int x=0; x<40; x++)
cout << dist(gen) << ' ';
cout << '\n';
}
6 3 2 4 3 2 1 6 6 6 2 4 2 5 1 2 2 2 6 6 2 4 3 4 1 1 3 1 2 3 6 5 5 3 2 6 6 1 3 2
6 4 3 5 5 1 6 2 5 3 1 2 1 3 2 2 1 4 3 1 6 3 3 3 2 5 6 2 5 6 1 5 4 6 1 5 6 1 5 3
6 3 3 4 1 1 2 6 6 6 1 5 3 6 6 3 5 5 3 5 1 1 2 6 4 5 1 1 3 4 4 6 2 6 5 3 1 6 3 3
6 4 4 6 5 5 6 4 5 1 2 5 4 3 4 5 6 5 2 3 2 2 2 2 1 4 6 3 3 6 6 2 1 5 3 5 2 3 5 1
4 5 3 6 6 6 5 1 2 3 6 6 1 3 4 4 2 4 2 1 5 3 5 6 4 3 4 6 2 5 1 6 4 5 4 3 3 2 4 3
6 3 2 4 2 5 2 4 2 4 3 4 1 5 2 2 5 6 3 6 1 6 2 4 4 4 4 2 2 2 2 5 6 2 4 3 5 1 2 4
6 2 6 4 3 3 5 5 6 6 2 5 2 4 5 4 2 4 2 1 3 4 4 4 6 3 5 1 2 1 6 5 4 6 6 5 3 3 6 4
1 6 6 3 6 3 3 2 1 1 4 3 2 5 3 5 2 1 3 3 2 3 1 4 4 4 6 2 3 6 1 3 2 1 1 2 1 6 5 5
2 5 2 2 3 6 4 1 5 5 2 3 6 2 2 6 1 5 6 1 6 6 6 3 1 6 5 4 6 1 2 1 2 1 2 5 3 6 3 3
3 1 5 5 1 4 2 5 5 6 3 1 6 3 6 1 3 4 2 4 3 5 1 4 3 2 2 3 6 2 6 3 2 4 4 1 2 6 1 2
uniform_real_distribution
auto seed = random_device()();
ranlux48 gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << dist(gen) << ' ';
cout << '\n';
}
20.205 18.057 22.740 18.680 22.732 23.591 22.994 18.863 21.384 22.950
19.033 20.712 20.565 22.404 24.721 23.656 18.910 20.327 19.682 23.779
21.224 20.873 24.361 20.368 20.997 23.712 24.395 24.333 20.278 19.673
21.394 24.348 23.610 21.715 19.618 24.971 19.945 20.891 20.548 18.033
18.046 18.975 22.435 22.671 21.979 23.963 21.993 22.091 21.024 18.564
19.856 18.540 21.486 20.069 21.607 24.192 20.839 24.342 18.084 22.911
21.767 23.209 18.477 21.597 20.738 24.223 18.106 20.194 20.856 19.037
21.257 21.893 20.061 19.481 23.819 24.533 23.678 22.362 19.754 21.359
21.443 23.363 18.642 21.176 19.998 21.008 21.888 24.503 24.138 18.268
19.587 20.581 20.140 24.360 18.803 22.065 22.541 19.078 21.169 24.960
OMG, what’s that <> doing there?
Binding
auto seed = random_device()();
minstd_rand gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
auto r = bind(dist, gen);
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << r() << ' ';
cout << '\n';
}
19.985 21.732 22.139 24.191 18.664 21.413 21.329 19.330 21.333 22.173
21.294 18.157 20.528 23.581 22.400 23.974 21.623 18.377 23.404 19.631
24.215 21.020 18.483 24.849 19.792 18.246 24.601 19.005 19.495 18.656
21.120 21.304 21.270 23.411 21.641 22.482 21.785 18.699 18.236 23.037
19.085 21.738 22.340 21.043 19.188 18.813 20.392 19.305 19.397 23.225
18.771 19.460 18.568 19.701 24.520 20.703 24.725 18.989 22.672 19.409
21.436 18.693 23.820 18.694 19.747 20.569 22.861 20.900 23.739 21.441
24.540 18.708 19.466 20.084 24.968 24.929 20.038 19.069 24.726 19.809
24.319 21.057 22.561 23.949 21.628 18.304 19.273 20.320 20.729 19.852
21.254 20.890 20.842 24.207 19.353 24.862 21.337 22.466 23.906 19.332
Binding with temporaries
auto seed = random_device()();
auto r = bind(uniform_real_distribution<>(18.0, 25.0), mt19937(seed));
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << r() << ' ';
cout << '\n';
}
19.887 21.278 24.741 20.644 23.119 18.069 21.328 20.468 21.417 21.862
23.793 21.062 19.758 19.576 21.133 20.377 22.687 22.312 22.283 18.659
18.492 19.531 23.947 18.845 24.108 24.828 19.001 19.990 22.686 20.580
18.574 18.073 20.314 19.013 23.342 23.973 19.351 22.515 18.574 22.324
18.054 19.249 19.380 20.654 19.810 21.931 18.249 24.864 20.942 23.696
20.808 22.607 21.895 19.973 23.584 21.264 19.327 18.427 25.000 21.491
24.587 24.840 19.923 23.355 19.341 20.194 19.468 23.262 21.871 19.287
19.607 23.197 24.726 19.335 24.323 23.707 23.303 18.283 23.854 24.015
21.486 20.313 20.584 21.404 23.899 19.278 19.960 19.158 18.806 20.186
21.239 21.497 23.970 21.778 21.275 18.291 20.759 20.741 22.032 19.212
Boolean Values
Yield true 42% of the time:
auto seed = random_device()();
constexpr int nrolls = 1'000'000;
auto r = bind(bernoulli_distribution(0.42), knuth_b(seed));
int count=0;
for (int i=0; i<nrolls; i++)
if (r())
count++;
cout << "true: " << count*100.0/nrolls << "%\n";
true: 42.0106%
Histogram
auto seed = random_device()();
mt19937_64 gen(seed);
normal_distribution<> dist(21.5, 1.5);
auto r = bind(dist, gen);
map<int,int> tally;
for (int i=0; i<10000; i++)
tally[r()]++;
for (auto p : tally)
cout << p.first << ": " << string(p.second/100,'#') << '\n';
16:
17:
18: ###
19: ###########
20: #####################
21: ##########################
22: ####################
23: ###########
24: ###
25:
26:
27:
Passwords
random_device rd;
auto seed = rd();
ranlux24 gen(seed);
uniform_int_distribution<char> dist('a','z');
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += dist(gen);
cout << "Password: " << pw << '\n';
}
Password: ndogjwlipihg
Password: hebpuzcfpslo
Password: gefpejtkgdxv
Password: epfjqounqxpb
Password: idwqlbfgijgk
Password: uzcwmepcnksv
Password: isyslyhivkos
Password: kwwpbnqiyiau
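Even though we’re using uniform_int_distribution, which has int right there in its name, it’s uniform_int_distribution<char>, so we get characters.
Think of them as 8-bit integers that display differently.
Caution: the C++ standard only defines uniform_int_distribution for short, int, long, long long, and their unsigned versions. char is not on that list, so this is technically undefined, and some standard libraries (Microsoft’s, for one) refuse to compile it.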
Passwords
With binding:
auto seed = random_device()();
ranlux24 gen(seed);
uniform_int_distribution<char> dist('a','z');
auto r = bind(dist, gen);
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += r();
cout << "Password: " << pw << '\n';
}
Password: rfpyhotxlayi
Password: zvtifjedjfay
Password: opgwaxfnktyb
Password: eszidvezzeem
Password: fwxyebdmeyvi
Password: qwwfvhrsufly
Password: vgcbcspuqqcq
Password: ruzdhacturnj
Passwords
With extreme binding:
auto r = bind(uniform_int_distribution<char>('a','z'),
ranlux24((random_device())()));
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += r();
cout << "Password: " << pw << '\n';
}
Password: nhkheoxhjdbo
Password: dmoveccxnvul
Password: rwucxlhblsxi
Password: hwhfzmaslfbc
Password: ksfvsocduflp
Password: nvxdqguxejmk
Password: rcwteuhqhhxk
Password: phpvadklirse