CS253 Random Numbers
Philosophy
“Computers can’t do anything truly random. Only a person can do that.”
- Stop trying to prove your superiority.
- If you believe that you have something special that distinguishes you
from machines, you’re talking religion, not CS.
- My dog is pretty random.
- You’re somewhat predictable.
- An online rock-paper-scissors
program beats people 60% of the time over more than a million games,
because people are lousy at being random.
Old Stuff
There are several C random number generators,
of varying degrees of standardization:
They still work ok, but avoid them for new C++ code.
They mix up generation and distribution something terrible.
Traditional Method
Traditional random number generators work like this:
unsigned long n = 1;
for (int i=0; i<5; i++) {
n = n * 16807 % 2147483647;
cout << n << '\n';
}
16807
282475249
1622650073
984943658
1144108930
- It’s fast, simple, and good enough for many tasks. However …
- What happens if n is zero?
- What number always follows 16807?
- How many possible states does this RNG
(Random Number Generator) have?
Overview
- In C++, random numbers have:
- Generators
Generate uniformly-distributed random integers,
typically zero or one to a big number.
- Distributions
Take uniformly-distributed random integers, and transform them into
other distributions with different ranges.
- Examples:
- Picking a card (uniform, but discrete)
- Rolling 3d6 (bell-shaped, but discrete)
- Human height (bell-shaped, continuous)
Generators
Default Engine
Define a random-number generator, and use () to generate a number.
This is not a function call, because gen is an object, not a function. It’s operator().
That sequence looks familiar …
#include <random>
#include <iostream>
using namespace std;
int main() {
default_random_engine gen;
for (int i=0; i<5; i++)
cout << gen() << '\n';
}
16807
282475249
1622650073
984943658
1144108930
I won’t bother with the #includes in subsequent examples.
Mersenne Twister
- Here’s a different, 64-bit generator.
- Use .min() and .max() to find out the range of a given generator.
mt19937_64 gen;
cout << "min=" << gen.min() << '\n'
<< "max=" << gen.max() << "\n\n";
for (int i=0; i<5; i++)
cout << gen() << '\n';
min=0
max=18446744073709551615
14514284786278117030
4620546740167642908
13109570281517897720
17462938647148434322
355488278567739596
Ranges
Not all generators have the same range:
mt19937_64 mt;
minstd_rand mr;
cout << "mt19937_64: " << mt.min() << "…" << mt.max() << '\n'
<< "minstd_rand " << mr.min() << "…" << mr.max() << '\n';
mt19937_64: 0…18446744073709551615
minstd_rand 1…2147483646
Hey, look! Zero is not a possible return value for minstd_rand.
Save/Restore
A generator can save & restore state to an I/O stream:
ranlux24 gen;
cout << gen() << ' ';
cout << gen() << endl;
ofstream("state") << gen;
system("wc -c state");
cout << gen() << ' ';
cout << gen() << '\n';
ifstream("state") >> gen;
cout << gen() << ' ';
cout << gen() << '\n';
15039276 16323925
209 state
14283486 7150092
14283486 7150092
endl! Isn’t that a sin? 😈 🔥
Not here: it flushes cout, so our numbers appear before the output of wc.
True randomness
random_device a, b, c;
cout << a() << '\n'
<< b() << '\n'
<< c() << '\n';
3155182047
3736403520
20593201
random_device is, ideally, truly random, and not pseudo-random.
- Intel computers have an RDRAND instruction.
- It might depend on random things like human typing intervals,
network packets arrival times, or radioactive decay.
- If true randomness isn’t available, it resorts to pseudo-random numbers.
- It could pause waiting for randomness to become available.
- Use it sparingly.
Cloudflare
The hosting service Cloudflare uses a unique source of randomness.
Seeding
minstd_rand a, b, c(123);
cout << a() << ' ' << a() << '\n';
cout << b() << ' ' << b() << '\n';
cout << c() << ' ' << c() << '\n';
48271 182605794
48271 182605794
5937333 985676192
- Great—we can “seed” the random number generator with a value.
- This way, we can reproduce our pseudo-random sequences.
- Consider random testing: we want to be able to reproduce the sequence
if we find an error.
- How to choose the random seed?
- It should probably be … random.
Seed with process ID
auto seed = getpid();
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
2113660642
1564781012
101914321
1768637361
681666346
- You can seed with your process id.
- OK for casual use, but the seed is easily guessed.
- Process IDs are usually 15- or 16-bit quantities, so there are
generally only 32768 or 65536 of them.
Somebody could easily try them all.
Seed with time
// seconds since start of 1970
auto seed = time(nullptr);
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1707019592
595190042
1382287816
2098253846
892673158
- You can seed with a time-related value.
- Two runs may occur within the same second,
and so produce identical random sequences.
- OK for casual use, but the seed is easily guessed.
- There are only 86,400 seconds in a day.
Somebody could easily try them all.
Y2038
int biggest = 0x7fffffff;
time_t epoch = 0,
now = time(nullptr),
end = biggest,
endp1 = biggest + 1; // int overflow: formally undefined, wraps on this machine
cout << "epoch:" << setw(12) << epoch << ' ' << ctime(&epoch);
cout << "now: " << setw(12) << now << ' ' << ctime(&now);
cout << "end: " << setw(12) << end << ' ' << ctime(&end);
cout << "end+1:" << setw(12) << endp1 << ' ' << ctime(&endp1);
epoch: 0 Wed Dec 31 17:00:00 1969
now: 1732267361 Fri Nov 22 02:22:41 2024
end: 2147483647 Mon Jan 18 20:14:07 2038
end+1: -2147483648 Fri Dec 13 13:45:52 1901
I hope that nobody’s still using 32-bit signed time representations by then!
Seed with more accurate time
Nanoseconds make more possibilities:
auto seed = chrono::high_resolution_clock::now()
.time_since_epoch().count();
cout << "Seed: " << seed << '\n';
minstd_rand a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
Seed: 1732267361325701588
300338834
2141238764
1348446934
652610544
725951581
- There are 86,400,000,000,000 nanoseconds in a day.
Better Seeding
- Many generators have more than 32 or 64 bits of state.
- Therefore, you can seed them with more than 32 or 64 bits.
- If you’re doing something very important, and somebody guessing
your seed, and hence predicting your sequence, would be catastrophic:
- on-line poker
🂺 🂻 🂽 🂾 🂱
- encryption of military communications
⚔ 🔫 💣 🥆 ☢
- encrypted email re: extra-marital affairs 💔
- That’s beyond the scope of this discussion.
Seed with random_device
random_device gen;
auto seed = gen();
minstd_rand0 a(seed);
for (int i=0; i<5; i++)
cout << a() << '\n';
1341145264
640093136
1299748929
676592419
562875268
You can seed with random_device, if you know that it’s truly random.
Not good enough.
- Great, so we know how to generate a number 1…2,147,483,646
or perhaps 0…18,446,744,073,709,551,615
- How often do we want to do that?
- Sometimes, we want integers with different ranges.
- Or, perhaps we want floating-point numbers.
- Maybe spread out linearly, or a bell-shaped curve, Poisson, etc.
- This is a job for a distribution.
Distributions
- Uniform
- Bernoulli (yes/no) trials
- Piecewise distributions
- Related to Normal distribution
- Rate-based distributions
uniform_int_distribution
auto seed = random_device()(); //❓❓❓
mt19937 gen(seed);
uniform_int_distribution<int> dist(1,6);
for (int y=0; y<10; y++) {
for (int x=0; x<40; x++)
cout << dist(gen) << ' ';
cout << '\n';
}
3 4 5 4 2 6 2 4 2 2 3 5 3 6 3 6 6 6 6 3 1 1 3 1 5 2 2 2 5 6 1 2 1 5 4 6 3 1 3 2
1 4 4 2 5 1 5 1 6 6 6 2 6 4 4 1 5 1 4 1 3 1 4 3 6 5 3 6 3 1 1 6 1 5 1 5 5 4 3 6
6 1 2 4 2 4 4 4 6 6 5 6 2 5 6 4 5 6 2 4 1 4 3 5 2 4 1 1 1 4 6 5 1 5 6 1 3 2 6 2
2 6 3 4 1 3 6 2 2 3 4 4 3 6 6 6 1 4 4 1 2 1 6 2 3 4 4 1 5 5 4 4 5 4 5 1 4 3 2 1
2 1 5 2 5 5 1 6 1 1 6 6 6 2 2 1 4 2 4 3 1 2 2 5 2 6 4 3 4 2 4 2 1 1 6 1 6 5 1 3
5 6 1 1 6 4 2 5 5 2 2 3 3 5 5 1 5 4 1 5 1 2 3 1 4 1 6 3 5 4 4 6 5 1 3 2 5 3 4 4
3 2 4 6 2 4 5 4 5 4 4 2 1 1 2 5 1 4 2 3 3 6 4 4 6 4 5 3 1 1 2 4 3 5 4 3 6 2 1 3
1 1 5 3 2 5 1 3 2 4 5 4 4 6 3 2 1 6 1 1 1 5 5 6 2 4 5 2 5 5 4 5 5 5 6 1 2 4 4 4
3 1 2 1 3 4 3 5 4 6 3 5 1 1 5 3 2 5 4 4 1 3 1 3 2 4 6 3 5 1 5 2 3 2 4 2 2 2 1 3
3 2 4 6 2 6 3 1 3 3 6 2 3 4 6 2 1 2 3 5 5 5 3 5 3 1 5 1 1 6 1 4 2 2 1 4 4 2 3 5
uniform_real_distribution
auto seed = random_device()();
ranlux48 gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << dist(gen) << ' ';
cout << '\n';
}
19.843 19.636 19.850 23.660 22.603 23.162 24.985 24.129 22.510 18.429
24.353 23.016 22.564 19.598 20.245 22.829 19.947 20.390 24.521 20.296
24.000 20.162 19.691 20.387 19.878 20.898 24.466 18.711 19.940 18.677
20.672 22.269 24.274 24.876 22.286 24.759 18.111 22.630 22.487 21.549
20.507 24.812 18.580 23.762 20.616 24.099 21.612 20.038 18.729 23.885
24.967 24.640 18.122 20.847 22.167 22.492 21.877 21.046 22.678 22.644
18.501 19.708 20.813 22.044 24.119 21.253 18.532 24.689 24.209 18.354
22.050 18.555 19.339 21.186 24.185 22.300 19.477 24.525 20.312 23.611
21.649 18.378 19.038 21.872 23.756 19.086 18.057 23.390 21.587 23.941
22.350 20.377 22.331 20.364 23.229 23.632 21.366 24.357 21.986 23.117
OMG, what’s that <> doing there?
Binding
auto seed = random_device()();
minstd_rand gen(seed);
uniform_real_distribution<> dist(18.0, 25.0);
auto r = bind(dist, gen);
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << r() << ' ';
cout << '\n';
}
20.115 22.107 18.819 18.697 19.237 24.757 18.124 22.164 19.535 20.248
19.031 22.660 24.986 21.730 23.842 22.643 21.970 24.107 24.290 21.489
23.865 18.371 22.831 18.097 19.918 23.211 19.767 23.378 22.593 19.555
18.152 24.875 24.128 20.592 18.904 24.444 23.533 23.371 22.488 21.151
22.886 23.518 20.287 18.271 22.562 20.128 23.214 24.471 22.964 19.603
24.238 22.030 22.095 24.853 18.420 19.021 18.027 20.469 20.943 23.136
21.724 19.145 24.507 23.689 24.133 23.336 18.175 21.472 24.668 21.948
22.319 24.095 23.464 22.473 18.783 23.595 23.039 20.308 20.970 18.037
20.502 24.790 24.322 21.772 18.524 24.475 18.124 24.081 24.594 18.121
18.790 21.085 20.131 23.296 24.021 22.297 20.616 18.253 23.755 24.231
Binding with temporaries
auto seed = random_device()();
auto r = bind(uniform_real_distribution<>(18.0, 25.0), mt19937(seed));
for (int y=0; y<10; y++) {
for (int x=0; x<10; x++)
cout << fixed << setprecision(3) << r() << ' ';
cout << '\n';
}
18.566 23.586 23.109 20.600 18.352 22.019 20.380 20.366 18.823 22.895
19.450 22.267 18.489 20.143 22.312 20.863 23.109 20.645 19.314 20.127
23.390 21.906 21.102 20.648 21.759 24.563 18.943 18.351 23.540 23.924
22.919 22.178 20.589 18.645 19.661 22.803 23.312 23.864 20.839 18.564
21.771 18.928 24.086 18.881 20.769 23.737 19.772 24.154 23.183 23.257
23.465 22.357 18.587 21.901 20.695 18.439 19.672 21.366 21.206 24.228
23.019 20.153 22.246 21.530 23.072 24.230 23.960 21.267 24.177 20.081
21.916 23.811 20.739 23.993 24.975 21.460 22.720 22.471 24.592 21.820
20.587 24.044 18.977 23.030 21.421 18.623 21.486 20.935 22.687 23.622
21.695 22.896 22.744 23.634 24.755 24.883 23.169 22.277 23.408 20.036
Boolean Values
Yield true 42% of the time:
auto seed = random_device()();
constexpr int nrolls = 1'000'000;
auto r = bind(bernoulli_distribution(0.42), knuth_b(seed));
int count=0;
for (int i=0; i<nrolls; i++)
if (r())
count++;
cout << "true: " << count*100.0/nrolls << "%\n";
true: 41.9302%
Histogram
auto seed = random_device()();
mt19937_64 gen(seed);
normal_distribution<> dist(21.5, 1.5);
auto r = bind(dist, gen);
map<int,int> tally;
for (int i=0; i<10000; i++)
tally[r()]++;
for (auto p : tally)
cout << p.first << ": " << string(p.second/100,'#') << '\n';
15:
16:
17:
18: ###
19: ###########
20: ####################
21: #########################
22: #####################
23: ##########
24: ###
25:
26:
27:
Passwords
random_device rd;
auto seed = rd();
ranlux24 gen(seed);
uniform_int_distribution<char> dist('a','z');
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += dist(gen);
cout << "Password: " << pw << '\n';
}
Password: rzxwdvdrhepl
Password: rcanhtuqvoph
Password: hmvluzodiuuw
Password: cmmbdkkqqgmn
Password: yvorftnmfuvy
Password: tpsjyvliwmry
Password: ihswhsbijahp
Password: odgmxhmlxwmi
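Even though we’re using uniform_int_distribution, which has int right there in its name, it’s uniform_int_distribution<char>, so we get characters.
Think of them as 8-bit integers that display differently.
Caution: the standard only promises that this template works for short, int, long, long long, and their unsigned versions. char happens to work with the library used here, but other libraries may refuse to compile it.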
Passwords
With binding:
auto seed = random_device()();
ranlux24 gen(seed);
uniform_int_distribution<char> dist('a','z');
auto r = bind(dist, gen);
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += r();
cout << "Password: " << pw << '\n';
}
Password: dfwyxdwgocwv
Password: yzjgrlysfaku
Password: djclajjgjeba
Password: nzggvdkrbyjh
Password: ohgdrcpptgxt
Password: nnvouwhioupc
Password: zwgqrdnbdxbw
Password: itkqhhnidlci
Passwords
With extreme binding:
auto r = bind(uniform_int_distribution<char>('a','z'),
ranlux24((random_device())()));
for (int y=0; y<8; y++) {
string pw;
for (int x=0; x<12; x++)
pw += r();
cout << "Password: " << pw << '\n';
}
Password: stzoynthcwda
Password: kxvpiatenngg
Password: pihebdqjetzf
Password: vjvgfzaqebdy
Password: brpjnhoiqvkx
Password: rikkasighnkn
Password: xhlkexctodsw
Password: qllcewoqjzur