CS253 Bit Manipulation
Bases
- We’re used to base 10, because we have that many toes. Really.
- There’s nothing intrinsically good or bad about base 10.
- Vulcans use base 12. Well, in my head-canon they do.
- In base N, the digits are 0…N−1.
- Base 8 has no digit 8, just as base 10 has no digit ⑩.
- Digits after 9 are usually letters: A…Z or a…z.
Writing numbers (mathematics)
- In mathematics, numbers are written in base 10 by default.
- A subscript indicates an explicit base:
27₈ + 43₆ = 32₁₆
Writing numbers (C++ source code)
- In source code, if a number starts with 0b or 0B, it’s binary (base 2).
- If a number starts with 0x or 0X, it’s hexadecimal (base 16), usually called “hex”.
- “Hexa” (6) is Greek, whereas “decimal” (10) is Latin. Oh, well.
- Otherwise, if a number starts with 0, it’s octal (base 8).
- Which implies that plain 0 is octal, not decimal!
- Otherwise, it’s decimal (base 10).
- “Decimal” doesn’t mean “digits after the decimal point” in this context.
- Write to maximize readability: years in decimal, bitmasks in binary, octal, or hexadecimal.
cerr << 0b11 << ' ' << 0B101 << ' ' << 0x1a << ' ' << 0XfF << ' ' << 012 << ' ' << 34;
3 5 26 255 10 34
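The same prefix rules can be applied to text read at run time. A minimal sketch, assuming a hypothetical helper named parse_auto: passing base 0 to std::stoi asks it to infer octal or hexadecimal from the prefix.

```cpp
#include <string>

// Hypothetical helper: parse text using the prefix rules for octal and
// hex. Base 0 tells std::stoi to pick the base from the prefix:
// "0x" or "0X" means hex, a leading "0" means octal, and anything
// else is decimal.
int parse_auto(const std::string &text) {
    return std::stoi(text, nullptr, 0);
}
```

For example, parse_auto("0644"), parse_auto("0x1a4"), and parse_auto("420") all return the same int.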
Place notation
Consider an ordinary, three-digit base 10 number.
It has a hundreds place, a tens place, and a ones place.
You know this so well that you’ve forgotten it.
So, 789₁₀ means:
- 7⋅10² + 8⋅10¹ + 9⋅10⁰
- 7⋅100 + 8⋅10 + 9⋅1
- 700 + 80 + 9
- 789
Place notation
Now, consider a base 8 number, such as 375₈.
This means:
- 3⋅8² + 7⋅8¹ + 5⋅8⁰
- 3⋅64 + 7⋅8 + 5⋅1
- 192 + 56 + 5
- 253
Place notation
A binary number, such as 10110₂, means:
- 1⋅2⁴ + 0⋅2³ + 1⋅2² + 1⋅2¹ + 0⋅2⁰
- 1⋅16 + 0⋅8 + 1⋅4 + 1⋅2 + 0⋅1
- 16 + 0 + 4 + 2 + 0
- 22
Place notation
Of course, in practice, we wouldn’t bother writing down
the terms with zero coefficients, since they contribute only
zeroes to the sum.
A binary number, such as 10110₂, means:
- 1⋅2⁴ + 1⋅2² + 1⋅2¹
- 1⋅16 + 1⋅4 + 1⋅2
- 16 + 4 + 2
- 22
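Place notation can be evaluated mechanically, one digit at a time. A minimal sketch, assuming hypothetical helpers digit_value and from_digits, and assuming a character set (such as ASCII) where the letters are contiguous:

```cpp
#include <stdexcept>
#include <string_view>

// Hypothetical helper: value of one digit character, or -1 if invalid.
// Assumes letters are contiguous in the character set, as in ASCII.
constexpr int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

// Evaluate place notation left to right: each step multiplies the
// running total by the base, then adds the next digit.
constexpr int from_digits(std::string_view digits, int base) {
    int total = 0;
    for (char c : digits) {
        const int d = digit_value(c);
        if (d < 0 || d >= base)
            throw std::invalid_argument("digit out of range for base");
        total = total * base + d;
    }
    return total;
}
```

Multiplying the running total by the base at each step is the same sum as the slides, just factored: ((3⋅8) + 7)⋅8 + 5 = 3⋅8² + 7⋅8¹ + 5⋅8⁰.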
I’m thinking of a number…
- First we need a number for examples.
- How about 0644?
- The 0 in the beginning just means it is in octal.
- What is it though?
- In C++, it’s an int, just as 42 is.
0644 in octal… okayyyyyy…
- What is it in different bases?
- So, first what does 644 mean in octal?
- 6 in the 64’s place (64 = 8²)
- 4 in the 8’s place (8 = 8¹)
- 4 in the 1’s place (1 = 8⁰)
What is 0644 in binary?
The hardest way is to convert to decimal, then convert to binary.
- Hard way:
- 6⋅8² = 384
- 4⋅8¹ = 32
- 4⋅8⁰ = 4
- Total = 420, or 256+128+32+4, or 110100100₂
- Easy way: Break it into pieces of 3 bits each
- Why 3? 8¹ = 2³, that’s why!
- One digit in octal is three digits in binary!
- 110 100 100 Look! 110₂ = 6 and 100₂ = 4
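The easy way is simple enough to code directly. A minimal sketch, assuming a hypothetical helper named octal_to_binary that takes the octal digits as text, without the leading 0:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: expand each octal digit into exactly three bits.
std::string octal_to_binary(const std::string &octal) {
    std::string bits;
    for (char c : octal) {
        if (c < '0' || c > '7')
            throw std::invalid_argument("not an octal digit");
        const int d = c - '0';
        bits += (d & 4) ? '1' : '0';   // 4's place
        bits += (d & 2) ? '1' : '0';   // 2's place
        bits += (d & 1) ? '1' : '0';   // 1's place
    }
    return bits;
}
```

No arithmetic on the whole number is needed; each digit is converted independently of its neighbors.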
What is 0644 in octal?
What is 0644 in decimal?
- Octal to decimal
- Not hard, we just need to do the math.
- It’s just arithmetic.
- 6⋅8² + 4⋅8¹ + 4⋅8⁰
- 6⋅64 + 4⋅8 + 4⋅1
- 384 + 32 + 4
- 420
What is 0644 in hexadecimal?
- Hard way:
- Hexadecimal is base 16, so the places are 4096, 256, 16, 1
- (16³, 16², 16¹, 16⁰)
- How many of each?
- 1⋅256 + 10⋅16 + 4⋅1
- So, it’s 0x1a4
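The hard way works for any base. One common way to do it is repeated division, sketched here with a hypothetical helper named to_base, assuming the base is between 2 and 36:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: write value in the given base by repeated
// division. Assumes 2 <= base <= 36. Each remainder is the next digit,
// starting from the ones place, so the digits come out in reverse order.
std::string to_base(unsigned value, unsigned base) {
    const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    std::string out;
    do {
        out += digits[value % base];
        value /= base;
    } while (value != 0);
    std::reverse(out.begin(), out.end());
    return out;
}
```

For 420 in base 16: 420 % 16 = 4, then 26 % 16 = 10 (a), then 1 % 16 = 1, which reversed is 1a4.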
What is 0644 in hexadecimal?
- Easy way:
- We already know it is 110100100 in binary.
- 16¹ = 2⁴
- Each hex digit is 4 binary digits.
- 1 1010 0100
- 1 is easy: 1
- 1010 is 8+2 = 10 = a
- 0100 is 4 … so 4
- 0x1a4
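The grouping can be sketched in code as well. This assumes a hypothetical helper named binary_to_hex whose input contains only the characters '0' and '1':

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert a string of '0'/'1' characters to hex,
// four bits at a time. Assumes the input contains only '0' and '1'.
std::string binary_to_hex(std::string bits) {
    // Pad on the left with zeros so the length is a multiple of 4.
    while (bits.size() % 4 != 0)
        bits.insert(bits.begin(), '0');

    const char hex_digits[] = "0123456789abcdef";
    std::string hex;
    for (std::size_t i = 0; i < bits.size(); i += 4) {
        int nibble = 0;
        for (std::size_t j = 0; j < 4; j++)
            nibble = nibble * 2 + (bits[i + j] - '0');
        hex += hex_digits[nibble];
    }
    return hex;
}
```

The padding matters: the groups of four are counted from the right, so 110100100 becomes 0001 1010 0100.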
Why doesn’t the easy way always work?
It’s all about the ratios:
From | To | Ratio | Difficulty |
Hexadecimal | Binary | 16 = 2^4 | Easy |
Hexadecimal | Octal | 16 = 8^(4/3) | Not so bad |
Hexadecimal | Decimal | 16 = 10^1.20412 | Hard |
Decimal | Octal | 10 = 8^1.1073 | Hard |
Decimal | Binary | 10 = 2^3.3219 | Hard |
Octal | Binary | 8 = 2^3 | Easy |
Everything’s simple, unless stupid decimal is involved.
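The exponents in the ratio column are logarithms: the exponent r with from = to^r is log(from) / log(to). A minimal sketch, assuming a hypothetical helper named digit_ratio:

```cpp
#include <cmath>

// Hypothetical helper: the exponent r such that from == to^r, which is
// how many base-`to` digits one base-`from` digit is worth.
double digit_ratio(double from, double to) {
    return std::log(from) / std::log(to);
}
```

A whole-number ratio (4 for hex to binary, 3 for octal to binary) is what makes digit-by-digit conversion possible. The results are floating-point values, so compare them with a tolerance rather than with ==.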
So, our number is:
110100100₂ or 644₈ or 420₁₀ or 1a4₁₆
- What’s the difference?!
- To a human? It’s the difference between a number and a headache.
- To the computer? Nothing!
The computer doesn’t care!
cout << 0b10000 << ' ' << 020 << ' ' << 16 << ' ' << 0x10 << '\n';
16 16 16 16
- It stores it all in binary.
- They all look the same in binary.
- They’re all just ints.
- In fact, even if it DID care, it can’t tell the difference anyway!
- If a memory cell contains 0001 1011 0000 1011, was it binary,
octal, decimal, or hex to begin with?
How does it know?
It doesn’t know—you have to tell it!
int x = 060 - 0x00000000000000000000006;
cout << x << '\n'
<< oct << x << '\n'
<< hex << x << '\n'
<< dec << x << '\n';
42
52
2a
42
- Alas, there is no bin.
- I/O manipulators such as hex and oct are “sticky”; they persist until changed.
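One workaround for the missing binary manipulator is std::bitset, which can render its bits as text. A minimal sketch, assuming a hypothetical helper named to_binary16; the width of 16 is an arbitrary choice for illustration:

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: the 16 low-order bits of value as text.
// The width of 16 is an arbitrary choice for this example.
std::string to_binary16(unsigned value) {
    return std::bitset<16>(value).to_string();
}
```

A std::bitset can also be sent straight to a stream with <<, and it ignores the sticky hex and oct settings.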
Refresher
C++ bit-manipulation tools:
- bitwise and (&), or (|), exclusive-or (^)
- shift left (<<), shift right (>>)
- assignment operator versions (&=, |=, ^=, <<=, >>=)
- octal constant (0377)
- hexadecimal constant (0x3FF)
- binary constant (0b1010110111)
- digit separator: 0b1'010'110'111, 0x0123'4567'89ab'cdef, 123'456'789
Precedence matters
cout << (1|2) << '\n';
3
However, the binary bitwise operators have lower precedence than <<:
cout << 1|2 << '\n';
c.cc:1: error: no match for 'operator|' in '
std::cout.std::basic_ostream<char>::operator<<(1) | (2 << 10)' (operand
types are 'std::basic_ostream<char>' and 'int')
It’s trying to evaluate (cout << 1) | (2 << '\n'); the 10 in the error message is the character code of '\n'.
Setting a bit
int n = 32;
cout << hex << n << '\n'
<< hex << (n|02) << '\n';
20
22
Or, using assignment and binary literals:
int n = 32;
cout << hex << n << '\n';
n |= 0b10;
cout << hex << n << '\n';
20
22
Clearing bits
cout << hex << 126 << '\n'
<< hex << (126 & ~0xF) << '\n';
7e
70
Testing bits
for (char c='A'; c<'N'; c++)
cout << ((c & 0x01) ? "Odd: " : "Even: ")
<< "'" << c << "' (" << int(c) << ")\n";
Odd: 'A' (65)
Even: 'B' (66)
Odd: 'C' (67)
Even: 'D' (68)
Odd: 'E' (69)
Even: 'F' (70)
Odd: 'G' (71)
Even: 'H' (72)
Odd: 'I' (73)
Even: 'J' (74)
Odd: 'K' (75)
Even: 'L' (76)
Odd: 'M' (77)
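The three idioms above (set with |, clear with & and ~, test with &) can be wrapped in small functions. A minimal sketch with hypothetical names, re-checking the slide examples at compile time; the character tests assume ASCII:

```cpp
// Hypothetical helpers; `mask` selects the bits to operate on.
constexpr unsigned set_bits(unsigned value, unsigned mask) {
    return value | mask;        // 1 bits in mask become 1
}

constexpr unsigned clear_bits(unsigned value, unsigned mask) {
    return value & ~mask;       // 1 bits in mask become 0
}

constexpr bool any_set(unsigned value, unsigned mask) {
    return (value & mask) != 0; // true if any masked bit is 1
}

// The examples from the slides, checked at compile time.
static_assert(set_bits(32, 0b10) == 0x22, "setting a bit");
static_assert(clear_bits(126, 0xF) == 0x70, "clearing bits");
static_assert(any_set('A', 0x01), "'A' (65) is odd");
static_assert(!any_set('B', 0x01), "'B' (66) is even");
```

Note the parentheses in any_set: == and != bind tighter than &, so value & mask != 0 would mean value & (mask != 0).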
Shifting
int n = 42;
cout << oct << n << '\n';
n >>= 1;
cout << n << '\n';
n >>= 2;
cout << n << '\n';
n >>= 1;
cout << n << '\n';
n >>= 1;
cout << n << '\n';
n >>= 1;
cout << n << '\n';
52
25
5
2
1
0
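Shifting pairs naturally with masking: shift the wanted bits down to the bottom, then mask off everything else. A minimal sketch, assuming a hypothetical helper named octal_digit:

```cpp
// Hypothetical helper: the octal digit at position `place`,
// where place 0 is the 1's place, 1 is the 8's place, and so on.
// Assumes 3 * place is less than the width of unsigned.
constexpr unsigned octal_digit(unsigned value, unsigned place) {
    return (value >> (3 * place)) & 07;
}
```

Each octal digit is three bits, so the shift amount is 3 per place and the mask is 07 (binary 111). With 0644, places 2, 1, and 0 give 6, 4, and 4.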
Puzzle
int a = 1<<4;
cout << a << '\n';
cout << 1<<4 << '\n';
16
14
Why does using a variable give a different result?
Parsing! << is left-associative, so cout << 1<<4 parses as (cout << 1) << 4 and not as cout << (1 << 4).
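The two parses can be compared side by side by writing to a string stream instead of cout. A minimal sketch; the function names are hypothetical:

```cpp
#include <sstream>
#include <string>

// Writes 1 and then 4 to the stream: (os << 1) << 4.
std::string without_parens() {
    std::ostringstream os;
    os << 1<<4;
    return os.str();
}

// Shifts first, then writes the result to the stream.
std::string with_parens() {
    std::ostringstream os;
    os << (1<<4);
    return os.str();
}
```

When in doubt, parenthesize any shift that appears inside an output statement.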
Badness
The shift amount must be ≥ 0 and < word size in bits. That’s 0…31 for a
32-bit integer. Shifting by a negative amount, or by the word size or more, is undefined behavior.
static int con67 = 67, conm1 = -1;
cout << (1<<con67) << '\n';
cout << (0b1000 << conm1) << '\n';
8
0
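Using static fooled the optimizer, so the shifts happened at run time. The results shown are just what one machine produced; the behavior is undefined.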
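One defensive option is to validate the shift amount before shifting. A minimal sketch, assuming a hypothetical helper named checked_shl that works on unsigned values and reports a bad amount with an empty std::optional:

```cpp
#include <limits>
#include <optional>

// Hypothetical helper: shift left only when the amount is valid,
// otherwise return an empty optional instead of invoking undefined
// behavior.
constexpr std::optional<unsigned> checked_shl(unsigned value, int amount) {
    constexpr int width = std::numeric_limits<unsigned>::digits;
    if (amount < 0 || amount >= width)
        return std::nullopt;
    return value << amount;
}
```

Using unsigned here is deliberate: bits shifted off the top of an unsigned value are simply discarded, so the only thing left to check is the amount.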
Badness
Java has two different right-shift operators:
- >> copies the sign bit
- >>> always shifts in a zero bit
System.out.println(42 >> 1);
System.out.println(-1 >> 1);
System.out.println(42 >>> 1);
System.out.println(-1 >>> 1);
21
-1
21
2147483647
In C++ before C++20, when shifting negative values right, it is
implementation-defined whether or not the sign bit is propagated (since C++20, it always is):
cout << (-1 >> 1) << '\n';
-1
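To get the effect of Java’s >>> in C++, convert to an unsigned type before shifting. A minimal sketch, assuming a hypothetical helper named logical_shr and a shift amount in 0…31:

```cpp
#include <cstdint>

// Hypothetical helper: right shift that always shifts in zero bits,
// like Java's >>>. Converting to unsigned first makes the result
// independent of how the compiler treats the sign bit.
// Assumes 0 <= amount <= 31.
constexpr std::uint32_t logical_shr(std::int32_t value, int amount) {
    return static_cast<std::uint32_t>(value) >> amount;
}
```

With this, logical_shr(-1, 1) gives 2147483647, matching the Java output above.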
- Most C++ programs use char variables to store raw bytes.
- The C++17 byte datatype is designed to replace that.
- A byte is a collection of bits, not a small integer.
- Arithmetic operations (+ - * /) don’t work on a byte.
- Bit operations (& | ^ << >>) do work.
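A short sketch of std::byte in use, with hypothetical helper names. The type lives in <cstddef>, and std::to_integer is the explicit way back to a number:

```cpp
#include <cstddef>

// Hypothetical helper: keep only the low four bits of a byte.
constexpr std::byte low_nibble(std::byte b) {
    return b & std::byte{0x0F};
}

// Hypothetical helper: move the low four bits into the high four.
constexpr std::byte to_high_nibble(std::byte b) {
    return low_nibble(b) << 4;
}

// std::to_integer converts explicitly when a number is needed.
constexpr int as_int(std::byte b) {
    return std::to_integer<int>(b);
}
```

Writing b + std::byte{1} would not compile, which is the point: the type keeps bit patterns and numbers from being mixed by accident.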