CS253 Algorithms
First published computer algorithm, by Ada Lovelace
Definition
- In Computer Science, algorithm generally means
“how to do something”.
- In C++, algorithm refers to templatized functions
from the <algorithm> header file.
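- There are many algorithms in <algorithm>. We will focus
on a few: count(), count_if(), find(), find_if(), copy(), copy_if(),
replace(), replace_if(), transform(), sort(), and unique().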
Arguments
- Algorithms generally take their input from half-open
iterator ranges, which always (?) come first.
- For output, algorithms take a single iterator, which says
where the output starts.
- A second iterator indicating the end of the output is not required,
since the length of the output is determined by the size of the input,
possibly filtered in some way, as in copy_if().
- Additional arguments may specify a value to look for,
a predicate to select items, etc.
vector<int> v = {1, 1,2, 1,2,3, 1,2,3,4};
cout << count(v.begin(), v.end(), 2) << '\n'
<< count(v.begin(), v.end(), 3.0) << '\n';
3
2
- count() counts how many times something occurs. 🤯
- The first two arguments form a half-open interval, which is exactly
what .begin() and .end() give, since .end() “points” one
past the last element.
- Each element in the range is compared to the third argument,
which does not have to be the same type as the items
in the half-open range.
- The half-open range can be two iterators into any sort
of container. As long as the first iterator can be incremented,
and compared to the second iterator, and assuming that the first
iterator will eventually become equal to the second, it’s ok.
- Pointers are iterators, so pointers into C arrays, C strings, or
dynamic memory are ok.
bool small(int n) {
return n < 5;
}
int main() {
multiset<int> ms = {3,1,4,1,5,9,2,6,5,3,5,8,9,7,9};
cout << count_if(ms.begin(), ms.end(), small) << '\n'
<< count_if(ms.begin(), ms.end(), [](int n){return n>=8;}) << '\n';
}
6
4
count_if() is like count(), except it takes a predicate
(a function that returns a bool) instead of a target value.
- The find() algorithm searches a half-open range for a value.
- If it finds the value, it returns:
- not an index to the value found ✘
- not a pointer to the value found ✘
- an iterator that “points” to the value found. ✔️
- What type of iterator? The same type that you gave it to
indicate the range.
- If it can’t find the value, it returns:
- not a 0 or −1 ✘
- not a null pointer ✘
- not a pointer ✘
- the second iterator given; the end of the half-open interval. ✔️
- OK, technically, if you give find() raw pointers, then it does
return the same type, namely, a pointer.
vector<int> primes{ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97 };
auto it = find(primes.begin(), primes.end(), 13);
if (it == primes.end())
cout << "not found\n";
else
cout << "Found "<< *it << " at " << it-primes.begin() << '\n';
Found 13 at 5
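find() here is an algorithm (a free function), not a method! Some
containers, such as set and map, have a .find() method, which is
preferred if it exists, since it can use the container’s internal
structure instead of scanning every element.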
vector<int> primes{ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97 };
int *start = &primes[3], *end = &primes[20];
cout << "Search the interval [" << *start << ',' << *end << ")\n"
<< "It includes " << *start << ", but not " << *end << '\n';
int *p = find(start, end, 13);
cout << "Found " << *p << " at " << p-&primes[0] << '\n';
p = find(start, end, 89);
cout << "Nope " << *p << " at " << p-&primes[0] << '\n';
Search the interval [7,73)
It includes 7, but not 73
Found 13 at 5
Nope 73 at 20
bool pred(int n) {
return n > 50; // Should find 53
}
int main() {
set<int> primes{ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97 };
if (auto it = find_if(primes.begin(), primes.end(), pred); it == primes.end())
cout << "Failure\n";
else
cout << "Found " << *it << '\n';
}
Found 53
- Note the new if statement form: if (init; condition)
- find_if() is like find(), except that it takes a predicate
instead of a target value.
string str = "bonehead";
set<char> alpha = {'P', 'D', 'Q'};
copy(alpha.begin(), alpha.end(), str.begin());
cout << str << '\n';
DPQehead
string alpha = "abcdefghijklmnopqrstuvwxyz";
string initials = "JRA";
copy(initials.begin(), initials.begin()+2, alpha.begin()+20);
cout << alpha << '\n';
abcdefghijklmnopqrstJRwxyz
The iterator arguments don’t have to be just .begin() and .end().
- copy_if() is like copy(), except that it doesn’t copy everything.
- Instead it takes a predicate that determines whether or not
to copy a given element.
- A predicate is a function that returns bool.
- copy_if() takes three iterators and a predicate:
copy_if(source-begin, source-end, dest-begin, predicate)
Why isn’t there a dest-end?
It isn’t needed. source-begin and source-end say how much to copy.
First attempt
Let’s ensure that we know how to use copy() before moving on to copy_if():
string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar;
copy(foo.begin(), foo.end(), bar.begin());
cout << bar << "\n";
SIGSEGV: Segmentation fault
Why did that fail?
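There is no space allocated in bar. It is empty, so copy() has no
elements to overwrite, and writing through bar.begin() is undefined
behavior. Here, it crashed.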
You can’t allocate space by pretending that it exists.
Second attempt
string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar;
bar.resize(foo.size());
copy(foo.begin(), foo.end(), bar.begin());
cout << bar << "\n";
I have to ration my Diet Mountain Dew!
I have to ration my Diet Mountain Dew!
Third attempt
string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar;
bar.resize(foo.size());
// Don’t copy vowels:
copy_if(foo.begin(), foo.end(), bar.begin(),
[](char c){return "aeiouAEIOU"s.find(c)==string::npos;} );
cout << bar << "\n";
I have to ration my Diet Mountain Dew!
hv t rtn my Dt Mntn Dw!␀␀␀␀␀␀␀␀␀␀␀␀␀␀
Hooray, copy_if() worked!
Hey, what’s with those ␀ characters?
.resize() filled the string with '\0', which display as ␀ here.
Your terminal may simply ignore them and so not display them.
copy_if() overwrote the first part of bar, but bar.size() is unchanged.
Fourth attempt
string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar(foo.size(), 'X');
// Don’t copy vowels:
auto it = copy_if(foo.begin(), foo.end(), bar.begin(),
[](char c){return "aeiouAEIOU"s.find(c)==string::npos;} );
// Make bar the correct size:
bar.resize(it-bar.begin());
cout << bar << "\n";
I have to ration my Diet Mountain Dew!
hv t rtn my Dt Mntn Dw!
We resized bar to the correct size.
In-place
string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
auto it = copy_if(foo.begin(), foo.end(), foo.begin(),
[](char c){return "aeiouyAEIOUY"s.find(c)==string::npos;} );
// Make foo the correct size:
foo.resize(it-foo.begin());
cout << foo << "\n";
I have to ration my Diet Mountain Dew!
hv t rtn m Dt Mntn Dw!
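Copying from & to the same location happened to work here, but the
standard requires that the source and destination ranges of copy_if()
not overlap, so this is formally undefined behavior.
remove_if() is the algorithm designed for filtering a range in place.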
string fact = "Warren Harding’s middle name was Gamaliel.";
replace(fact.begin(), fact.end(), ' ', '_');
cout << fact << '\n';
Warren_Harding’s_middle_name_was_Gamaliel.
string fact = "Warren Harding’s middle name was Gamaliel.";
replace_if(fact.begin(), fact.end(),
[](char c) { return c=='o' || c=='a';}, '*');
cout << fact << '\n';
W*rren H*rding’s middle n*me w*s G*m*liel.
string name = "Joseph Robinette Biden Jr.";
string out;
transform(name.begin(), name.end(), out.begin(),
[](char c) { return c ^ 040; });
cout << out << '\n';
SIGSEGV: Segmentation fault
Oops! Didn’t allocate any memory in out!
string name = "Joseph Robinette Biden Jr.";
string out;
out.resize(name.size());
transform(name.begin(), name.end(), out.begin(),
[](char c) { return c ^ 040; });
cout << out << '\n';
jOSEPH␀rOBINETTE␀bIDEN␀jR␎
The sort() algorithm (from the header file <algorithm>) has two forms:
sort(begin, end);
sort(begin, end, comparison-object-or-function);
- Only a single half-open interval is given.
How do I sort() from container1 to container2?
copy() to container2, sort() the data there.
Containers
- Of course, some containers are intrinsically sorted.
- You might specify a comparison functor for those containers.
- You wouldn’t use the sort() algorithm on those containers.
- However, you might want to apply the sort() algorithm to
an unsorted container, such as a std::array, vector, string,
or even a C array.
- list has a sort() method.
Default comparison
string s = "Kokopelli";
sort(s.begin(), s.end());
cout << s << '\n';
Keiklloop
Explicit comparison
string s = "Kokopelli";
sort(s.begin(), s.end(), less<char>);
cout << s << '\n';
c.cc:2: error: expected primary-expression before ')' token
- Why doesn’t that work?
- What sort of thing does sort() want for the third argument?
- What sort of thing is less<>?
Explicit comparison
string s = "Kokopelli";
sort(s.begin(), s.end(), less<char>());
cout << s << '\n';
Keiklloop
Reverse sort
string s = "Kokopelli";
sort(s.begin(), s.end(), greater<char>());
cout << s << '\n';
poollkieK
Comparison function
bool lt(char a, char b) {
return a < b;
}
int main() {
string s = "Kokopelli";
sort(s.begin(), s.end(), lt);
cout << s << '\n';
}
Keiklloop
λ-function
string s = "Kokopelli";
sort(s.begin(), s.end(),
[](char a, char b){return a<b;});
cout << s << '\n';
sort(s.begin(), s.end(),
[](char a, char b){return a>b;});
cout << s << '\n';
Keiklloop
poollkieK
Case folding
bool lt(char a, char b) {
return toupper(a) < toupper(b);
}
int main() {
string s = "Kokopelli";
sort(s.begin(), s.end(), lt);
cout << s << '\n';
}
eiKklloop
Unique
If you want to avoid duplicates, then use unique().
bool lt(char a, char b) {
return toupper(a) < toupper(b);
}
int main() {
string s = "Kokopelli";
sort(s.begin(), s.end(), lt);
auto it = unique(s.begin(), s.end());
s.resize(it-s.begin());
cout << s << '\n';
}
eiKklop
unique() removes only adjacent duplicates, so sort the input first
to bring equal elements together.
That way, it can run in O(n) time, as opposed to O(n²) time.
Unique
Case-independent uniqueness doesn’t come free:
bool lt(char a, char b) {
return toupper(a) < toupper(b);
}
bool eq(char a, char b) {
return toupper(a) == toupper(b);
}
int main() {
string s = "Kokopelli";
sort(s.begin(), s.end(), lt);
auto it = unique(s.begin(), s.end(), eq);
s.resize(it-s.begin());
cout << s << '\n';
}
eiKlop
Unfortunately, we’ve duplicated the calls to toupper().
Unique and DRY
Duplication of code is a bad thing,
but avoiding it sometimes has a cost.
bool lt(char a, char b) {
return toupper(a) < toupper(b);
}
bool eq(char a, char b) {
return !(lt(a,b) || lt(b,a));
}
int main() {
string s = "Kokopelli";
sort(s.begin(), s.end(), lt);
auto it = unique(s.begin(), s.end(), eq);
s.resize(it-s.begin());
cout << s << '\n';
}
eiKlop
Generality
It’s not just about strings:
int a[] = {333, 22, 4444, 1};
sort(begin(a), end(a));
for (auto val : a)
cout << val << '\n';
1
22
333
4444
vector<double> v = {1.2, 0.1, 6.7, 4.555};
sort(v.begin(), v.end(), greater<double>());
for (auto val : v)
cout << val << '\n';
6.7
4.555
1.2
0.1
Why didn’t I say a.begin() in the first example?
- Because a is a C array. It’s not a class object, so it has no methods!
- BTW, just plain a would have worked as well as begin(a).
Attitude
- These algorithms may strike you as simplistic.
- “I could write that!”, you may think.
- Sure, you probably could. But would you get it right?
Even the interesting corner cases, like searching an empty range?
- Also, using standard algorithms conveys meaning. All educated
C++ programmers know what the standard algorithms do.