CS253: Software Development with C++

Fall 2020

Algorithms



First published computer algorithm, by Ada Lovelace

Definition

Arguments

count()

vector<int> v = {1, 1,2, 1,2,3, 1,2,3,4};
cout << count(v.begin(), v.end(), 2) << '\n'
     << count(v.begin(), v.end(), 3.0) << '\n';
3
2

count_if()

bool small(int n) {
    return n < 5;
}

int main() {
    multiset<int> ms = {3,1,4,1,5,9,2,6,5,3,5,8,9,7,9};
    cout << count_if(ms.begin(), ms.end(), small) << '\n'
         << count_if(ms.begin(), ms.end(), [](int n){return n>=8;}) << '\n';
}
6
4

count_if() is like count(), except it takes a predicate (a function that returns a bool) instead of a target value.

find()

vector<int> primes{ 2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                   43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97 };

auto it = find(primes.begin(), primes.end(), 13);
if (it == primes.end())
    cout << "not found\n";
else
    cout << "Found "<< *it << " at " << it-primes.begin() << '\n';
Found 13 at 5

find() here is an algorithm (a free function), not a method! Some containers have their own .find() method, which is preferred when it exists.

find()

vector<int> primes{ 2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                   43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97 };

int *start = &primes[3], *end = &primes[20];

cout << "Search the interval [" << *start << ',' << *end << ")\n"
     << "It includes " << *start << ", but not " << *end << '\n';

int *p = find(start, end, 13);
cout << "Found " << *p << " at " << p-&primes[0] << '\n';

p = find(start, end, 89);
cout << "Nope " << *p << " at " << p-&primes[0] << '\n';
Search the interval [7,73)
It includes 7, but not 73
Found 13 at 5
Nope 73 at 20

find_if()

bool pred(int n) {
    return n > 50;                  // Should find 53
}

int main() {
    set<int> primes{ 2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                    43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97 };
    if (auto it = find_if(primes.begin(), primes.end(), pred); it == primes.end())
        cout << "Failure\n";
    else
        cout << "Found " << *it << '\n';
}
Found 53

copy()

string str = "bonehead";
set<char> alpha = {'P', 'D', 'Q'};
copy(alpha.begin(), alpha.end(), str.begin());
cout << str << '\n';
DPQehead

string alpha = "abcdefghijklmnopqrstuvwxyz";
string initials = "JRA";
copy(initials.begin(), initials.begin()+2, alpha.begin()+20);
cout << alpha << '\n';
abcdefghijklmnopqrstJRwxyz

The iterator arguments don’t have to be just .begin() and .end().

copy_if()

Why isn’t there a dest-end?

It isn’t needed. source-begin and source-end say how much to copy.

First attempt

Let’s ensure that we know how to use copy() before moving on to copy_if():

string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar;
copy(foo.begin(), foo.end(), bar.begin());
cout << bar << "\n";
SIGSEGV: Segmentation fault

Why did that fail?

There is no space allocated in bar. You can’t allocate space by pretending that it exists.

Second attempt

string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar;
bar.resize(foo.size());
copy(foo.begin(), foo.end(), bar.begin());
cout << bar << "\n";
I have to ration my Diet Mountain Dew!
I have to ration my Diet Mountain Dew!

Third attempt

string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";
string bar;
bar.resize(foo.size());
// Don’t copy vowels:
copy_if(foo.begin(), foo.end(), bar.begin(),
        [](char c){return "aeiouAEIOU"s.find(c)==string::npos;} );

cout << bar << "\n";
I have to ration my Diet Mountain Dew!
 hv t rtn my Dt Mntn Dw!␀␀␀␀␀␀␀␀␀␀␀␀␀␀

Hooray, copy_if() worked!

Hey, what’s with those ␀ characters?

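.resize() filled the string with '\0' characters, which display as ␀ here. Your terminal may simply ignore them and display nothing. copy_if() wrote fewer characters than bar holds, but bar.size() is unchanged, so the leftover '\0' characters are still part of the string.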

Fourth attempt

string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";

string bar(foo.size(), 'X');
// Don’t copy vowels:
auto it = copy_if(foo.begin(), foo.end(), bar.begin(),
                  [](char c){return "aeiouAEIOU"s.find(c)==string::npos;} );

// Make bar the correct size:
bar.resize(it-bar.begin());
cout << bar << "\n";
I have to ration my Diet Mountain Dew!
 hv t rtn my Dt Mntn Dw!

We resized bar to the correct size.

In-place

string foo = "I have to ration my Diet Mountain Dew!";
cout << foo << "\n";

auto it = copy_if(foo.begin(), foo.end(), foo.begin(),
                  [](char c){return "aeiouyAEIOUY"s.find(c)==string::npos;} );
// Make foo the correct size:
foo.resize(it-foo.begin());
cout << foo << "\n";
I have to ration my Diet Mountain Dew!
 hv t rtn m Dt Mntn Dw!

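We can copy from and to the same location. This works in practice because the write position never gets ahead of the read position, but strictly speaking the standard requires that copy_if()’s source and destination ranges not overlap.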

replace()

string fact = "Warren Harding’s middle name was Gamaliel.";
replace(fact.begin(), fact.end(), ' ', '_');
cout << fact << '\n';
Warren_Harding’s_middle_name_was_Gamaliel.

string fact = "Warren Harding’s middle name was Gamaliel.";
replace_if(fact.begin(), fact.end(),
           [](char c) { return c=='o' || c=='a';}, '*');
cout << fact << '\n';
W*rren H*rding’s middle n*me w*s G*m*liel.

transform()

string name = "Joseph Robinette Biden Jr.";
string out;
transform(name.begin(), name.end(), out.begin(),
          [](char c) { return c ^ 040; });
cout << out << '\n';
SIGSEGV: Segmentation fault

Oops! Didn’t allocate any memory in out!

string name = "Joseph Robinette Biden Jr.";
string out;
out.resize(name.size());
transform(name.begin(), name.end(), out.begin(),
          [](char c) { return c ^ 040; });
cout << out << '\n';
jOSEPH␀rOBINETTE␀bIDEN␀jR␎

sort()

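The sort() algorithm (from the header file <algorithm>) has two forms:
  • sort(begin, end) compares elements with <.
  • sort(begin, end, comparison) compares elements with the comparison that you supply.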

How do I sort() from container1 to container2?

copy() to container2, sort() the data there.

Containers

Default comparison

string s = "Kokopelli";
sort(s.begin(), s.end());
cout << s << '\n';
Keiklloop

Explicit comparison

string s = "Kokopelli";
sort(s.begin(), s.end(), less<char>);
cout << s << '\n';
c.cc:2: error: expected primary-expression before ')' token

Explicit comparison

string s = "Kokopelli";
sort(s.begin(), s.end(), less<char>());
cout << s << '\n';
Keiklloop

Reverse sort

string s = "Kokopelli";
sort(s.begin(), s.end(), greater<char>());
cout << s << '\n';
poollkieK

Comparison function

bool lt(char a, char b) {
    return a < b;
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    cout << s << '\n';
}
Keiklloop

λ-function

string s = "Kokopelli";
sort(s.begin(), s.end(),
     [](char a, char b){return a<b;});
cout << s << '\n';
sort(s.begin(), s.end(),
     [](char a, char b){return a>b;});
cout << s << '\n';
Keiklloop
poollkieK

Case folding

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    cout << s << '\n';
}
eiKklloop

Unique

If you want to avoid duplicates, then use unique().

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    auto it = unique(s.begin(), s.end());
    s.resize(it-s.begin());
    cout << s << '\n';
}
eiKklop

unique() only removes adjacent duplicates, so its input must already be in order for it to remove all of them. That way, it can run in O(n) time, as opposed to O(n²) time.

Unique

Case-independent uniqueness doesn’t come free:

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

bool eq(char a, char b) {
    return toupper(a) == toupper(b);
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    auto it = unique(s.begin(), s.end(), eq);
    s.resize(it-s.begin());
    cout << s << '\n';
}
eiKlop

Unfortunately, we’ve duplicated the calls to toupper().

Unique and DRY

Duplication of code is a bad thing, but avoiding it sometimes has a cost.

bool lt(char a, char b) {
    return toupper(a) < toupper(b);
}

bool eq(char a, char b) {
    return !(lt(a,b) || lt(b,a));
}

int main() {
    string s = "Kokopelli";
    sort(s.begin(), s.end(), lt);
    auto it = unique(s.begin(), s.end(), eq);
    s.resize(it-s.begin());
    cout << s << '\n';
}
eiKlop

Generality

It’s not just about strings:

int a[] = {333, 22, 4444, 1};
sort(begin(a), end(a));
for (auto val : a)
    cout << val << '\n';
1
22
333
4444

vector<double> v = {1.2, 0.1, 6.7, 4.555};
sort(v.begin(), v.end(), greater<double>());
for (auto val : v)
    cout << val << '\n';
6.7
4.555
1.2
0.1

Why didn’t I say a.begin() in the first example?
  • Because a is a C array, not a class object, so it has no methods!
  • BTW, just plain a would have worked as well as begin(a).

Attitude