CS253: Software Development with C++

Spring 2021

Strings

CS253 Strings

Operators

Java programmers aren’t used to mutable strings with operators:

string a = "alpha", b = "beta", g("gamma"), parens = "()";
b += "<"+g+'>';
auto result = parens[0]+a+b+parens[1];
result[3] = '*';
cout << result << '\n';
if (b < g) cout << "Good!\n";
(al*habeta<gamma>)
Good!

Old vs. New

// C strings
char q[] = "gamma delta";
const char *p = "alpha beta";   // a string literal is const in C++
printf("%zu chars; first char is %c\n", strlen(q), q[0]);
printf("%zu chars; ninth char is %c\n", strlen(p), p[8]);
11 chars; first char is g
10 chars; ninth char is t
// C++ strings
string s = "ceti alpha six";
cout << s.size() << " chars; third char is " << s[2] << '\n'
     << s.length() << " chars; last char is " << s.back() << '\n';
14 chars; third char is t
14 chars; last char is x

Why C strings?

C strings

char foo[10] = "xyz", bar[] = "pdq";
cout << sizeof(foo) << ' ' << strlen(foo) << '\n';
cout << sizeof(bar) << ' ' << strlen(bar) << '\n';
10 3
4 3

C strings

C++ strings

How NOT to define a C++ string

Welcome to C++, which is not Java:

string riley = new string;
cout << riley;
c.cc:1: error: conversion from 'std::__cxx11::string*' {aka 
   'std::__cxx11::basic_string<char>*'} to non-scalar type 
   'std::__cxx11::string' {aka 'std::__cxx11::basic_string<char>'} requested

How NOT to define a C++ string

Don’t do this, either, though it does work:

string joy = string("sadness");
cout << joy;
sadness

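Before C++17, that notionally created an anonymous temporary string on the right-hand side, copied/moved it to joy, then destroyed the temporary string (though compilers were allowed to optimize the copy away). Since C++17 the copy is guaranteed to be elided, so all that remains is needless clutter.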

Sure, it works, but … no!

How to define a C++ string

Do it like this:

string fear = "disgust";
cout << fear;
disgust

or, if you don’t have a value for the string at first:

string anger;
anger = "Bing Bong";
cout << anger;
Bing Bong

Java programmers are trained to treat objects differently than other types. Shake that off!

Subscripting

Subscripting on a C++ string produces a char:

string course="CS253";
cout << course << '\n';
cout << course[2] << '\n';
CS253
2

which can be modified:

string pet = "cat";
pet[0] = 'r';
cout << pet << '\n';
rat

Note the 'r', not "r".

Mutable

Unlike Java, C++ strings are mutable—they can be modified.

string soup = "Tomato dispue is bisgusting.";
cout << soup << '\n';
soup[7]  = 'b';
soup[10] = 'q';
soup[17] = 'd';
cout << soup << '\n';
Tomato dispue is bisgusting.
Tomato bisque is disgusting.

String methods

The string class has many methods, including:

Method                         Description
string::c_str()                Extract C string
string::size()                 get length
string::insert()               add chars anywhere
string::erase()                remove chars anywhere
string::replace()              replace chars anywhere
string::substr()               return substring
string::find()                 look for a string or character
string::find_first_of()        find next char in a set of chars
string::find_first_not_of()    find next char not in a set of chars
string::find_last_of()         find prev char in a set of chars
string::find_last_not_of()     find prev char not in a set of chars

Learn those methods

Truth

I freely use C string literals, like this:

void emit(string s) {
    cout << "*** " << s << '\n';
}

int main() {
    emit("Today is a lovely day.");
    return 0;
}
*** Today is a lovely day.

Some Code

char q[80] = "This is a C string.\n";
cout << q;
char r[] = "foobar";
r[3] = '\0';
cout << "r is now \"" << r << "\"\n";
const char *p = "This is also a C string";
cout << p << ", length is " << strlen(p) << '\n';
This is a C string.
r is now "foo"
This is also a C string, length is 23
string s("useless initial value");
s = "This am a C++ string";     // mixed
s[5] = 'i';                     // mutable
s[6] += 6;              // char is integer-like
cout << s << ", length is " << s.size() << '\n';
This is a C++ string, length is 20

Conversions

Converting from a C-style string to a C++ string is easy, because the C++ string object has a constructor that takes a C-style string:

char chip[] = "chocolate";
string dale(chip);
cout << dale << '\n';
chocolate

Conversions

Converting from a C++ string to a C-style string requires a method:

string wall(30, '#');
const char *p = wall;
cout << p << '\n';
c.cc:2: error: cannot convert 'std::__cxx11::string' {aka 
   'std::__cxx11::basic_string<char>'} to 'const char*' in initialization
string wall(30, '#');
const char *p = wall.c_str();
cout << p << '\n';
##############################

string::c_str()

string::c_str() is useful for calling an old-fashioned library function that wants a C-style string.

string command = "date";
system(command);
c.cc:2: error: cannot convert 'std::__cxx11::string' {aka 
   'std::__cxx11::basic_string<char>'} to 'const char*'
string command = "date";
system(command.c_str());
Thu Nov 21 23:53:19 MST 2024

string::data()

String Literals

Literals

String Literals

A "string literal" is an anonymous array of constant characters. These are equivalent:

cout << "FN-2187";
FN-2187
const char whatever[] = "FN-2187";
cout << whatever;
FN-2187
const char whatever[] = "FN-2187";
const char *p = &whatever[0];
cout << p;
FN-2187

Escape Sequences:

Sequence    Meaning            Sequence      Meaning
\a          bell               \'            '
\b          backspace          \"            "
\f          form feed          \\            \
\n          newline            \0ddd         0–3 octal digits
\r          carriage return    \xdd          1–∞ hex digits
\t          horizontal tab     \udddd        Unicode U+dddd
\v          vertical tab       \Udddddddd    Unicode U+dddddddd

String Pasting

Two adjacent string literals are merged into one at compile-time:

cout << "alpha beta "  "gamma delta "
        "epsilon\n";
alpha beta gamma delta epsilon
cout << "Business plan:\n\n"
        "1. Collect underpants\n"
        "2. ?\n"
        "3. Profit\n";
Business plan:

1. Collect underpants
2. ?
3. Profit

Raw Strings

A raw string starts with R"( and ends with )". The parens are not part of the string.

cout << R"(Don’t be "afraid" of letters:
\a\b\c\d\e\f\g)";
Don’t be "afraid" of letters:
\a\b\c\d\e\f\g

Cool! Quotes inside of quotes!

However …

What if the string contains a right paren followed by a double quote? I want to emit:

    A goatee!  :-)"  Cool!

cout << R"(A goatee!  :-)"  Cool!)";
c.cc:1: warning: missing terminating " character
c.cc:1: error: missing terminating " character

That didn’t work. The )" at the bottom of the face was taken to be the end of the raw string.

Solution

A raw string starts with:

R"whatever-you-like-up-to-sixteen-chars(

and ends with:

)the-same-up-to-sixteen-chars"
cout << R"X(A goatee!  :-)"  Cool!)X";
A goatee!  :-)"  Cool!
cout << R"WashYourHair(What the #"%'&*)?)WashYourHair";
What the #"%'&*)?
cout << R"(The degenerate case)";
The degenerate case

Comparing C-Style Strings

if ("foo" < "bar")
    cout << "😢";
c.cc:1: warning: comparison with string literal results in unspecified behavior
😢

Comparing C-style strings properly.

Comparing C++ std::strings

    <  >  <=  >=  ==  !=

Example

string name = "Conan O’Brien";
if (name == "Conan O’Brien")
    cout << "good 1\n";
if (name < "Zulu")
    cout << "good 2\n";
if (name > "Andy Richter")
    cout << "good 3\n";
if (name == name)
    cout << "good 4\n";
good 1
good 2
good 3
good 4

God help us, another string!

C++17’s string_view is a non-owning read-only view into a C-string or std::string. It’s generally implemented as a char * and a length.

const char *a = "alpha";
string b = "beta";
string_view c = a;
cout << c << '\n';
c = b;
cout << c << '\n';
alpha
beta

string_view purpose

void hero(string_view sv) {
    cout << "Nice work, " << sv << "!"
         << " (len=" << sv.size() << ")\n";
}

int main() {
    hero("Batman"); // C-string
    hero("Robin"s); // C++ string
}
Nice work, Batman! (len=6)
Nice work, Robin! (len=5)

Methods

Timing: converting to const string reference

bool first(const string &csr) { return csr[0]; }

int main() {
    const char s[] = "abcdefghijklmnopqrstuvwxyz";
    for (int i=0; i<10'000'000; i++)
        first(s);
}

Real time: 136 ms

bool first(const string &csr) { return csr[0]; }

int main() {
    string s = "abcdefghijklmnopqrstuvwxyz";
    for (int i=0; i<10'000'000; i++)
        first(s);
}

Real time: 5.79 ms

Timing: converting to string_view

bool first(string_view sv) { return sv[0]; }

int main() {
    const char s[] = "abcdefghijklmnopqrstuvwxyz";
    for (int i=0; i<10'000'000; i++)
        first(s);
}

Real time: 4.77 ms

bool first(string_view sv) { return sv[0]; }

int main() {
    string s = "abcdefghijklmnopqrstuvwxyz";
    for (int i=0; i<10'000'000; i++)
        first(s);
}

Real time: 5.17 ms