CS253 Copy Elision
“elision”
- rhymes with “vision”
- “removal”
- “omission”
- “elimination”
- “suppression”
The Loud Class
The following examples use our class Loud,
which displays a message for every method invoked.
int main() {
Loud alpha;
}
Loud::Loud()
Loud::~Loud()
No surprise here: alpha got created & destroyed.
More Construction
int main() {
Loud beta, gamma;
}
Loud::Loud()
Loud::Loud()
Loud::~Loud()
Loud::~Loud()
Sure.
Copy ctor & assignment
int main() {
Loud delta;
Loud epsilon(delta);
delta=epsilon;
}
Loud::Loud()
Loud::Loud(const Loud &)
Loud::operator=(const Loud &)
Loud::~Loud()
Loud::~Loud()
As expected.
More copying
Loud foo() {
Loud zeta;
return zeta;
}
int main() {
Loud eta(foo());
}
Loud::Loud()
Loud::~Loud()
- Hey, wait. Where’s the copy ctor?
- Two distinct objects, zeta and eta, should have been created.
- They weren’t.
Ludicrous copying
int main() {
Loud theta = Loud(Loud(Loud(Loud(Loud(Loud(Loud(Loud())))))));
}
Loud::Loud()
Loud::~Loud()
There should be more than that.
Elision
Loud foo() {
Loud iota;
return iota;
}
int main() {
Loud kappa(foo());
}
Loud::Loud()
Loud::~Loud()
- The compiler is allowed to elide
(rhymes with “ride”: suppress, omit) ctors, copy ctors, and dtors,
even if the methods have side effects, like those of the Loud class.
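- As of C++ 2017, copy elision is required for the theta example,
and for initializing eta from the value that foo() returns.
Eliding the copy of the named local zeta is still optional,
though compilers routinely do it.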
- Another way to view it: the compiler can assume that ctors, copy
ctors, and dtors only do creation/copying/destruction,
even if they don’t (as in Loud).
- This is efficient. Copying takes time & space.
- There is no separate variable iota.
Instead, the ctor in foo() constructs directly in kappa.
No copying; it’s built in the right place.
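To watch elision without reading printed output, one can count calls instead. The sketch below uses a hypothetical Counted type and two hypothetical factory functions (none of these names come from the lecture): one returns a temporary, the other a named local.

```cpp
// Hypothetical helper: counts special member calls instead of printing them.
struct Counted {
    static int ctors, copies, dtors;
    Counted() { ++ctors; }
    Counted(const Counted &) { ++copies; }
    ~Counted() { ++dtors; }
};
int Counted::ctors = 0;
int Counted::copies = 0;
int Counted::dtors = 0;

// Returns a temporary: since C++17 no copy is made here.
Counted make_temporary() { return Counted(); }

// Returns a named local: the copy may be elided, but need not be.
Counted make_named() {
    Counted local;
    return local;
}
```

After `Counted a = make_temporary();`, a C++17 compiler leaves Counted::copies at zero. After `Counted b = make_named();`, it is usually still zero, but a compiler that skips the optional elision would report one copy.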
Parameter Passing
- For this to make sense, we have to understand how parameters & results
are passed to & from functions.
- How does something get passed as an argument?
- How does a function return an int?
No Answers Yet
- The answer, of course, depends on the cpu architecture.
- I first programmed on an IBM/360, which had no stack.
- However, in general, it can be assumed that registers are
faster than memory.
- Therefore, we prefer passing arguments in registers.
- However, there are only so many registers.
- What if there are more arguments than available registers?
- Usually, the extra arguments go onto the stack.
X86 Answers
The popular X86-64 CPU architecture
has sixteen more-or-less general-purpose 64-bit registers, used for:
- Function arguments:
  - Scalars (int, double, pointers) are passed in registers.
  - A reference is implemented as a pointer.
  - A C array is a special case, always passed by reference.
  - An object passed by value is generally copied onto the stack
    (small, trivially copyable objects may travel in registers instead).
  - Methods get an additional this pointer argument.
  - If we run out of registers, push arguments onto the stack.
- Return value:
  - Scalars (int, double, pointers) are returned in a register.
  - An object returned by value is written through a hidden
    pointer parameter supplied by the caller.
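The difference between passing by reference and passing by value can be observed from inside C++. This is a small sketch with a hypothetical Probe type and two hypothetical functions, written for illustration only:

```cpp
// Hypothetical type that counts how often it is copied.
struct Probe {
    static int copies;
    Probe() {}
    Probe(const Probe &) { ++copies; }
};
int Probe::copies = 0;

// By reference: p is just another name for the caller's object,
// so its address matches the caller's.
bool by_reference_is_same(const Probe &p, const Probe *original) {
    return &p == original;
}

// By value: p is a fresh copy living somewhere else.
bool by_value_is_same(Probe p, const Probe *original) {
    return &p == original;
}
```

Given `Probe x;`, the call `by_reference_is_same(x, &x)` returns true and copies nothing, while `by_value_is_same(x, &x)` returns false and bumps Probe::copies by one.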
Parameter Passing
When a function returns a non-scalar object,
compilers typically pass a hidden argument that says where the result should go.
// User code:
string tv() {
string s="SNL";
return s;
}
int main() {
string fav=tv();
cout << fav;
}
SNL
// Implementation:
void tv(string &out) {
string s="SNL";
out=s;
}
int main() {
string result;
tv(result);
string fav=result;
cout << fav;
}
SNL
// After optimization:
void tv(string &out) {
out="SNL";
}
int main() {
string fav;
tv(fav);
cout << fav;
}
SNL
All thanks to copy elision!
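One way to check that the result really is built in the caller's variable is to have the object remember where its ctor ran. A minimal sketch, assuming a hypothetical Tracked type and make() function that are not part of the lecture:

```cpp
// Hypothetical type that records the address at which it was constructed.
struct Tracked {
    const void *born_at;
    Tracked() : born_at(this) {}
    // A copy keeps the original's birthplace, so a copy is detectable:
    // born_at would no longer match the copy's own address.
    Tracked(const Tracked &other) : born_at(other.born_at) {}
};

// Returns a temporary, which C++17 constructs directly in the caller's object.
Tracked make() { return Tracked(); }
```

After `Tracked fav = make();`, a C++17 compiler gives `fav.born_at == &fav`: the ctor that ran inside make() was already writing into fav.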