Value Semantics

NI Tech Talks

Value Semantics

// Stefan Gränitz, Reaktor Dev Team, Berlin 2014 09 09

* name + located in reaktor chamber in 28/3r

overview

theory & terminology
introductory example
objective
usage
- automatic optimization
- live examples
implementation
- regular type
- equivalence in detail
- notes
advanced class design for value semantics

* what we do step by step
* where there is no "advanced" we cover very much the basics
* one advanced design example that shows the beauty of value semantics
* will hopefully be some time left for discussions - open questions

* not so easy to setup a linear introduction, because motivation is very much mixed
* different people argue differently, can cause confusion
* first part kind of question-answer game ∏

theory

what is a value?

* hmm.. hard to say without any context
* lets start with something we all have a proper understanding of ∏

what is an object?

"An object is a region of storage." (§1.8.1, ISO C++11 Standard)
the object type defines the structure of this storage
with corresponding to a unique part of memory, objects implicitly define identity
physical entity

what is a value?

value is the abstract interpretation of what an object represents
values are independent of the underlying physical representation of the object
identity is not important to values, what we care about here is equivalence
logical entity

* we are all familiar with the concept of objects - why? because it's defined!
* sizeof(T) > 0 & in langs like Java an object is something that "has methods"
* so when this is what we understand as an object, what does a value mean then?
* obviously higher level of abstraction when dealing with values

intuitive examples

int a = 1; int b = 1;

⇒ a == b (equivalent)
⇒ &a != &b (not identical)

char a = 1; int b = 1;

⇒ a == b (equivalent)
⇒ &a != &b (not identical)

const char *c = "ab"; std::string s = "ab";

⇒ c == s (equivalent)
⇒ &c != &s (not identical, does not compile)

* first, second obvious example
* last example things get interesting

what we can already see:

equality is less restrictive than identity
→ identical objects are always equal
→ equal objects are not necessarily identical
equality does not necessarily depend on an objects underlying physical representation
→ e.g. equality can be defined between instances of
different types
things are quite simple when dealing with unique values
→ e.g. numbers, strings, dates

what do we mean with value semantics?

a programming style..

where we focus on values that are represented by objects rather than objects themselves
where we pass around values instead of objects whenever we don't need to refer to a special instance

design-wise →

usage-wise →

* certainly two aspects of one and the same thing
* for now ok? it's probably a bit blury but..

example: value vs. object

void scale(std::vector &v, const double &x)
{
  for (size_t i = 0; i != v.size(); ++i)
    v[i] /= x;
}

what do we actually want here?

→ modify v by scaling each of it's elements by some factor x

of cause we don't want to modify a local copy, so we get v by reference and modify whatever instance it refers to
but what is the reason for getting x by reference? probably someone intended to suppress a spare copy?

* probably understandable even wihout documentation :)

is it even correct to get x by reference? NO!

what if we called it that way:

scale(v, v[0])

i = 0 → v[0] = v[0] / v[0] = 1
i > 0 → v[i] = v[i] / 1 = v[i]

passing x by reference introduced a hidden dependency,
causing a side-effect in our example

in general with using references we are referring to remote objects,
making the code more complicated, because it prohibits local reasoning

that's where we go much easier with values!

* whenever we introduce dependencies code gets bit uglier to read + take bit longer to understand
* cause undermine what we call local reasoning

⇓

void scale(std::vector &v, double x)
{
  for (size_t i = 0; i != v.size(); ++i)
    v[i] /= x;
}

objective

why leveraging value semantics?

it results in simpler code, because we don't need to manage object identity
→ improve locality
→ avoid unnecessary dependencies
going towards saying what we want instead of specifying how it is done
→ makes it easier for others to understand our code
it faciliates regular types
→ better code reuse
→ same syntax for the same semantics

* see three components again:
  everyday coding, compiler/language support, class design aspect for building better libs
* the question is... ∏

why do we so often tend to use references,
even if semantically there is no need to?

because references were always known
to be more efficient

→ c++ devs are used to optimize their code manually
→ sometimes on the cost of code simplicity

* must be said again! because using values means copying things around all the time right?
* that was true in the early days of c++

the interesting point is:

with using a reference we declared identity, where we only needed equivalence!
for the compiler this means our declaration is more restrictive! it has less freedom to optimize

with value semantics we get better code
and take advantage of automatic optimization

"tell the compiler that you are interested in values rather than objects, and the compiler chooses the best tool to implement your intentions"
^{(Andrzej Krzemieński)}

* that's kind of the utopia of programming, where only specify what should be done and
  computer itself figures out how to do it
* let's have a look on what is possible today

in practice: automatic optimization

as-if rule: [...] an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program.
^{(§1.9.1, ISO C++11 Standard)}

→ in general the compiler can perform arbitrary optimizations as long as it results in the same behavior

copy elision: When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
^{(§12.8.31, ISO C++11 Standard)}

→ copies and moves are made “in principle”, the compiler is allowed to optimize them away

→ regarding the as-if rule, copy/move have predefined semantics

copy/move construction should only be implemented explicitly for:

RAII classes to encapsulate resources
implementing optimizations

otherwise the compiler generated ones should be just fine (they simply copy all members on copy and move all members on move)

to be discussed design-wise: prefer rule of zero, rule of five defaults or
defaulted c/d-tors with assign by value?

* does not match so well here, but still better then everywhere else
* getting to the design component again..
* to be discussed.. first open question. if lucky please come and convince me

example: return by value

...

example: passing by value

...

furthermore..

returning values from functions, passing as readonly args and passing r-values as sink arguments doesn't require copying!
for reasonable-sized data structures it's also acceptable to make copies, when performance is not a total must

but!

make sure your type supports value semantics!
→ make first steps with std types

implementation

things are easy as long as we deal with unique values
(like arithmetic values, dates, etc.)

e.g. we have a lot of different representations of the decimal integer 5:
encodings for 5

the value itself is a logical entity
it's independent of which physical representation is used
think of a c++ object as just another representation!

* as long as types are given and known to work as values it's mostly playing around and getting a feeling
* but when desinging them ourselves we're suddenly confronted with some strange questions..
* .. what if we invented an abstract type where we have no unique value as an representation? 
* should have a look on some requirements for implmenting value semantics in a robust fashion

requirements for an abstract data type
to support value semantics

equality-comparable
default-constructible
assignable (⇒ copyable and/or movable)

→ try to model your own types as regular types

A regular type is one that is a model of Assignable, DefaultConstructible, EqualityComparable, and one in which these expressions interact in the expected way. For example, after x = y, we may assume that x == y is true.

* as long as types are given and known to work as values it's mostly playing around and getting a feeling
* but when desinging them ourselves we're suddenly confronted with some strange questions..

in general a type's public interface should ensure this:

if (a == b)
{
  op₁(a); op₁(b); assert(a == b);
  op₂(a); op₂(b); assert(a == b);
  op₃(a); op₃(b); assert(a == b);
  op₄(a); op₄(b); assert(a == b);
}

how to compare for equivalence?

^{(when we don't have unique values)}

in math what we talk about is called an equivalence relation:

a == a (reflexive)
a == b ⇔ b == a (symmetric)
a == b && b == c ⇒ a == c (transitive)

obviously operator== must define an equivalence relation

symmetry ⇒ it's best to define operator== free-standing!
^{for a more technical explanation see: John Lakos, "Value Semantics" (2014), p. 193 ff}

given two std::vector's differing only in their capacity()

are they equivalent?

..let's have a look at some simpler examples first..

example: point

class point_t {
public:
  int x() const;
  int y() const;

private:
  ...
};

bool operator==(const point_t &lhs, const point_t &rhs)
{
  return lhs.x() == rhs.x() && lhs.y() == rhs.y();
}

→ x and y are sometimes referred to as Salient Attributes

example: box

class box_t {
public:
  int width() const;
  int height() const;
  point_t origin() const;

private:
  ...
};

bool operator==(const box_t &lhs, const box_t &rhs)
{
  return lhs.width() == rhs.width() && 
         lhs.height() == rhs.height() &&
         lhs.origin() == rhs.origin();
}

→ typically compound objects are equal
if all their components are equal

→ recursive definition

* so here you already know where we end up right? turns out recursion is very useful
* if used "composition over inheritance" def for equivalence becomes easier, ending up with unique values
* looks like there is always some set of attributes that represents an objects logical state?!

example: rational number

class rational_t {
public:
  int numerator() const;
  int denominator() const;

private:
  ...
};

bool operator==(const rational_t &lhs, const rational_t &rhs)
{
  < implementation? >
}

salient attributes? → ??

how is equality defined here? → n₁ / d₁ == n₂ / d₂

* what we really have is a slightly more complex definition of equality
* just multiply equation by d1 and d2

class rational_t {
public:
  int num() const;
  int denom() const;

private:
  ...
};

bool operator==(const rational_t &lhs, const rational_t &rhs)
{
  long long ext_fraction1 = (long long)lhs.num() * rhs.denom();
  long long ext_fraction2 = (long long)rhs.num() * lhs.denom();

  return ext_fraction1 == ext_fraction2;
}

→ asking for salient attributes can be misleading

so what?

when comparing for equality, we need to ask:

what contributes to an objects logical state
what exists only for technical reasons

→ std::vector::capacity() is there for technical reasons only:

a modified vector could have its capacity always set to its size
it would just reallocate on every push and pop, but this is an implementation detail
its observable behavior could not be distinguished from the original vector

→ std::vector::capacity() is an optimization:

optimizations may be used to overcome drawbacks that arise from certain physical representations
optimizations must not affect an objects logical state
an objects logical state must not depend on optimizations

in other words: when a and b are vectors with different capacities,
we still expect the same behavior!

if (a == b)
{
  op₁(a); op₁(b); assert(a == b);
  op₂(a); op₂(b); assert(a == b);
  op₃(a); op₃(b); assert(a == b);
  op₄(a); op₄(b); assert(a == b);
}

* so with knowing that, what pitfalls should we watch out for? ∏

attention

not every stateful object has a natural value
→ e.g. thread pool, mutex, scoped guard

not every type defines equality
→ e.g. graphs, they define isomorphy which is structural equivalence
→ std::function objects cannot be compared for equality at all

to be discussed: should we explicitly delete respective
operators and constructors to achieve "restricted-regularity"?

* ever tried to copy a thread-pool? we could even invent some definition of value here, 
  but most likely different people expect different defs and be supprised by behavior
* probably not so obvious for the non-math-freaks, but isomorph means structural equivalence,
  where order of nodes and adjacency lists is not important!
* std::function deletes operator== explicitly - that leads to next open question..

in practice: advanced class design for value semantics

there is one issue when working with values all the time..

we don't have polymorphism!
because that requires aliases which requires referenceness

as a solution for this, Sean Parent proposes:
"polymorphism as an implementation detail"

Shifing polymorphism from type to use allows for greater reuse and fewer dependencies
Using regular semantics for the common basis operations, copy, assignment, and move helps to reduce shared objects
Regular types promote interoperability of sofware components, increases productivity as well as quality, security, and performance
There is no necessary performance penalty to using regular semantics, and ofen times there are performance benefts from a decreased use of the heap

Literature & further reading

John Lakos - Value Samentics (2014)
^{https://rawgit.com/boostcon/cppnow_presentations_2014/master/files/accu2015.140518.pdf}
Sean Parent - Value Semantics and Concept Based Polymorphism
^{https://parasol.tamu.edu/people/bs/622-GP/value-semantics.pdf}
Andrzej Krzemieński - Comprehensie Blog Post on Value Semantics
^{http://akrzemi1.wordpress.com/2012/02/03/value-semantics/}
Martinho Fernandes - The Rule of Zero
^{http://flamingdangerzone.com/cxx11/2012/08/15/rule-of-zero.html}
Scott Meyers - The Rule of.. Five Defaults?
^{http://scottmeyers.blogspot.de/2014/03/a-concern-about-rule-of-zero.html}

NI Tech Talks

Value Semantics

// Stefan Gränitz, Reaktor Dev Team, Berlin 2014 09 09

overview

theory

what is a value?

what is an object?

what is a value?

intuitive examples

what do we mean with value semantics?

example: value vs. object

objective

why leveraging value semantics?

in practice: automatic optimization

example: return by value

example: passing by value

furthermore..

but!

implementation

requirements for an abstract data typeto support value semantics

how to compare for equivalence?

example: point

example: box

example: rational number

so what?

attention

in practice: advanced class design for value semantics

Literature & further reading

requirements for an abstract data type
to support value semantics