Date: 2014-08-10

Thomas Köppe <tkoeppe@google.com>

Allocator example: A fancy pointer

Part of the Visitor’s Guide to C++ Allocators

An offset pointer

We will design a non-trivial fancy pointer, which can be used as the pointer type of an allocator. The specific example we will develop implements an offset pointer. The idea is for this pointer to store the offset of an object’s address and the address of the pointer itself.

The utility of such a representation lies in the fact that it is to some extent “position-independent”, and under suitable circumstances, offset pointers may be used to refer to the same object from multiple separate sets of memory, e.g. two distinct memory maps of some common memory, or shared memory viewed from different processes. The extent to which this approach is feasible is platform-dependent, since C++ does not have a notion of “process” or “memory mapping”, but it is applicable on a wide variety of popular platforms. We will concentrate on the standard-conforming part of the design, though, and only briefly discuss actual use and platform-dependent assumptions.

Usage

To ground the example in reality, let us jump ahead and consider how we will use the fancy pointer. Suppose we have an allocator A<T> whose pointer type is our fancy pointer. We will want to be able to use this allocator throughout our data structures. As explained in the main guide, we need an allocator adaptor for this purpose:

template <typename T> using AA = std::scoped_allocator_adaptor<A<T>>;

Now we can build a container:

using astring = std::basic_string<char, std::char_traits<char>, AA<char>>; using avector = std::vector<astring, AA<astring>> std::map<int, avector, std::less<int>, AA<std::pair<int const, avector>>>;

If we place one of these maps from integers to vectors of strings into some shared memory and communicate its address to all the participating processes, each process will be able to access and modify the container naturally.

Implementation overview

The full implementation of the offset pointer is available as complete source code. There is also a small example allocator that uses the fancy pointer. We will only display and discuss selected features of the pointer implementation here.

We will call our fancy pointer template OffPtr.

There are three main design issues that we need to consider.

The internal representation is the most interesting problem. The other requirements add a fair amount of boilerplate code. It is necessary to meet all the requirements (e.g. so that std::vector can use OffPtr<T> as an iterator. However, we will not spell out all the details of the implementation.

Internal representation

Offsets

Conceptually, the offset pointer should store the address of an object as the difference between its own address and the object’s address. However, pointers are not generally comparable or subtractable in C++, unless they all point to subobjects of a common complete object. But fancy pointers need to be freely copyable (e.g. in order to satisfy the random access iterator requirements), so it must be possible to have temporary objects, which cannot be subobjects of any given object.

So instead of actual pointers we will use integers. This requires the existence of an integral type with the property that any object pointer can be converted to this integer type and back and produce the same value. Such an integer type is provided optionally by the C++ <cstding> library under the name std::uintptr_t. We will only provide our fancy pointer for platforms that provide this integer type.

The C++ standard makes the following guarantees. As always, let T be an unqualified object type. Then:

Therefore, we choose the following implementation: For any native pointer p, we compute an integral value:

n = reinterpret_cast<std::uintptr_t>(static_cast<void *>(p))

This value computes equal to the original pointer when converted back. Let us denote this conversion generically by INT(p) and the corresponding reverse conversion by PTR(n). With this notation, we can write the standard guarantee succinctly as “PTR(INT(p)) == p for all object pointers p”.

The actual offset pointer stores the difference INT(p) - INT(this) for a given native pointer T * p. When we copy or assign offset pointers, the new offset can be obtained as the old offset adjusted by INT(&lhs) - INT(&rhs).

Null pointers

One final question is how the null pointer should be represented. There is no one right answer, and using the zero offset will certainly not suffice, since a pointer whose value is its own address is frequently used in container design. It has proven feasible in practice to use a dedicated, special offset value to represent the null pointer. The value std::uintptr_t(-1) usually works well.

The special treatment of the null pointer offset means that most pointer operations require a conditional check. Implementers may look for platform-specific, low-level optimisations that reduce the costs of the branching. We will not concern ourselves with such details.

Void pointers

We require a specialisation of the offset pointer for void and const void, which are slightly different, as they do not (and indeed cannot) satisfy the random access iterator requirements. All fancy object pointers need to be convertible to fancy void pointers and const void pointers in the expected fashion.

To avoid code duplication, we implement the offset logic in a base class, called OffPtrBase.

NullablePointer and random access iterator requirements

For the NullablePointer requirements, we have to provide a default constructor that initialises the offset to the special null pointer value. Note that this means that our fancy pointer can never have a trivial constructor. Container authors should bear this in mind and not make overly restrictive triviality assumptions on the pointer types. Moreover, the fancy pointer has to be constructible and convertible from and comparable with nullptr, and convertible to bool.

For the random access iterator requirements we have to provide a few type members, most notably the iterator_category member. We must also provide all the relational operators. The relational operators are all implemented in terms of the corresponding operators on the native pointers, so they may in general only be applied to pointers to subobjects of a common complete object.

Other requirements, defects and limitations

The C++11 standard is somewhat defective regarding the allocator requirements. For instance, it does not answer whether OffPtr<T> and OffPtr<U> should be mutually convertible if T and U are (e.g. base and derived). (The unsafe conversion through OffPtr<void> is not generally correct, since it does not implement derived-to-base conversion.)

We will equip our fancy pointer with a constructor from OffPtr<U> whenever U * is convertible to T *, as this turns out to be required by implementations of node-based containers. We also take care to provide relational operators not just for a deduced argument OffPtr<T>, but also for non-deduced arguments, so that two different fancy pointer types may be compared. This subsumes comparison with null pointers.

Implementation

We start with the offset data member and the special null value:

template <typename T> class OffPtrBase { protected: std::uintptr_t offset; static std::uintptr_t const null_value = static_cast<std::uintptr_t>(-1);

Next, we provide a public interface to check for nullness and obtain the native pointer, implemented in terms of a few helper functions for the offset computations. The add_offset helper is only needed to recover the native pointer from this and offset in get(), but writing it as a separate function is cleaner and avoids an explicit conversion to void *. The crement helper is used by the iterator operations; note that it implemented purely in terms of native pointer arithmetic. (We never attempt to express pointer arithmetic via offset arithmetic, which would not have guaranteed semantics under the standard.)

static T * add_offset(void const * p, std::uintptr_t n) noexcept { return static_cast<T *>(reinterpret_cast<void *>(reinterpret_cast<std::uintptr_t>(p) + n)); } static std::uintptr_t get_offset(void const * from, void const * to) noexcept { return reinterpret_cast<std::uintptr_t>(to) - reinterpret_cast<std::uintptr_t>(from); } std::uintptr_t crement(std::ptrdiff_t n) const noexcept { return get_offset(this, get() + n); } public: bool null() const noexcept { return offset == null_value; } T * get() const noexcept { return null() ? nullptr : add_offset(this, offset); } explicit operator bool() const noexcept { return !null(); }

Next up are the constructors: Default, from native, from copy, from OffPtr<cv T>, and from other offset pointers of compatible types. We don't handle void at this point, which we leave to a conversion operator instead. Note that offsets are always computed from native pointers. We never make assumptions about the meaning of the numeric differences INT(&lhs) - INT(&rhs).

OffPtrBase() noexcept : offset(null_value) { } OffPtrBase(T * native) noexcept : offset(native == nullptr ? null_value : get_offset(this, native)) { } OffPtrBase(OffPtrBase const & rhs) noexcept : offset(rhs.null() ? null_value : get_offset(this, rhs.get())) { } // Conversion from unqualified versions template <typename U, typename = typename std::enable_if<!std::is_same<T, U>::value && std::is_same<typename std::remove_cv<T>::type, U>::value>::type> OffPtrBase(OffPtrBase<U> const & rhs) noexcept : offset(rhs.null() ? null_value : get_offset(this, rhs.get())) { } // Conversion from convertible versions template <typename U, typename Dummy = void, typename = typename std::enable_if<!std::is_same<typename std::remove_cv<T>::type, typename std::remove_cv<U>::type>::value && !std::is_void<U>::value, decltype(static_cast<T *>(std::declval<U *>()))>::type> OffPtrBase(OffPtrBase<U> const & rhs) noexcept : offset(rhs.null() ? null_value : get_offset(this, static_cast<T *>(rhs.get()))) { }

Finally the assignment operator. It needs an overload for std::nullptr_t to avoid ambiguity.

OffPtrBase & operator=(OffPtrBase const & rhs) noexcept { if (this != &rhs) { offset = rhs.null() ? null_value : get_offset(this, rhs.get()); } return *this; } OffPtrBase & operator=(std::nullptr_t) noexcept { offset = null_value; return *this; }

With all the offset computations handled in the base class, we can now provide the actual offset pointer class. We need specialisations for the void types, and we start with them, since we will need their definitions in the specialisation for object types. The use of inherited constructors makes the implementation very succinct.

template <typename T> struct OffPtr; template <> struct OffPtr<void> : OffPtrBase<void> { OffPtr() = default; using OffPtrBase<void>::OffPtrBase; using OffPtrBase<void>::operator=; }; template <> struct OffPtr<void const> : OffPtrBase<void const> { OffPtr() = default; using OffPtrBase<void const>::OffPtrBase; using OffPtrBase<void const>::operator=; };

Now that we have void pointers, we can implement object pointers.

template <typename T> struct OffPtr : OffPtrBase<T> { OffPtr() = default; using OffPtrBase<T>::OffPtrBase; using OffPtrBase<T>::operator=; explicit OffPtr(OffPtr<void> const & rhs) noexcept : OffPtrBase<T>(rhs.null() ? nullptr : static_cast<T *>(rhs.get())) { } explicit OffPtr(OffPtr<const void> const & rhs) noexcept : OffPtrBase<T>(rhs.null() ? nullptr : static_cast<T const *>(rhs.get())) { } operator OffPtr<void>() const noexcept { return { this->get() }; } operator OffPtr<void const>() const noexcept { return { this->get() }; } T * operator->() const noexcept { return this->get(); } T & operator*() const noexcept { return *this->get(); } T & operator[](std::size_t i) const noexcept { return this->get()[i]; }

Here is a small selection of further required interfaces.

// For pointer traits static OffPtr pointer_to(T & x) { return OffPtr(std::addressof(x)); } // For random access iterator requirements using iterator_category = std::random_access_iterator_tag; OffPtr & operator+=(std::ptrdiff_t n) { this->offset = this->crement(n); return *this; } // many more }; // operator{==, !=, <, <=, >, >=}

The complete implementation is available as a separate file.

Platform-dependent use: Mapped memory, IPC

We have been careful to write our offset pointer entirely in terms of standard-defined behaviour. In doing so we have produced something that is correct but perfectly useless. The true power of the offset pointer lies in its ability to address the same object from within different sets of memories. This is only possible if the numeric pointer values are meaningful as actual sequential addresses in a linear address space. More precisely, we require what we may call a linear addressing condition:

p + n == PTR(INT(p) + n * sizeof(*p))

Or expressed differently, after subtracting:

INT(p + n) - INT(p) == n * sizeof(*p)

This condition is satisfied by many popular architectures, at least for all objects within some common “segment”. This would for example include all native objects on x86-64, or objects located inside the same mapped memory segment in a Posix shared-memory setting.

It is sufficient that the linear addressing condition applies to only those fancy pointers and their pointees which are communicated across segment boundaries, such as the data members of an std::vector. By contrast, the condition may clearly be violated by temporary objects created when returning iterators of such vectors, and such temporary objects must therefore not be communicated across segment boundaries.

For example, suppose have a map m as in the introduction, which is located in some shared memory. Then we cannot communicate the address of m itself to a different process with our offset pointer. Instead, we need some shared-memory specific mechanism (presumably known to the concrete shared-memory allocator), such as a global shared memory identifier and an offset. The existence of operations that provide this information is as platform-dependent as the linear addressing condition, and presumably equivalent to it. In other words, if a platform offers a shared memory service, then it will probably have some kind of linear addressing that allows two processes to agree on the meaning of an address in the shared memory.

There will be a separate end-to-end example on designing a shared memory allocator.