D2848R0
std::is_uniqued

Draft Proposal,

Authors:
Audience:
EWG
Project:
ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++
Draft Revision:
3
Current Source:
github.com/Quuxplusone/draft/blob/gh-pages/d2848-std-is-uniqued.bs
Current:
rawgit.com/Quuxplusone/draft/gh-pages/d2848-std-is-uniqued.html

Abstract

The STL provides the triples (std::sort, std::is_sorted_until, std::is_sorted) and (std::make_heap, std::is_heap_until, std::is_heap) and (std::unique, std::adjacent_find, [blank]). We fill in the blank by proposing std::is_uniqued.

1. Changelog

2. Motivation and solution

The "classic STL" provides the following sets of related algorithms:

And these, too:

The missing algorithm is what would fill in the blank in the following invariant assertions:

std::set<K> s = { ... };
std::multiset<K> ms = { ... };
std::map<K, V> m = { ... };
std::multimap<K, V> mm = { ... };
std::unordered_set<K> us = { ... };
std::unordered_map<K, V> um = { ... };

template<class M, class P = M::value_type>
auto ValueEq(const M& m) {
    return [eq = m.key_eq()](const P& a, const P& b) {
        return eq(a.first, b.first);
    };
}

assert(std::is_sorted(s.begin(), s.end(), s.value_comp()));
assert(std::is_sorted(ms.begin(), ms.end(), ms.value_comp()));
assert(std::is_sorted(m.begin(), m.end(), m.value_comp()));
assert(std::is_sorted(mm.begin(), mm.end(), mm.value_comp()));
assert(std::is______ed(s.begin(), s.end(), std::not_fn(s.value_comp())));
assert(std::is______ed(m.begin(), m.end(), std::not_fn(m.value_comp())));
assert(std::is______ed(us.begin(), us.end(), us.key_eq()));
assert(std::is______ed(um.begin(), um.end(), ValueEq(um)));

We propose that this algorithm should exist, and should be spelled std::is_uniqued.

2.1. Prior art in other languages

The C++ STL verb std::unique is derived from Unix’s uniq filter, which filters adjacent matching lines from the input.

The Swift language provides a transformation named sorted (corresponding to C++'s std::sort), and a transformation named uniqued. The latter doesn’t quite correspond to the C++/Unix notion of "uniquing": it removes even non-adjacent duplicates, by putting all the elements into a hashset and then pulling them out again. Still, this is prior art for the idea of treating "unique, uniqued, uniquing" as a verb.

let animals = ["dog", "pig", "cat", "ox", "dog", "cat"]
let u = Array(animals.uniqued())
// 'u' is now ["dog", "pig", "cat", "ox"]

2.2. Should we require pred to be an equivalence relation?

Currently, std::unique has undefined behavior if you pass it a predicate that is not an equivalence relation. Fortunately, std::adjacent_find does not.

For example, the author of std::flat_set might write:

container_type c;
key_compare compare;
std::sort(c.begin(), c.end(), compare);
c.erase(std::unique(c.begin(), c.end(), std::not_fn(compare)), c.end()); // UB
assert(std::is_sorted(c.begin(), c.end(), compare));
assert(std::adjacent_find(c.begin(), c.end(), std::not_fn(compare)) == c.end()); // OK

Both before and after this proposal, the line marked "UB" has undefined behavior (although it will generally work in practice). We don’t propose to change the precondition of std::unique. The line marked "OK" has well-defined behavior. we don’t propose to change the precondition of std::adjacent_find either.

After this proposal, the following shorter line will be equivalent to the line marked "OK":

assert(std::is_uniqued(c.begin(), c.end(), std::not_fn(compare))); // OK

That is, the new algorithm std::is_uniqued will have the same precondition as the existing std::adjacent_find algorithm (in terms of which it is defined). This differs from the precondition of std::unique, which is a little surprising, but we believe it’s the correct choice.

3. Implementation experience

Arthur has implemented is_uniqued and ranges::is_uniqued in his fork of libc++; see commit 490536e. This implementation is available on Godbolt Compiler Explorer.

4. Proposed wording relative to C++23

Add a new section after [alg.adjacent.find].

Note: This wording is copied straight from [is.sorted], with these mechanical replacements: is_sorted becomes is_uniqued, is_sorted_until becomes adjacent_find, Compare comp becomes BinaryPredicate pred (cf. [alg.adjacent.find]), indirect_strict_weak_order becomes indirect_binary_predicate (ditto), and ranges::less becomes ranges::equal_to (ditto).

4.1. is_uniqued [alg.is.uniqued]

template<class ForwardIterator>
  constexpr bool is_uniqued(ForwardIterator first, ForwardIterator last);

Effects: Equivalent to: return adjacent_find(first, last) == last;

template<class ExecutionPolicy, class ForwardIterator>
  bool is_uniqued(ExecutionPolicy&& exec,
                  ForwardIterator first, ForwardIterator last);

Effects: Equivalent to: return adjacent_find(std::forward<ExecutionPolicy>(exec), first, last) == last;

template<class ForwardIterator, class BinaryPredicate>
  constexpr bool is_uniqued(ForwardIterator first, ForwardIterator last,
                            BinaryPredicate pred);

Effects: Equivalent to: return adjacent_find(first, last, pred) == last;

template<class ExecutionPolicy, class ForwardIterator, class BinaryPredicate>
  bool is_uniqued(ExecutionPolicy&& exec,
                  ForwardIterator first, ForwardIterator last,
                  BinaryPredicate pred);

Effects: Equivalent to: return adjacent_find(std::forward<ExecutionPolicy>(exec), first, last, pred) == last;

template<forward_iterator I, sentinel_for<I> S, class Proj = identity,
         indirect_binary_predicate<projected<I, Proj>,
                                   projected<I, Proj>> Pred = ranges::equal_to>
  constexpr bool ranges::is_uniqued(I first, S last, Pred pred = {}, Proj proj = {});
template<forward_range R, class Proj = identity,
         indirect_binary_predicate<projected<iterator_t<R>, Proj>,
                                   projected<iterator_t<R>, Proj>> Pred = ranges::equal_to>
  constexpr bool ranges::is_uniqued(R&& r, Pred pred = {}, Proj proj = {});

Effects: Equivalent to: return ranges::adjacent_find(first, last, pred, proj) == last;