Chapter 10. Containers, Iterators, and Algorithms

Containers (sometimes called collections) are a staple of computer programming. Every major programming language has fundamental containers, such as arrays or lists. Modern programming languages usually have an assortment of more powerful containers, such as trees, for more specialized needs.

The C++ library has a basic suite of containers (deques, lists, maps, sets, and vectors), but more important, it has generic algorithms, which are function templates that implement common algorithms, such as searching and sorting. The algorithms operate on iterators, which are an abstraction of pointers that apply to any container or other sequence.

This chapter presents an overview of the standard C++ containers, the iterators used to examine the containers, and the algorithms that can be used with the iterators. It covers how to use the standard containers, iterators, and algorithms, as well as how to write your own.

In this chapter, the mathematical notation of [first, last) is often used to denote a range. The square bracket marks an inclusive endpoint of a range, and the parenthesis marks an exclusive endpoint of a range. Thus, [first, last) means a range that extends from first to last, including first, but excluding last. Read more about ranges in "Iterators" later in this chapter.

For complete details about the various containers, iterators, and algorithms, see Chapter 13.

The fundamental purpose of a container is to store multiple objects in a single container object. Different kinds of containers have different characteristics, such as speed, size, and ease of use. The choice of container depends on the characteristics and behavior you require.

In C++, the containers are implemented as class templates, so you can store anything in a container. (Well, almost anything. The type must have value semantics, which means it must behave as an ordinary value, such as an int. Values can be copied and assigned freely. An original and its copy must compare as equal. Some containers impose additional restrictions.)

The standard containers fall into two categories: sequence and associative containers. A sequence container preserves the original order in which items were added to the container. An associative container keeps items in ascending order (you can define the order relation) to speed up searching. The standard containers are:

The set and map containers perform insertions, deletions, and searches in logarithmic time, which implies a tree or tree-like implementation. Items must be kept in sorted order, so a hash table implementation is not allowed. Many programmers consider the lack of a standard hash table to be a serious omission. When the C++ standard is revised, a hash table is likely to be added. The STLport project includes hash table-based containers. See Appendix B for information about STLport.

This section presents the rules that govern containers. Some rules apply to all the standard containers, and you can rely on the standard behavior for all C++ implementations. Other rules apply only to sequence containers, and some apply only to associative containers. If you write your own container class template, be sure to follow the same conventions and rules that apply to the standard containers.

Most of the standard member functions have a complexity that is constant or linear in the number of elements in the container. Some of the member functions for associative members are logarithmic in the number of elements in the container. Each of the descriptions in this section notes the complexity of the function.

A container template should have the following constructors. You do not need to write separate constructors in all cases; sometimes you can use default arguments instead of overloading. If an allocator object is supplied, it is copied to the container; otherwise, a default allocator is constructed. In each of the following descriptions, container is the name of the container class template.

A sequence container should have the following additional constructors:

An associative container should have the following additional constructors:

All containers must have a destructor:

Many member functions are the same for all the container templates, and when you write your own container template, you should implement the same member functions with the same behaviors. Some are specific to sequence containers, and some to associative containers. Each container template can define additional members.

iterator begin ( ), const_iterator begin ( ) const

Returns an iterator that points to the first item of the container. Complexity is constant.

void clear ( )

Erases all the items in the container. Complexity is linear.

bool empty ( ) const

Returns true if the container is empty (size( ) == 0). Complexity is constant.

iterator end ( ), const_iterator end ( ) const

Returns an iterator that points to one past the last item of the container. Complexity is constant. (See Section 10.2 later in this chapter for a discussion of what "one past the last item" means.)

erase (iterator p)

Erases the item that p points to. For a sequence container, erase returns an iterator that points to the item that comes immediately after the deleted item or end( ). Complexity depends on the container.

For an associative container, erase does not return a value. Complexity is constant (amortized over many calls).

erase (iterator first, last)

Erases all the items in the range [first, last). For a sequence container, erase returns an iterator that points to the item that comes immediately after the last deleted item or end( ). Complexity depends on the container.

For an associative container, erase does not return a value. Complexity is logarithmic, plus last - first.

size_type max_size ( ) const

Returns the largest number of items the container can possibly hold. Although many containers are not constrained, except by available memory and the limits of size_type, other container types, such as an array type, might have a fixed maximum size. Complexity is usually constant.

container& operator= (const container& that)

Erases all items in this container and copies all the items from that. Complexity is linear in size( ) + that.size( ).

size_type size ( ) const

Returns the number of items in the container. Complexity is usually constant.

void swap (const container& that)

Swaps the elements of this container with that. An associative container also swaps the comparison function or functions. Complexity is usually constant.

Each container should have all of its equality and relational operators defined, either as member functions or, preferably, as functions at the namespace level. Namespace-level functions offer more flexibility than member functions. For example, the compiler can use implicit type conversions on the lefthand operand but only if the function is not a member function.

A container that supports bidirectional iterators should define rbegin and rend member functions to return reverse iterators.

The following functions are optional. The standard containers provide only those functions that have constant complexity.

A sequence container should define the following member functions. The complexity of each depends on the container type.

An associative container should define the following member functions. In the descriptions of the complexity, N refers to the number of elements in the container, M refers to the number of elements in the argument range (e.g., last - first), and count is the value that the function returns. Some of these member functions seem to duplicate standard algorithms (as discussed in Section 10.3 later in this chapter), but the associative containers can implement them with better performance than the generic algorithms.

size_type count (const key_type& k) const

Returns the number of items equivalent to k. Complexity is log N + count.

pair<const_iterator,const_iterator>, equal_range (const key_type& k) const, pair<iterator,iterator> equal_range (const key_type& k)

Returns the equivalent of make_pair(lower_bound(k), upper_bound(k)). Complexity is log N.

size_type erase (const key_type& k)

Erases all the items equivalent to k. Returns the number of items erased. Complexity is log N + count.

const_iterator find (const key_type& k) const, iterator find (const key_type& k)

Finds an item equivalent to k and returns an iterator that points to one such item, or end( ) if not found. Complexity is log N.

insert (const value_type& x)

Inserts x. If the container permits duplicate keys, insert returns an iterator that points to the newly inserted item. If the container requires unique keys, insert returns pair<iterator,bool>, in which the first element of the pair is an iterator that points to an item equivalent to x, and the second element is true if x was inserted or false if x was already present in the container. Complexity is log N.

iterator insert (iterator p, const value_type& x)

Inserts x and returns an iterator that points to x. The iterator p hints to where x might belong. Complexity is log N in general, but is constant (amortized over many calls) if the hint is correct, that is, if x is inserted immediately after p.

template<InIter>, void insert (InIter first, InIter last)

Copies the items from [first, last) and inserts each item in the container. Complexity is M log(N + M), but is linear if the range is already sorted.

key_compare key_comp ( ) const

Returns the key compare function or functor. Complexity is constant.

const_iterator lower_bound (const key_type& k) const, iterator lower_bound (const key_type& k)

Returns an iterator that points to the first item in the container that does not come before k. That is, if k is in the container, the iterator points to the position of its first occurrence; otherwise, the iterator points to the first position where k should be inserted. Complexity is log N.

value_compare value_comp ( ) const

Returns the value compare function or functor. Complexity is constant.

const_iterator upper_bound (const key_type& k) const, iterator upper_bound (const key_type& k)

Returns an iterator that points to the first item in the container that comes after all occurrences of k. Complexity is log N.

The standard containers are designed to be robust in the face of exceptions. The exceptions that the containers themselves can throw are well-defined (for example, at might throw out_of_range), and most member functions do not throw any exceptions of their own.

If a single value is being added to a container (by calling insert, push_front, or push_back), and an exception is thrown, the container remains in a valid state without adding the value to the container.

When inserting more than one value, different containers have different behaviors. A list, for example, ensures that all items are inserted or none, that is, if an exception is thrown, the list is unchanged. A map or set, however, ensures only that each individual item is inserted successfully. If an exception is thrown after inserting some of the items from a range, the destination container retains the elements that had been inserted successfully.

The erase, pop_back, and pop_front functions never throw exceptions.

The swap function throws an exception only if an associative container's Compare object's copy constructor or assignment operator throws an exception.

Example 10-1 shows an slist container, which implements a singly-linked list. A singly-linked list requires slightly less memory than a doubly-linked list but offers, at best, a forward iterator, not a bidirectional iterator.

Example 10-1. Implementing a custom container: a singly-linked list

// Simple container for singly-linked lists.
template<typename T, typename Alloc = std::allocator<T> >
class slist {
  // Private type for a link (node) in the list.
  template<typename U>
  struct link {
    link* next;
    U value;
  };
  typedef link<T> link_type;

public:
  typedef typename Alloc::reference reference;
  typedef typename Alloc::const_reference const_reference;
  typedef typename Alloc::pointer pointer;
  typedef typename Alloc::const_pointer const_pointer;
  typedef Alloc allocator_type;
  typedef T value_type;
  typedef size_t size_type;
  typedef ptrdiff_t difference_type;

  class iterator;       // SeeSection 10.2 later in this chapter for 
  class const_iterator; // the iterators.

  slist(const slist& that);
  slist(const Alloc& alloc = Alloc(  ));
  slist(size_type n, const T& x, const Alloc& alloc=Alloc(  ));
  template<typename InputIter>
  slist(InputIter first, InputIter last,
        const Alloc& alloc = Alloc(  ));
  ~slist(  )                             { clear(  ); }

  slist& operator=(const slist& that);
  allocator_type get_allocator(  ) const { return alloc_; }

  iterator begin(  )           { return iterator(0, head_); }
  const_iterator begin(  ) const
    { return const_iterator(0, head_); }
  iterator end(  )             { return iterator(0, 0); }
  const_iterator end(  ) const { return const_iterator(0, 0); }

  void pop_front(  )                  { erase(begin(  )); }
  void push_front(const T& x)         { insert(begin(  ), x); }
  T front(  )                   const { return head_->value; }
  T& front(  )                        { return head_->value; }

  iterator insert(iterator p, const T& x);
  void insert(iterator p, size_type n, const T& x);
  template<typename InputIter>
  void insert(iterator p, InputIter first, InputIter last);

  iterator erase(iterator p);
  iterator erase(iterator first, iterator last);

  void clear(  )               { erase(begin(  ), end(  )); }
  bool empty(  )         const { return size(  ) == 0; }
  size_type max_size(  ) const
    { return std::numeric_limits<difference_type>::max(  ); }
  void resize(size_type sz, const T& x = T(  ));
  size_type size(  )     const { return count_; }
  void swap(slist& that);

private:
  typedef typename
    allocator_type::template rebind<link_type>::other
    link_allocator_type;

  link_type* newitem(const T& x, link_type* next = 0);
  void delitem(link_type* item);

  template<typename InputIter>
  void construct(InputIter first, InputIter last,
                 is_integer_tag);

  template<typename InputIter>
  void construct(InputIter first, InputIter last,
                 is_not_integer_tag);

  link_type* head_;
  link_type* tail_;
  size_t count_;
  allocator_type alloc_;
  link_allocator_type linkalloc_;
};

// Constructor. If InputIter is an integral type, the standard requires the
// constructor to interpret first and last as a count and value, and perform the
// slist(size_type, T) constructor. Use the is_integer trait to dispatch to the
// appropriate construct function, which does the real work.
template<typename T, typename A>
template<typename InputIter>
slist<T,A>::slist(InputIter first, InputIter last,
                  const A& alloc)
: alloc_(alloc), linkalloc_(link_allocator_type(  )),
  head_(0), tail_(0), count_(0)
{
  construct(first, last, is_integer<InputIter>::tag(  ));
}

template<typename T, typename A>
template<typename InputIter>
void slist<T,A>::construct(InputIter first, InputIter last,
                           is_integer_tag)
{
  insert(begin(  ), static_cast<size_type>(first),
         static_cast<T>(last));
}

template<typename T, typename A>
template<typename InputIter>
void slist<T,A>::construct(InputIter first, InputIter last,
                           is_not_integer_tag)
{
  insert(begin(  ), first, last);
}

// Private function to allocate a new link node.
template<typename T, typename A>
typename slist<T,A>::link_type*
   slist<T,A>::newitem(const T& x, link_type* next)
{
  link_type* item = linkalloc_.allocate(1);
  item->next = next;
  alloc_.construct(&item->value, x);
  return item;
}

// Private function to release a link node.
template<typename T, typename A>
void slist<T,A>::delitem(link_type* item)
{
  alloc_.destroy(&item->value);
  linkalloc_.deallocate(item, 1);
}

// Basic insertion function. All insertions eventually find their way here.
// Inserting at the head of the list (p == begin(  )) must set the head_ member.
// Inserting at the end of the list (p == end(  )) means appending to the list,
// which updates the tail_'s next member, and then sets tail_. Anywhere else in
// the list requires updating p.prev_->next. Note that inserting into an empty
// list looks like inserting at end(  ). Return an iterator that points to the
// newly inserted node.
template<typename T, typename A>
typename slist<T,A>::iterator
  slist<T,A>::insert(iterator p, const T& x)
{
  // Allocate the new link before changing any pointers. If newitem throws an
  // exception, the list is not affected.
  link_type* item = newitem(x, p.node_);
  if (p.node_ == 0) {
    p.prev_ = tail_;
    // At end
    if (tail_ == 0)
      head_ = tail_ = item; // Empty list
    else {
      tail_->next = item;
      tail_ = item;
    }
  }
  else if (p.prev_ == 0)
    head_ = item;          // New head of list
  else
    p.prev_->next = item;
  p.node_ = item;
  ++count_;
  return p;
}

// Erase the item at p. All erasures come here eventually. If erasing begin(  ),
// update head_. If erasing the last item in the list, update tail_. Update the
// iterator to point to the node after the one being deleted.
template<typename T, typename A>
typename slist<T,A>::iterator slist<T,A>::erase(iterator p)
{
  link_type* item = p.node_;
  p.node_ = item->next;
  if (p.prev_ == 0)
    head_ = item->next;
  else
    p.prev_->next = item->next;
  if (item->next == 0)
    tail_ = p.prev_;
  --count_;
  delitem(item);
  return p;
}

// Comparison functions are straightforward.
template<typename T>
bool operator==(const slist<T>& a, const slist<T>& b)
{
  return a.size(  ) == b.size(  ) &&
         std::equal(a.begin(), a.end(  ), b.begin(  ));
}

A container holds stuff. Naturally, you need to know how to add stuff to a container, remove stuff from a container, find stuff in a container, and so on.

To remove an item from a container, call an erase member function. All containers have at least two erase functions: one that takes a single iterator to delete the item that the iterator points to, and another two iterators to delete every item in the range. Associative containers also have an erase function that takes a value as an argument to erase all matching items.

The standard containers are designed to be used with the standard iterators (see Section 10.2 later in this chapter) and standard algorithms (see Section 10.3 later in this chapter). The standard algorithms offer much more functionality than the containers' member functions, but they also have limitations. In particular, the standard algorithms cannot insert or erase items. For example, among the standard algorithms are remove and remove_if. Their names are suggestive but misleading. They do not remove anything from the container. Instead, they rearrange the elements of the container so that the items to retain are at the beginning. They return an iterator that points to the first item to be erased. Call erase with this iterator as the first argument and end( ) as the second to erase the items from the container. This two-step process is needed because an iterator cannot erase anything. The only way to erase an item from a container is to call a member function of the container, and the standard algorithms do not have access to the containers, only to iterators. Example 10-2 shows how to implement a generic erase function that calls remove and then the erase member function.

The standard algorithms provide several ways to search for items in a container: adjacent_find, find, find_end, find_first_of, find_if, search, and search_n. These algorithms essentially perform a linear search of a range. If you know exactly which item you want, you can search an associative container much faster by calling the find member function. For example, suppose you want to write a generic function, contains, that tells you whether a container contains at least one instance of an item. Example 10-3 shows one way, which relies on find, to implement this function.

Example 10-3. Determining whether a container contains an item

// Need a type trait to tell us which containers are associative and which are
// not (seeChapter 8).
struct associative_container_tag {};
struct sequence_container_tag {};

template<typename C>
struct is_associative
{};

template<typename T, typename A>
struct is_associative<std::list<T,A> >
{
  typedef sequence_container_tag tag;
};
// Ditto for vector and deque

template<typename T, typename C, typename A>
struct is_associative<std::set<T,C,A> >
{
  typedef associative_container_tag tag;
};
// Ditto for multiset, map, and multimap

template<typename C, typename T>
inline bool do_contains(const C& c, const T& item,
                        associative_container_tag)
{
  return c.end(  ) != c.find(item);
}

template<typename C, typename T>
inline bool do_contains(const C& c, const T& item,
                        sequence_container_tag)
{
  return c.end() != std::find(c.begin(  ), c.end(  ), item);
}

// Here is the actual contains function. It dispatches to do_contains, picking
// the appropriate overloaded function depending on the type of the container c.
template<typename C, typename T>
bool contains(const C& c, const T& item)
{
  return do_contains(c, item, is_associative<C>::tag(  ));
}

As you can see, iterators are important for using containers. You need them to insert at a specific position, identify an item for erasure, or specify ranges for algorithms. The next section discusses iterators in more depth.