65.9K
CodeProject is changing. Read more.
Home

An STL compliant sorted vector

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.82/5 (9 votes)

Nov 21, 2002

7 min read

viewsIcon

150889

downloadIcon

1159

A template container which implements set/multiset functionality using a vector

Introduction

sorted_vector adapts a std::vector to the interface required by std::set/std::multiset, thereby providing set/multiset and vector functionality (random access) in one container.

Tests show that sorted_vector's element retrieval (find) speed outperforms that of std::set/std::multiset by a factor of 2 (on most machines). On the downward side is the poor performance of sorted_vector's insert member function, because it inserts an element somewhere in the middle of the sorted vector in order to preserve the sort order. By using sorted_vector's low level interface one can solve this performance bottleneck by inserting a bunch of objects using a series of push_back's followed by a call to the member function sort, which restores the sort order.

sorted_vector should be preferred over std::set/std::multiset if many small elements must be stored. Most STL implementations use a variant of a balanced tree to implement set and multiset, which imposes a per element storage overhead of 3 pointers. The most important reason to use a std::set/std::multiset instead of sorted_vector is the need to keep iterators into a set/multiset while inserting more elements into the set. These iterators remain valid in the case of a set/multiset, but are invalidated in the case of a sorted_vector container.

Namespace

sorted_vector resides in the namespace codeproject.

Basic Usage

The following small table shows corresponding declarations of sorted_vector's and set's/multiset's.

STL concept std library sorted_vector
set set<Key,Pred> sorted_vector<Key,true,Pred>
multiset mutliset<Key,Pred> sorted_vector<Key,Pred>

(sorted_vector<Key,Pred,true> means sorted_vector<Key,Pred,bNoDuplicates=true>)

The various set/multiset insert and erase member functions as well as the set/multiset functions count, lower_bound,upper_bound, equal_range are part of sorted_vector's interface and behave the same as there set/multiset counterparts. The following code snippet shows the use of a sorted_vector to hold the set intersection of the content of two std::vector's.

#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t             //build intersection using set interface of sorted_vector
BuildIntersection(  const std::vector<int>& v0,const std::vector<int>& v1,
                    codeproject::sorted_vector<int,true>& svecIntersection)
{		
    codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
    for(int i=0;i<v1.size();i++){
        if( svec.find(v1[i])!=svec.end() ){
            svecIntersection.insert(v1[i]);
        }
    }
    return svecIntersection.size();
}

The code example shows the use of the member functions find and insert. If you replace codeproject::sorted_vector<int,true> by std::set<int> you get exactly the same result. This piece of code can be optimized for speed by replacing the insert calls in the loop by calls to push_back of the base container (a vector) followed by a call to the member function sort at the end of the loop.

#pragma warning(disable:4786)
#include "sorted_vector.h"
size_t             //same as previous example, optimized insertions 
BuildIntersection1(  const std::vector<int>& v0,const std::vector<int>& v1,
                    codeproject::sorted_vector<int,true>& svecIntersection)
{		
    codeproject::sorted_vector<int,true> svec(v0.begin(),v0.end());
    codeproject::sorted_vector<int,true>::Cont& vInterSect 
        = svecIntersection.get_container();
    for(int i=0;i<v1.size();i++){
        if( svec.find(v1[i])!=svec.end() ){
            vInterSect.push_back(v1[i]);
        }
    }
    svecIntersection.sort();
    return svecIntersection.size();
}

Interface of sorted_vector

Member Coming from Description
Cont sorted_vector container type, the type of the container used to store the controlled sequence.
value_type vector The type of object, T, stored in the set.
key_type set/multiset The key type associated with value_type.
key_compare set/multiset Function object that compares two keys for ordering.
value_compare set/multiset Function object that compares two values for ordering.
pointer vector Pointer to T.
reference vector Reference to T
const_reference vector Const reference to T
size_type vector An unsigned integral type.
difference_type vector A signed integral type.
iterator vector Random access iterator used to iterate through a vector.
const_iterator vector Random access const iterator used to iterate through a vector
reverse_iterator vector Random access iterator used to iterate backwards through a vector.
const_reverse_iterator vector Random access const iterator used to iterate backwards through a vector.
iterator begin() const vector Returns an iterator pointing to the beginning of the sorted_vector.
iterator end() const vector Returns an iterator pointing to the end of the sorted_vector.
reverse_iterator rbegin() const vector Returns a reverse_iterator pointing to the beginning of the reversed sorted_vector
reverse_iterator rend() const vector Returns a reverse_iterator pointing to the end of the reversed sorted_vector.
size_type size() const vector Returns the size of the sorted_vector.
size_type max_size() const vector Returns the largest possible size of the sorted_vector.
bool empty() const vector true if the sorted_vec's size is 0.
key_compare key_comp() const set/multiset Returns the key_compare object used by the sorted_vector.
value_compare value_comp() const set/multiset Returns the value_compare object used by the sorted_vector.
sorted_vector() set/multiset Creates an empty sorted_vector.
sorted_vector(const key_compare& comp) set/multiset Creates an empty sorted_vector, using comp as the key_compare object.
VC++7.0:
template <class InputIterator>
sorted_vector(InputIterator f,
InputIterator l)

VC++6.0:
sorted_vector(const_iterator f, const_iterator l)
set/multiset Creates a sorted_vector with a copy of a range.
VC++7.0:
template <class InputIterator>
sorted_vector(InputIterator f,
InputIterator l,const key_compare& comp)

VC++6.0:
sorted_vector(const_iterator f, const_iterator l,const key_compare& comp)
set/multiset Creates a sorted_vector with a copy of a range, using comp as the key_compare object.
sorted_vector(const sorted_vector&) set/multiset The copy constructor.
sorted_vector& operator=(const sorted_vector&) set/multiset The assignment operator (assigns other sorted_vector )
sorted_vector& operator=(const Cont&) sorted_vector The assignment operator (assigns other vector<T> )
void swap(sorted_vector&) set/multiset Swaps the contents of two sorted_vector's
pair<iterator, bool>insert(const value_type& x) set/multiset Inserts x into the sorted_vector.
iterator insert(iterator pos, const value_type& x) set/multiset Inserts x into the sorted_vector, using pos as a hint to where it will be inserted.
VC++7.0:
template <class InputIterator>
void insert(InputIterator f, InputIterator l)

VC++6.0:
void insert(const_iterator first f, const_iterator l)
set/multiset Inserts a range into the sorted_vector.
void erase(iterator pos) vector Erases the element pointed to by pos.
size_type erase(const key_type& k) set/multiset Erases the element whose key is k.
void erase(iterator first, iterator last) vector Erases all elements in a range.
void clear() vector Erases all of the elements.
iterator
find(const key_type& k) const
set/multiset Finds an element whose key is k.
size_type
count(const key_type& k) const
set/multiset Counts the number of elements whose key is k.
iterator
lower_bound(const key_type& k) const
set/multiset Finds the first element whose key is not less than k.
iterator
upper_bound(const key_type& k) const
set/multiset Finds the first element whose key is greater than k.
pair<iterator, iterator>
equal_range(const key_type& k) const
set/multiset Finds a range containing all elements whose key is k.
bool operator==(const sorted_vector&,const sorted_vector&) vector Tests two sorted_vector for equality. This is a global function, not a member function. There is also an operator !=
bool operator<(const sorted_vector&, const sorted_vector&) vector Lexicographical comparison. This is a global function, not a member function. There are also operators <= >= and >
Cont& get_container() sorted_vector Returns a reference to the internal vector used to store the controlled sequence
void sort() sorted_vector Restores the sort order using key_compare
reference at(size_type i) sorted_vector Returns a reference to the element at *(begin()+i) with range check
const_reference at(size_type i) const sorted_vector Returns a reference to the element at *(begin()+i) with range check
const_reference operator[](size_type i) const sorted_vector Returns a reference to the element at *(begin()+i)
reference operator[](size_type i) sorted_vector Returns a reference to the element at *(begin()+i)
const_reference front() const sorted_vector Returns a reference to the first element
reference front() sorted_vector Returns a reference to the first element
reference back() sorted_vector Returns a reference to the last element
const_reference back() const sorted_vector Returns a reference to the last element
void pop_back() sorted_vector Removes the last element from the sorted_vector

Points of Interest

An interview with Alexander Stepanov (inventor of the STL) at http://www.stlport.org/resources/StepanovUSA.html , in which he proposed to implement a sorted container as an adapter of a std::container gave me the idea for this work. The real surprise to me was the unexpected good performance of sorted_vector compared to set/multiset. The outcome of my tests indicate, that the std::set/std::multiset has only one (important) advantage over a sorted vector, namely that iterators remain valid in a set/multiset, even when more elements are added.

The implementation work itself was easy and did not require any advanced C++/STL feature. It can be summarized as follows: Most of the member functions of sorted_vector forward the call to the vec_ data member, which is a std::vector. The set/multiset specific functions for inserting/erasing and locating elements mostly use STL algorithms to deal with the sorted sequence present in the vec_ data member. The low level interface consisting of the member functions get_container() (which returns a reference to the vec_ data member) and sort and stable_sort (which restore the sort order) are necessary to improve insertion performance (by allowing a temporary violation of the sorting order). Only one function was unexpectedly difficult to implement: The function unique, which must be called after sort and stable_sort in the case of a set. This function is part of the STL and requires a predicate as third argument. This predicate must return true, if the passed elements are equal. But the class sorted_vector only has access to a predicate which evaluates the < relation. In theory, it should be possible, to transform a predicate evaluating the < relation into another predicate evaluating ==, but in practice this is only possible, when the < predicate is implemented as object derived from std::binary_function. Ultimately, I had to implement unique myself because of that.

History

  • 1.0: November 19th, 2002; Initial release.
  • 1.1: November 28th, 2002; Documentation and code Update
    • changed namespace from std to codeproject
    • supports member templates for constructing/inserting from iterator range in case of VC++7.0