[Boost-bugs] [Boost C++ Libraries] #12919: multi_index indexed with hashed_non_unique produces buffer overrun.

Subject: [Boost-bugs] [Boost C++ Libraries] #12919: multi_index indexed with hashed_non_unique produces buffer overrun.
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2017-03-20 22:57:58


#12919: multi_index indexed with hashed_non_unique produces buffer overrun.
---------------------------------------+-------------------------
 Reporter: Alex Huszagh <ahuszagh@…> | Owner: joaquin
     Type: Bugs | Status: new
Milestone: To Be Determined | Component: multi_index
  Version: Boost 1.63.0 | Severity: Problem
 Keywords: multi_index |
---------------------------------------+-------------------------
 Using multi_index_container on a struct indexing using hashed_non_unique
 (hashing a std::string) can cause a buffer overrun in
 unchecked_rehash(size_type,hashed_non_unique_tag), typically culminating
 in a segfault.

 The segment below, taken from lines 1351-1372 of
 boost/multi_index/hash_index.hpp, can iterate between two different nodes,
 causing `x==end_` to never occur, causing a auto_space to be overrun,
 eventually leading to a segmentation fault.

 {{{
 #!div style="font-size: 80%"
 Code highlighting:
   {{{#!cpp
       auto_space<
         std::size_t,allocator_type> hashes(get_allocator(),size());
       auto_space<
         node_impl_pointer,allocator_type>
 node_ptrs(get_allocator(),size());
       std::size_t i=0;
       bool within_bucket=false;
        BOOST_TRY{
         for(;;++i){
           node_impl_pointer x=end_->prior();
           if(x==end_)break;

           /* only this can possibly throw */
           std::size_t h=hash_(key(node_type::from_impl(x)->value()));

           hashes.data()[i]=h;
           node_ptrs.data()[i]=x;
           std::pair<node_impl_pointer,bool> p=
             node_alg::unlink_last_group(end_);
           node_alg::link_range(
             p.first,x,buckets_cpy.at(buckets_cpy.position(h)),cpy_end);
           within_bucket=!(p.second);
         }
       }
    }}}
 }}}

 The solution is quite simple, and is shown in the other implementation of
 unchecked_rehash(size_type,hashed_unique_tag), by ensuring the rehash does
 not exceed the number of elements. The proper implementation (possibly
 suboptimal) would look something like this:

 {{{
 #!div style="font-size: 80%"
 Code highlighting:
   {{{#!cpp
       auto_space<
         std::size_t,allocator_type> hashes(get_allocator(),size());
       auto_space<
         node_impl_pointer,allocator_type>
 node_ptrs(get_allocator(),size());
       std::size_t i=0, size_=size();
       bool within_bucket=false;
        BOOST_TRY{
         for(;i!=size_;++i){
           node_impl_pointer x=end_->prior();
           if(x==end_)break;

           /* only this can possibly throw */
           std::size_t h=hash_(key(node_type::from_impl(x)->value()));

           hashes.data()[i]=h;
           node_ptrs.data()[i]=x;
           std::pair<node_impl_pointer,bool> p=
             node_alg::unlink_last_group(end_);
           node_alg::link_range(
             p.first,x,buckets_cpy.at(buckets_cpy.position(h)),cpy_end);
           within_bucket=!(p.second);
         }
       }
    }}}
 }}}

 As the data leading to this segfault is private, I currently do not have a
 minimal, verified example to reproduce the bug, however, I believe this is
 a rational explanation, with a straight-forward fix.

 I've seen examples where an ~800 element buffer will continue until i >
 15,000, culminating in a segfault. I will submit a pull request with the
 above diff.

 Thanks,
 Alex

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/12919>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.

This archive was generated by hypermail 2.1.7 : 2017-03-20 23:01:14 UTC