Subject: [Boost-bugs] [Boost C++ Libraries] #12919: multi_index indexed with hashed_non_unique produces buffer overrun.
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2017-03-20 22:57:58
#12919: multi_index indexed with hashed_non_unique produces buffer overrun.
---------------------------------------+-------------------------
Reporter: Alex Huszagh <ahuszagh@â¦> | Owner: joaquin
Type: Bugs | Status: new
Milestone: To Be Determined | Component: multi_index
Version: Boost 1.63.0 | Severity: Problem
Keywords: multi_index |
---------------------------------------+-------------------------
Using multi_index_container on a struct indexing using hashed_non_unique
(hashing a std::string) can cause a buffer overrun in
unchecked_rehash(size_type,hashed_non_unique_tag), typically culminating
in a segfault.
The segment below, taken from lines 1351-1372 of
boost/multi_index/hash_index.hpp, can iterate between two different nodes,
causing `x==end_` to never occur, causing a auto_space to be overrun,
eventually leading to a segmentation fault.
{{{
#!div style="font-size: 80%"
Code highlighting:
{{{#!cpp
auto_space<
std::size_t,allocator_type> hashes(get_allocator(),size());
auto_space<
node_impl_pointer,allocator_type>
node_ptrs(get_allocator(),size());
std::size_t i=0;
bool within_bucket=false;
BOOST_TRY{
for(;;++i){
node_impl_pointer x=end_->prior();
if(x==end_)break;
/* only this can possibly throw */
std::size_t h=hash_(key(node_type::from_impl(x)->value()));
hashes.data()[i]=h;
node_ptrs.data()[i]=x;
std::pair<node_impl_pointer,bool> p=
node_alg::unlink_last_group(end_);
node_alg::link_range(
p.first,x,buckets_cpy.at(buckets_cpy.position(h)),cpy_end);
within_bucket=!(p.second);
}
}
}}}
}}}
The solution is quite simple, and is shown in the other implementation of
unchecked_rehash(size_type,hashed_unique_tag), by ensuring the rehash does
not exceed the number of elements. The proper implementation (possibly
suboptimal) would look something like this:
{{{
#!div style="font-size: 80%"
Code highlighting:
{{{#!cpp
auto_space<
std::size_t,allocator_type> hashes(get_allocator(),size());
auto_space<
node_impl_pointer,allocator_type>
node_ptrs(get_allocator(),size());
std::size_t i=0, size_=size();
bool within_bucket=false;
BOOST_TRY{
for(;i!=size_;++i){
node_impl_pointer x=end_->prior();
if(x==end_)break;
/* only this can possibly throw */
std::size_t h=hash_(key(node_type::from_impl(x)->value()));
hashes.data()[i]=h;
node_ptrs.data()[i]=x;
std::pair<node_impl_pointer,bool> p=
node_alg::unlink_last_group(end_);
node_alg::link_range(
p.first,x,buckets_cpy.at(buckets_cpy.position(h)),cpy_end);
within_bucket=!(p.second);
}
}
}}}
}}}
As the data leading to this segfault is private, I currently do not have a
minimal, verified example to reproduce the bug, however, I believe this is
a rational explanation, with a straight-forward fix.
I've seen examples where an ~800 element buffer will continue until i >
15,000, culminating in a segfault. I will submit a pull request with the
above diff.
Thanks,
Alex
-- Ticket URL: <https://svn.boost.org/trac/boost/ticket/12919> Boost C++ Libraries <http://www.boost.org/> Boost provides free peer-reviewed portable C++ source libraries.
This archive was generated by hypermail 2.1.7 : 2017-03-20 23:01:14 UTC