Boost logo

Boost Users :

Subject: [Boost-users] Boost.Functional/Hash
From: Andrej van der Zee (andrejvanderzee_at_[hidden])
Date: 2010-09-02 05:06:50


Hi,

I have a question that I hope can be answered by using Boost. I have
the following datastructure:

struct ip_info
{
  const u_int8_t _ifile_index;
  const unsigned long _src_ip;
  const unsigned long _dst_ip;
  const u_int16_t _id;
  const u_int16_t _length;
  const u_int16_t _src_port;
  const u_int16_t _dst_port;
  const u_int32_t _seq;
  const u_int32_t _ack_seq;
};

The object is identified by all its members. Now I want to count
duplicates. But I have MANY of these objects, I mean billions. I have
to reduce the memory footprint of my program, because it soon grows
over 4GB and boom! So I rather not keep all the objects in memory and
why would I, because I do not need them. I just need to count
duplicates.

My idea was to hash such objects to a smaller values and use that hash
value as key_type in a map. The value_type of the map would be an
integer. This something like std::map<int32_t, u_int16_t> or
something.

Then I though maybe use Boost.Functional/Hash, but are there any
guarantees that the hash value is unique? How unique are they? Are
there better alternatives?

Thank you,
Andrej


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net