Boost :

From: Thomas Maeder (maeder_at_[hidden])
Date: 2001-11-25 09:06:39

While looking for a performance bottleneck in a program, I found out
some interesting things about the allocation behavior of regex_match.

The regular expression I looked at is suggested by [1]:

  boost::regex const URIexp("(?:([^:/?#]+):)?"
  (scheme, host:port, path, query,

By providing my own allocator to the query_matches object used, I
noticed the following allocations:
- a few allocations of relatively small chunks (<100 chars)
- hundreds of allocations of chunks of 164 chars each
- at most 4 or 5 chunks of 164 chars are in use at the same time
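
A minimal sketch of how such measurements could be collected. It is not the allocator used for the numbers above: the names AllocStats, alloc_stats and CountingAllocator are hypothetical, and it assumes the C++11 minimal allocator interface rather than the one required by any particular Boost.Regex release.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <new>

// Hypothetical instrumentation; not part of Boost.Regex.
struct AllocStats {
    std::map<std::size_t, std::size_t> count_by_bytes; // request size -> number of requests
    std::size_t live = 0;       // chunks currently allocated
    std::size_t peak_live = 0;  // largest value of `live` seen so far
};

// One shared statistics object for all CountingAllocator instances.
inline AllocStats& alloc_stats() {
    static AllocStats stats;
    return stats;
}

template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        AllocStats& s = alloc_stats();
        ++s.count_by_bytes[n * sizeof(T)];           // histogram of request sizes
        if (++s.live > s.peak_live) s.peak_live = s.live;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) noexcept {
        assert(alloc_stats().live > 0);
        --alloc_stats().live;
        ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }
```

After a run, count_by_bytes shows how often each chunk size was requested, and peak_live shows how many chunks were in use at the same time.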

The number 164 seems to be related to the size and complexity of URIexp.

I wrote myself a small, simple allocator to be used by query_matches; I
don't think that it is very portable currently. Instead of freeing
chunks passed to deallocate(), it adds them to a singly linked list and
reuses them for subsequent allocations of chunks of the same size. It
has significantly improved the performance of the entire program.
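
The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the allocator described above: FreeListPool and PoolAllocator are hypothetical names, the C++11 minimal allocator interface is assumed, and the sketch is neither thread-safe nor suitable for over-aligned types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <new>

// Hypothetical pool: one singly linked free list per chunk size.
class FreeListPool {
    struct Node { Node* next; };
    std::map<std::size_t, Node*> heads_;  // chunk size in bytes -> head of free list

    // A freed chunk must be able to hold the link to the next free chunk.
    static std::size_t chunk_size(std::size_t bytes) {
        return bytes < sizeof(Node) ? sizeof(Node) : bytes;
    }

public:
    FreeListPool() = default;
    FreeListPool(const FreeListPool&) = delete;
    FreeListPool& operator=(const FreeListPool&) = delete;

    // Cached chunks are returned to the system only when the pool dies.
    ~FreeListPool() {
        for (auto& entry : heads_) {
            while (Node* n = entry.second) {
                entry.second = n->next;
                ::operator delete(n);
            }
        }
    }

    void* allocate(std::size_t bytes) {
        Node*& head = heads_[chunk_size(bytes)];  // creates an empty list if needed
        if (head != nullptr) {                    // reuse a cached chunk of this size
            Node* n = head;
            head = n->next;
            return n;
        }
        return ::operator new(chunk_size(bytes));
    }

    void deallocate(void* p, std::size_t bytes) noexcept {
        assert(p != nullptr);
        auto it = heads_.find(chunk_size(bytes));
        if (it == heads_.end()) { ::operator delete(p); return; }  // not from this pool
        it->second = ::new (p) Node{it->second};  // push the chunk onto its free list
    }
};

// Hypothetical adapter exposing the pool through the allocator interface.
template <class T>
struct PoolAllocator {
    using value_type = T;
    FreeListPool* pool;

    explicit PoolAllocator(FreeListPool& p) noexcept : pool(&p) {}
    template <class U>
    PoolAllocator(const PoolAllocator<U>& other) noexcept : pool(other.pool) {}

    T* allocate(std::size_t n) { return static_cast<T*>(pool->allocate(n * sizeof(T))); }
    void deallocate(T* p, std::size_t n) noexcept { pool->deallocate(p, n * sizeof(T)); }
};

template <class T, class U>
bool operator==(const PoolAllocator<T>& a, const PoolAllocator<U>& b) noexcept { return a.pool == b.pool; }
template <class T, class U>
bool operator!=(const PoolAllocator<T>& a, const PoolAllocator<U>& b) noexcept { return !(a == b); }
```

With an allocation pattern like the one observed (hundreds of requests for 164 chars, only 4 or 5 live at once), such a pool would ask the system for a handful of chunks and serve every later request from the list. The trade-off is that cached memory is held until the pool is destroyed.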

- is my analysis accurate only for my case, or is what I observed the
general behavior of query_match?
- am I reinventing anything?
- is there interest in boostifying this allocator?


[1] Uniform Resource Identifiers (URI): Generic Syntax
