[container] New deque implementation and future plans

older
[static_string] Removing support...

Ion Gaztañaga

20 Sep 2025 20 Sep '25

10:49 a.m.

Hi, I've been working lately on a new deque implementation for Boost.Container. The original uses the classic SGI STL-based data structure, that is quite bulky. The new implementation is already on develop and ready to be merged to master for Boost 1.90 The new implementation will try to be faster than std implementations (I probably need to work more on that with benchmarks), but more importantly, can allow some additional features. Some changes: - sizeof(deque) was 10 words, now is 4 words. Probably the lightest implementation around. - sizeof(deque::iterator) was 4 words, now is is 2 words (similar to libc++ and MSVC).) - Several internal algorithms were reimplemented to take advantage of the segmented nature of deque. - Defaults were slightly changed, 64 bit platforms now use 1024 byte blocks by default instead of the SGI classic 512 byte blocks (still far from libc++'s 4K blocks). - The new implementation eases further deque-like variations and optimizations in the future. Some planned features - Add a "stored_size_type" option so that the sizeof deque can be further decreased (probably down to 3 words) for those in 64 bit platforms that don't need more than 4G elements in deque. - Add a single-ended variation ("seque" might be confusing name because the data structure would not longer be a queue, but a segment-based sequence with random-access). Some people use deque to have a random-access container that avoids vector-like reallocations and guarantees stable pointers and references when inserting on the back. A double-ended container is sometimes sub-optimal for those use cases. - Add options (or maybe different containers or aliases reusing parts of the implementation) that can offer "reserve/capacity"-like features, so that a programmer can pre-allocate all the blocks that's going to use avoiding allocation performance penalties on hot code. I'll update the "Draft Development Plan for Boost.Container" (https://docs.google.com/document/d/1VdMTheFUczC946dP3VKPseOoUq3sVKjU73oMvK8F...) with these proposals. Of course, I'm open to new proposals/features. If any booster is eager to test the new deque implementation or wants to suggest a benchmark/use-case that could be useful for Boost.Container users, please let me know! If anyone detects a bug in the new implementation please notify it and I'll try to solve it ASAP. Best, Ion

Show replies by date

Ceyhun Erturk

20 Sep 20 Sep

1:19 p.m.

Hi, May I kindly ask, what are major behavioral contrasts between seque variation and queue adapter (except user can able to end direction I suppose)? BR On Sat, Sep 20, 2025 at 17:50 Ion Gaztañaga via Boost <boost@lists.boost.org> wrote:

...

Hi,

I've been working lately on a new deque implementation for Boost.Container. The original uses the classic SGI STL-based data structure, that is quite bulky. The new implementation is already on develop and ready to be merged to master for Boost 1.90

The new implementation will try to be faster than std implementations (I probably need to work more on that with benchmarks), but more importantly, can allow some additional features. Some changes:

- sizeof(deque) was 10 words, now is 4 words. Probably the lightest implementation around.

- sizeof(deque::iterator) was 4 words, now is is 2 words (similar to libc++ and MSVC).)

- Several internal algorithms were reimplemented to take advantage of the segmented nature of deque.

- Defaults were slightly changed, 64 bit platforms now use 1024 byte blocks by default instead of the SGI classic 512 byte blocks (still far from libc++'s 4K blocks).

- The new implementation eases further deque-like variations and optimizations in the future.

Some planned features

- Add a "stored_size_type" option so that the sizeof deque can be further decreased (probably down to 3 words) for those in 64 bit platforms that don't need more than 4G elements in deque.

- Add a single-ended variation ("seque" might be confusing name because the data structure would not longer be a queue, but a segment-based sequence with random-access). Some people use deque to have a random-access container that avoids vector-like reallocations and guarantees stable pointers and references when inserting on the back. A double-ended container is sometimes sub-optimal for those use cases.

- Add options (or maybe different containers or aliases reusing parts of the implementation) that can offer "reserve/capacity"-like features, so that a programmer can pre-allocate all the blocks that's going to use avoiding allocation performance penalties on hot code.

I'll update the "Draft Development Plan for Boost.Container" ( https://docs.google.com/document/d/1VdMTheFUczC946dP3VKPseOoUq3sVKjU73oMvK8F...)

with these proposals. Of course, I'm open to new proposals/features.

If any booster is eager to test the new deque implementation or wants to suggest a benchmark/use-case that could be useful for Boost.Container users, please let me know!

If anyone detects a bug in the new implementation please notify it and I'll try to solve it ASAP.

Best,

Ion _______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/XJ2K26SJ...

Ion Gaztañaga

8:29 p.m.

El 20/09/2025 a las 15:19, Ceyhun Erturk via Boost escribió:

...

Hi, May I kindly ask, what are major behavioral contrasts between seque variation and queue adapter (except user can able to end direction I suppose)? BR

"seque" would be the straightforward name for a single-ended version of a "deque", but that name is really awful because it would mean "single-ended queue", and a single ended deque is NOT a queue. In a single-ended deque you can only push_back and pop_back (just like a vector), it would be a LIFO-like data structure, whereas in a true "single-ended queue" or "simple queue" insertion takes place at the rear and removal occurs at the front (FIFO). So a single-ended deque should not be called "seque", maybe something like "segmented_vector" and it would offer reference and pointer stability for back insertions. "vector" is not really adequate because a single-ended deque would not offer reserve/capacity methods, which I think every C++ expects from a "vector"-like data structure. Best, Ion

LoS

21 Sep 21 Sep

7:30 a.m.

Hi, Ion! Could you provide, if possible, the link to your new implementation of deque? Best regards, LoS On Sat, Sep 20, 2025, 10:31 PM Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...

El 20/09/2025 a las 15:19, Ceyhun Erturk via Boost escribió:

...
Hi, May I kindly ask, what are major behavioral contrasts between seque variation and queue adapter (except user can able to end direction I suppose)? BR

"seque" would be the straightforward name for a single-ended version of a "deque", but that name is really awful because it would mean "single-ended queue", and a single ended deque is NOT a queue.

In a single-ended deque you can only push_back and pop_back (just like a vector), it would be a LIFO-like data structure, whereas in a true "single-ended queue" or "simple queue" insertion takes place at the rear and removal occurs at the front (FIFO).

So a single-ended deque should not be called "seque", maybe something like "segmented_vector" and it would offer reference and pointer stability for back insertions. "vector" is not really adequate because a single-ended deque would not offer reserve/capacity methods, which I think every C++ expects from a "vector"-like data structure.

Best,

Ion

_______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/LKN3FYV6...

Ion Gaztañaga

10:38 p.m.

El 21/09/2025 a las 9:30, LoS via Boost escribió:

...

Hi, Ion!

Could you provide, if possible, the link to your new implementation of deque?

Best regards, LoS

It's in the develop branch of the official github site: https://github.com/boostorg/container/tree/develop Waiting for some more test before merging to master before Boost 1.90 release. Best, Ion

LoS

23 Sep 23 Sep

12:37 p.m.

I don't know whether there are some guidelines or constraints that wouldn't make this possible, but I was thinking about a possible change for the iterator template. The current API requires to explicitly indicate the "Pointer" and "IsConst" parameters. Then, for an iterator class "Iter", the iterator and const_iterator types can be specified by setting the second template parameter to true and false, respectively. ```cpp typedef Iter<pointer, false> iterator; typedef Iter<pointer, true> const_iterator; ``` However, it'd be possible to remove "IsConst", and infer whether the iterator is constant or not from the "Pointer" type only. In this way, the implementation should also be a little cleaner. ```cpp typedef Iter<pointer> iterator; typedef Iter<const_pointer> const_iterator; ``` I find the latter way even more consistent with how reverse_iterator and const_reverse_iterator types are specified. Any feedback is much appreciated! Best regards, LoS On Mon, Sep 22, 2025, 12:38 AM Ion Gaztañaga <igaztanaga@gmail.com> wrote:

...

El 21/09/2025 a las 9:30, LoS via Boost escribió:

...
Hi, Ion!

Could you provide, if possible, the link to your new implementation of deque?

Best regards, LoS

It's in the develop branch of the official github site:

https://github.com/boostorg/container/tree/develop

Waiting for some more test before merging to master before Boost 1.90 release.

Best,

Ion

Ivan Matek

30 Sep 30 Sep

10:25 a.m.

On Sat, Sep 20, 2025 at 12:51 PM Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...

I'll update the "Draft Development Plan for Boost.Container" ( https://docs.google.com/document/d/1VdMTheFUczC946dP3VKPseOoUq3sVKjU73oMvK8F...)

with these proposals. Of course, I'm open to new proposals/features.

Amazing document, even if people do not care about Container I suggest they read them for amazing collection of links it contains. Comments: 1. For B-Tree: could it support layout so that SIMD search for integers is possible? I mean at the level we compare multiple items and then check if we found one or we need to follow child pointers... maybe other implementations already do this 2. https://www.boost.org/doc/libs/1_89_0/doc/html/container/main_features.html#... seems potentially AMAZING for "real life"(aka huge) projects where extra include can kill compile times. I know document is about improvements(not to mention modules are gonna fix everything), not advertising but if I understand this correctly it can be huge advantage for some people. So maybe mention it. If anybody has real life experience with this please do share! 3. Discuss if different SSO buffer sizes types are movable or not. For example: will string with SSO max size 42 know to move from SSO of max size 32. Same for deque with diff block sizes... I presume answer is no, but I think it is worthwhile to say. 4. Document how can one run benchmarks on Container. I am not really a boost developer, but often I like to play with benchmarking code, so would be nice to know how hard/easy is to run some benchmarks on my machine. 5. should T in containers be constrained - I presume answer is no, since I presume concepts slow down compilation, but I thought it should be discussed, in case I presumed wrong :) 6. Are there any macro hacks to stamp out something(e.g. variadic templates emulation) whose cleanup would help with compile times. 7. Compile times - Unfortunately std:: implementors have privilege of including internal headers, Boost does not(although some people do tricks like declare std:: stuff or include internal headers and that can lead to breakages when std people reorganize internal headers). This means that generally std:/boost:: version of same thing do not cost same to compile, and often boost compile times are worse. 8. vector32: May be stupid idea with "noise" performance improvement, but I was always annoyed that vector uses 2 64 bit fields for size and capacity when(unless you use small types like char or short or int) my computer would ran out of memory before you could get vector that big. So for example vector32<std::string> can support up to 4GigaStrings(great unit) since it splits capacity and size in 2 32 bit fields(obviously data must be 64bit). This is all theoretical and I can not think of a benchmark where this improves performance that does not include realistic program. I mean I can "hack" a benchmark where with std::vector we spill out of L1 cache( if we have ton of std::vector) and we are in L1 cache with std::vector32, but seems very artificial. We would need RealLife™ application, typedef in std::headers std::vector to vector32 and see if anything noticeable happens with performance. I understand this is a ticking timebomb for certain usecases, but I guess people can make decision if they want the risk or not depending on their usecase. 9. resize_and_overwrite- ugly name, beautiful performance allegedly... I think it may be useful to add it so people can do std::format_to_n to SSO buffer of string without paying a copy(not to mention that if people use std::format std::string in it's api so you need to give them nice escape ;) ) 10. I asked people before on mailing list for std::optional<T&> API for Boost.Unordered, not a lot of fans although I was not the only proponent. C++ might get it <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3091r4.html#other-names-for-get>, but not before C++29... and it might get an uglier name than get. I still think it is great api for any associative container. 11. starts_with and ends_with for string. Issue here is that std:: has 3 overloads, one takes std::string_view, do you use boost or std::string_view? I would suggest std::one, but that makes the code C++17. Beside that implementation is trivial beside checking what happens for nullptr input. 12. Regarding support for pool in containers: might bet worthwhile to investigate if using indexes into a pool(as long as pool is contiguous) instead of pointers to save on child pointer sizes. I am aware it involves extra work to get final pointer. 13. Fuzzing would be nice, esp for rewrites, since we can presume users reported most of the bugs in older code, although obviously ideal is to fuzz everything. 14. SecureString - tbh I am worried about maintenance of this guy, it could be just my paranoia since IDK how elaborate future attacks(or compiler optimizations) might realistically be i.e. is it possible that secure erasure gets broken and it needs a hotfix asap. 15. GermanString - I would suggest picking better name, unless it really sticks in the community. https://www.reddit.com/r/programming/comments/1e5gzq2/comment/ldmj266/ Sorry for the manifesto, I do not expect you to go over each point and spend 20 min on each, this is just a quick dump of my first reactions when reading the document.

Andrey Semashev

1:12 p.m.

On 30 Sep 2025 13:25, Ivan Matek via Boost wrote:

...

8. vector32: May be stupid idea with "noise" performance improvement, but I was always annoyed that vector uses 2 64 bit fields for size and capacity when(unless you use small types like char or short or int) my computer would ran out of memory before you could get vector that big. So for example vector32<std::string> can support up to 4GigaStrings(great unit) since it splits capacity and size in 2 32 bit fields(obviously data must be 64bit).

This is already possible: template< typename T, typename Allocator = void, typename... Options > using vector32 = boost::container::vector< T, Allocator, boost::container::vector_options_t< boost::container::stored_size< std::uint32_t >, Options... >

...

;

You can also similarly define small_vector32, (small_)flat_set32, etc.

...

14. SecureString - tbh I am worried about maintenance of this guy, it could be just my paranoia since IDK how elaborate future attacks(or compiler optimizations) might realistically be i.e. is it possible that secure erasure gets broken and it needs a hotfix asap.

My understanding is that the idea is to not leak sensitive strings in memory after the string is destroyed. I think, it does have value (and I have needed it a few times). It can be implemented fairly reliably e.g. using a signal fence or a dummy asm block. And it can be unit-tested by using a custom memory allocator. I think, it would probably be preferable not in the form of a special string type but rather an allocator, so that the protection can be applied to a variety of containers (e.g. std::vector). Of course, string has SSO, and there are also containers with similar optimization (e.g. small_vector). There needs to be some collaboration between containers implementing SSO-like optimizations and the allocator. Maybe the allocator should provide an optional function that would be detected by the container and invoked on clear()/erase()/destructor.

...

15. GermanString - I would suggest picking better name, unless it really sticks in the community. https://www.reddit.com/r/programming/comments/1e5gzq2/comment/ldmj266/

Yes, I found this name surprising too.

Seth

2:05 p.m.

...

...
15. GermanString - I would suggest picking better name, unless it really sticks in the community. https://www.reddit.com/r/programming/comments/1e5gzq2/comment/ldmj266/

Yes, I found this name surprising too.

It reminds me a lot of Pascal strings. https://stackoverflow.com/questions/25068903/what-are-pascal-strings

Ion Gaztañaga

2 Oct 2 Oct

9:17 a.m.

El 30/09/2025 a las 16:05, Seth via Boost escribió:

...

...
...
15. GermanString - I would suggest picking better name, unless it really sticks in the community. https://www.reddit.com/r/programming/comments/1e5gzq2/comment/ldmj266/

Yes, I found this name surprising too.

It reminds me a lot of Pascal strings. https://stackoverflow.com/questions/25068903/what-are-pascal-strings

Thanks, very interesting link. The same can be applied to dynamic arrays or vectors, useful when an empty array/vector is common (say, an optional attribute). Empty vectors are reduced to "sizeof(void*)" and length (and capacity for vectors) are allocated inside the dynamic buffer. Best, Ion

Ion Gaztañaga

10 Oct 10 Oct

10:51 a.m.

El 30/09/2025 a las 12:25, Ivan Matek via Boost escribió:

...

On Sat, Sep 20, 2025 at 12:51 PM Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...
I'll update the "Draft Development Plan for Boost.Container" ( https://docs.google.com/document/d/1VdMTheFUczC946dP3VKPseOoUq3sVKjU73oMvK8F...)

with these proposals. Of course, I'm open to new proposals/features.

Amazing document, even if people do not care about Container I suggest they read them for amazing collection of links it contains. Comments:

Thanks Ivan, I missed your reply.

...

1. For B-Tree: could it support layout so that SIMD search for integers is possible? I mean at the level we compare multiple items and then check if we found one or we need to follow child pointers... maybe other implementations already do this

I'm not an expert at B-trees at all, but in general generic implementation (like abseil) don't support SIMD.

...

3. Discuss if different SSO buffer sizes types are movable or not. For example: will string with SSO max size 42 know to move from SSO of max size 32. Same for deque with diff block sizes... I presume answer is no, but I think it is worthwhile to say.

In principle, they are different types, and the existing practice with small_vector does not support those operations. However, all small_vectors derive from small_vector_base which is movable and has to support the operation with different SSO optimizations. So maybe it's a doable and desired feature...

...

4. Document how can one run benchmarks on Container. I am not really a boost developer, but often I like to play with benchmarking code, so would be nice to know how hard/easy is to run some benchmarks on my machine.

The benchmarking part should be improved and simplified, yes. Having a new Boost.Bench library would certainly help instead of inventing our own machinery...

...

5. should T in containers be constrained - I presume answer is no, since I presume concepts slow down compilation, but I thought it should be discussed, in case I presumed wrong :)

Not sure if this feature offers enough advantages, I haven't received any feature request asking for this.

...

6. Are there any macro hacks to stamp out something(e.g. variadic templates emulation) whose cleanup would help with compile times.

If variadic templates are available, the all emulation code is not compiled, the preprocessor just skips all the machinery. I don't think compile times would be improved, probably the maintainance cost would be lower.

...

7. Compile times - Unfortunately std:: implementors have privilege of including internal headers, Boost does not(although some people do tricks like declare std:: stuff or include internal headers and that can lead to breakages when std people reorganize internal headers). This means that generally std:/boost:: version of same thing do not cost same to compile, and often boost compile times are worse.

It could be, but at least for some std implementation I think Boost container compile times are quite fast, and preprocessor produced code is in many times smaller. But of course, this varies a lot with the standard library type and version...

...

8. vector32: May be stupid idea with "noise" performance improvement, but I was always annoyed that vector uses 2 64 bit fields for size and capacity when(unless you use small types like char or short or int) my

Already possible with Boost.Container for vector-like containers (vector, static_vector, small_vector, devector...)

...

9. resize_and_overwrite- ugly name, beautiful performance allegedly... I think it may be useful to add it so people can do std::format_to_n to SSO buffer of string without paying a copy(not to mention that if people use std::format std::string in it's api so you need to give them nice escape ;) )

Ok, I missed this feature from C++23.

...

10. I asked people before on mailing list for std::optional<T&> API for Boost.Unordered, not a lot of fans although I was not the only proponent. C++ might get it <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3091r4.html#other-names-for-get>, but not before C++29... and it might get an uglier name than get. I still think it is great api for any associative container.

Not a fan. For two reasons: - I don't like the idea of adding dependencies to containers. The beauty of the STL design was the iterator abstration that does not like a container with an external type. It's not a good design decision IMHO. - Some users would like to use boost::optional or other optional instead of the standard one.

...

11. starts_with and ends_with for string. Issue here is that std:: has 3 overloads, one takes std::string_view, do you use boost or std::string_view? I would suggest std::one, but that makes the code C++17. Beside that implementation is trivial beside checking what happens for nullptr input.

I missed those C++20 features, thanks. Maybe we can support a view-like template argument and support both.

...

12. Regarding support for pool in containers: might bet worthwhile to investigate if using indexes into a pool(as long as pool is contiguous) instead of pointers to save on child pointer sizes. I am aware it involves extra work to get final pointer.

Seems quite specific in most cases pools are not contiguous...

...

13. Fuzzing would be nice, esp for rewrites, since we can presume users reported most of the bugs in older code, although obviously ideal is to fuzz everything.

Right, we would need to lay out an strategy to select what code parts should we fuzz first to obtain significant results with the minimum effort.

...

14. SecureString - tbh I am worried about maintenance of this guy, it could be just my paranoia since IDK how elaborate future attacks(or compiler optimizations) might realistically be i.e. is it possible that secure erasure gets broken and it needs a hotfix asap.

Right, I agree.

...

15. GermanString - I would suggest picking better name, unless it really sticks in the community. https://www.reddit.com/r/programming/comments/1e5gzq2/comment/ldmj266/

Thanks for the link, definitely the "german" name does not transmit the idea behind.

...

Sorry for the manifesto, I do not expect you to go over each point and spend 20 min on each, this is just a quick dump of my first reactions when reading the document.

Many thanks for your review, I appreciate you time and effort, I think your comments will be very useful to improve the roadmap of the library. Best, Ion

Age (days ago)

110

Last active (days ago)

List overview

10 comments

6 participants

participants (6)

Andrey Semashev
Ceyhun Erturk
Ion Gaztañaga
Ivan Matek
LoS
Seth

[container] New deque implementation and future plans

tags

participants (6)