A string_view with null-terminator guarantee?


Andrzej Krzemienski

24 Sep 2025

7:20 p.m.

Hi Everyone,

Would Boost libraries benefit from having a string_view-like type with the guarantee that the underlying character sequence is null-terminated?

This could be yet another Boost library, or part of Boost.Core. Maybe Boost already has such a type somewhere?

The need for such a type surfaces time and again: a library needs to accept both const char * and std::string cheaply, only to pass the string down to low-level C libraries later. We saw one such type recently during the review of Boost.Sqlite. One is again proposed for standardization: wg21.link/p3655.

Regards,
&rzej;


Emil Dotchevski

24 Sep

7:24 p.m.

Can a zero-terminated string view return the size?

On Wed, Sep 24, 2025 at 3:22 PM Andrzej Krzemienski via Boost <boost@lists.boost.org> wrote:

...

Hi Everyone, Would Boost libraries benefit from having a string_view-like type with the guarantee that the underlying character sequence is null-terminated?

This could be as yet another Boost library, or as part of Boost.Core. Maybe Boost already has such a type somewhere?

The need for such a type surfaces time and again. When a library needs to be passed both const char * and std::string cheaply, only to pass it down to low-level C-libraries later. We have seen one recently during the review of Boost.Sqlite. One is again proposed for standardization: wg21.link/p3655.

Regards, &rzej; _______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/E3D7ZHSI...

Andrzej Krzemienski

7:40 p.m.

On Wed, 24 Sep 2025 at 21:26, Emil Dotchevski via Boost <boost@lists.boost.org> wrote:

...

Can a zero-terminated string view return the size?

The "zero-terminated string view", much like std::string_view, holds a pointer and a size, so it can return the size in constant time. (It also has to maintain an invariant that the null-terminator mark is consistent with the separately stored size.) Regards, &rzej;
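A minimal sketch of the pointer-plus-size layout described above, with the invariant written as an assertion. The class name `zstring_view` and its members are hypothetical here; this is not the interface of P3655 or of any existing Boost type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical illustration only: a view that stores both a pointer and a size.
class zstring_view {
    const char* data_;
    std::size_t size_;

    // Invariant: the character one past the viewed range is the null terminator,
    // so the stored size and the terminator position agree.
    bool invariant() const { return data_ != nullptr && data_[size_] == '\0'; }

public:
    // From a C string (precondition: s is non-null and null-terminated).
    // Costs one strlen call at construction.
    zstring_view(const char* s) : data_(s), size_(std::strlen(s)) {
        assert(invariant());
    }

    // From std::string: the size is already known and c_str() is null-terminated.
    zstring_view(const std::string& s) : data_(s.c_str()), size_(s.size()) {
        assert(invariant());
    }

    std::size_t size() const { return size_; }   // constant time
    const char* c_str() const { return data_; }  // can be handed to a C API
};
```

Because the size is stored, `size()` never walks the string; the price is the `strlen` call when constructing from a bare `const char*`.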


Klemens Morgenstern

25 Sep

1:15 a.m.

On Thu, Sep 25, 2025 at 3:42 AM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...

On Wed, 24 Sep 2025 at 21:26, Emil Dotchevski via Boost <boost@lists.boost.org> wrote:

...
Can a zero-terminated string view return the size?

The "zero-terminated string view", much like std::string_view, holds a pointer and a size, so it can return the size in constant time.

(It also has to maintain an invariant that the null-terminator mark is consistent with the separately stored size.)

Just a minor comment regarding my use case: when passing the string on to a C API, calculating the size is unnecessary and would introduce overhead.

For example:

void foo(cstring_view value) { c_foo(value.c_str()); }

foo("very long string");

would calculate the strlen for no reason.

That doesn't mean cstring_view shouldn't calculate the size; I am just saying that my cstring_ref, written specifically for this one use case, doesn't keep a size on purpose.
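A minimal sketch of the size-less design mentioned above, assuming a wrapper that stores only the pointer. The class body and the stand-in `c_foo` are hypothetical; the actual `cstring_ref` in Boost.Sqlite may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a C API that accepts a null-terminated string.
// It only looks at the first character, so nothing here needs the length.
inline int c_foo(const char* s) { return s[0]; }

// Hypothetical size-less wrapper: construction never calls strlen.
class cstring_ref {
    const char* ptr_;
public:
    cstring_ref(const char* s) : ptr_(s) { assert(s != nullptr); }
    cstring_ref(const std::string& s) : ptr_(s.c_str()) {}
    const char* c_str() const { return ptr_; }
};

// The length of the argument is never computed on this path.
inline int foo(cstring_ref value) { return c_foo(value.c_str()); }
```

The wrapper is the size of a single pointer, at the cost of having no O(1) `size()`.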

Andrzej Krzemienski

4:07 a.m.

On Thu, 25 Sep 2025 at 03:17, Klemens Morgenstern via Boost <boost@lists.boost.org> wrote:

...

On Thu, Sep 25, 2025 at 3:42 AM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...
On Wed, 24 Sep 2025 at 21:26, Emil Dotchevski via Boost <boost@lists.boost.org> wrote:

...
Can a zero-terminated string view return the size?

The "zero-terminated string view", much like std::string_view, holds a pointer and a size, so it can return the size in constant time.

(It also has to maintain an invariant that the null-terminator mark is consistent with the separately stored size.)

Just a minor comment regarding my use-case: When passing the string on to a C-API, calculating the size is unnecessary & would introduce overhead.

For example:

void foo(cstring_view value) { c_foo(value.c_str()); }

foo("very long string");

Would calculate the strlen for no reason.

That doesn't mean the cstring_view shouldn't calculate the size, I am just saying that my cstring_ref written specifically for this one use-case doesn't keep a size on purpose.

This is a very important observation. One "optimal" implementation of a null-terminated string_view may not be achievable. People will have various incompatible expectations, such as minimal compile times and minimal dependencies versus full string compatibility, including .at(), which drags in std::out_of_range, which drags in std::string, which drags in memory allocation. Another question is whether you get std::format support by default.

Regards,
&rzej;


Mohammad Nejati

6:33 a.m.

On Thu, Sep 25, 2025 at 7:39 AM Andrzej Krzemienski via Boost <boost@lists.boost.org> wrote:

...

...
For example:

void foo(cstring_view value) { c_foo(value.c_str()); }

foo("very long string");

Would calculate the strlen for no reason.

That doesn't mean the cstring_view shouldn't calculate the size, I am just saying that my cstring_ref written specifically for this one use-case doesn't keep a size on purpose.

This is a very important observation. One "optimal" implementation of a null-terminated string_view may not be achievable.

Wouldn't adding an array constructor resolve the problem?

template <size_t N> cstring_view(const char (&str)[N]);

Keeping size as a member seems redundant, since we only use this type in an interface for two purposes: first, to make it explicit that the arguments require a null-terminated string, which also improves documentation, and second, to check the invariant that the passed string is indeed null-terminated. In practice, we just pass instances of this type as const char* to a C-API style interface that does not take a length. So what would be the use case for an O(1) size() access?

Also, most use cases for null-terminated strings involve small strings; otherwise, the C API would likely have taken a length parameter as well.
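A minimal sketch of the array constructor suggested above, under the assumption that the argument is a plain string literal whose only null character is the final one. The class and member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration: the size comes from the array extent, not from strlen.
class cstring_view {
    const char* data_;
    std::size_t size_;
public:
    // N counts the terminator, so the length is N - 1.
    // Assumption: str is a literal with no embedded nulls and no unused tail.
    template <std::size_t N>
    constexpr cstring_view(const char (&str)[N]) : data_(str), size_(N - 1) {}

    constexpr std::size_t size() const { return size_; }
    constexpr const char* c_str() const { return data_; }
};

// The length is available at compile time; no runtime scan is involved.
static_assert(cstring_view("very long string").size() == 16,
              "length deduced from the array extent");
```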

Klemens Morgenstern

8:32 a.m.

On Thu, Sep 25, 2025 at 2:34 PM Mohammad Nejati via Boost < boost@lists.boost.org> wrote:

...

On Thu, Sep 25, 2025 at 7:39 AM Andrzej Krzemienski via Boost <boost@lists.boost.org> wrote:

...
...
For example:

void foo(cstring_view value) { c_foo(value.c_str()); }

foo("very long string");

Would calculate the strlen for no reason.

That doesn't mean the cstring_view shouldn't calculate the size, I am just saying that my cstring_ref written specifically for this one use-case doesn't keep a size on purpose.

This is a very important observation. One "optimal" implementation of a null-terminated string_view may not be achievable.

Wouldn't adding an array constructor resolve the problem? template <size_t N> cstring_view(const char (&str)[N]);

It would solve a lot, but when we're interacting with a C API, getting a null-terminated string back that we then pass on is not uncommon either. So not every null-terminated string in this context would have its size known at compile time.

...

Keeping size as a member seems redundant, since we only use this type in an interface for two purposes: first, to make it explicit that the arguments require a null-terminated string, which also improves documentation, and second, to check the invariant that the passed string is indeed null-terminated. In practice, we just pass instances of this type as const char* to a C-API style interface that does not take a length. So what would be the use case for an O(1) size() access? Also, most use cases for null-terminated strings involve small strings, otherwise, the C-API would likely have taken a length parameter as well.

Sure, but that argument cuts both ways. That is, it's cheap enough for the constructor, but it's also cheap enough for the `size` function, depending on the application.

I think there are two ways to think about this:

1. a type-safe alternative to const char * (no size_t)
2. a std::string_view that guarantees a null at the end (with size_t)

Dominique Devienne

8:58 a.m.

On Thu, Sep 25, 2025 at 10:34 AM Klemens Morgenstern via Boost <boost@lists.boost.org> wrote:

...

1. a type-safe alternative to const char * (no size_t)

What's not type-safe about `const char*`?

...

2. a std::string_view that guarantees a null at the end (with size_t)

That one is useful for a Boost.SQLite wrapper. But not with a no-embedded-null requirement, as SQLite is just fine storing those.

Yes, its string-based built-in SQL functions would stop at the first NUL (embedded or "final"), but that's a different issue.

And stored text values in SQLite are supposed to be in UTF-8 (or UTF-16), but that's not enforced!

The difference between blob and text is very small: both can store arbitrary content. It's just that text can have custom collations and might be displayed differently by programs, including the sqlite3 CLI, but fundamentally both always know their size and can store arbitrary content.

Some places in the SQLite API do require a null-terminated string, with no option to explicitly give a size, but in most places where it matters, the size is explicit.

--DD

Peter Dimov

9:11 a.m.

Dominique Devienne wrote:

...

On Thu, Sep 25, 2025 at 10:34 AM Klemens Morgenstern via Boost <boost@lists.boost.org> wrote:

...
1. a type-safe alternative to const char * (no size_t)

What's not type-safe about `const char*`?

...
2. a std::string_view that guarantees a null at the end (with size_t)

That one is useful for a Boost.SQLite wrapper.

But not with a no-embedded-null requirement, as SQLite is just fine storing those.

After giving this some thought, I think that both zstring_view classes (one that enforces no embedded nulls, and one that doesn't) are useful for different things and probably should be provided. (I'm using zstring_view for the name because the standard proposal https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3655r2.html does so.)

Dominique Devienne

9:18 a.m.

On Thu, Sep 25, 2025 at 11:13 AM Peter Dimov via Boost <boost@lists.boost.org> wrote:

...

Dominique Devienne wrote:

...
On Thu, Sep 25, 2025 at 10:34 AM Klemens Morgenstern via Boost <boost@lists.boost.org> wrote:

...
2. a std::string_view that guarantees a null at the end (with size_t)

That one is useful for a Boost.SQLite wrapper.

After giving this some thought, I think that both zstring_view classes (one that enforces no embedded nulls, and one that doesn't) are useful for different things and probably should be provided.

Makes sense. Note that specifically in the Boost.SQLite case, size_t where it's 64-bit is wasteful, as SQLite blob/text is limited to 1GB by default, and can't exceed 2GB when raising the limit at build time, AFAIK.

Dominique Devienne

9:34 a.m.

On Thu, Sep 25, 2025 at 11:18 AM Dominique Devienne <ddevienne@gmail.com> wrote:

...

Makes sense. Note that specifically in the Boost.SQLite case, size_t where it's 64-bit is wasteful, as SQLite blob/text is limited to 1GB by default, and can't exceed 2GB when raising the limit at build time

BTW, PostgreSQL's bytea is similar, limited to 1GB. Anything larger must be manually sharded, which we do (on both SQLite and PostgreSQL). In practice, shard sizes are nowhere near 1GB in size. All that to say that size_t is overkill in these contexts. But it probably doesn't matter, size_t it will be I guess, if we want a type that's not a template, and usable in any context.

Peter Dimov

9:38 a.m.

Dominique Devienne wrote:

...

On Thu, Sep 25, 2025 at 11:18 AM Dominique Devienne <ddevienne@gmail.com> wrote:

...
Makes sense. Note that specifically in the Boost.SQLite case, size_t where it's 64-bit is wasteful, as SQLite blob/text is limited to 1GB by default, and can't exceed 2GB when raising the limit at build time

BTW, PostgreSQL's bytea is similar, limited to 1GB. Anything larger must be manually sharded, which we do (on both SQLite and PostgreSQL). In practice, shard sizes are nowhere near 1GB in size. All that to say that size_t is overkill in these contexts. But it probably doesn't matter, size_t it will be I guess, if we want a type that's not a template, and usable in any context.

It doesn't really matter because the data() member is 64 bit anyway, so the entire type will be 128 bit regardless of whether the size() is uint32_t or size_t.
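The padding argument above can be checked with a short sketch. The two struct names are hypothetical, and the equality is only asserted for platforms with 8-byte pointers, where typical ABIs align the struct to 8 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layouts: the same pointer member, with two different size types.
struct view_with_size_t { const char* data; std::size_t   size; };
struct view_with_u32    { const char* data; std::uint32_t size; };

// With 8-byte pointers, the 4 bytes saved by uint32_t are given back as
// tail padding, so both structs occupy the same space on typical ABIs.
static_assert(sizeof(void*) != 8 ||
              sizeof(view_with_u32) == sizeof(view_with_size_t),
              "narrower size member does not shrink the view");
```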

Andrzej Krzemienski

7:43 p.m.

On Thu, 25 Sep 2025 at 10:59, Dominique Devienne via Boost <boost@lists.boost.org> wrote:

...

On Thu, Sep 25, 2025 at 10:34 AM Klemens Morgenstern via Boost <boost@lists.boost.org> wrote:

...
1. a type-safe alternative to const char * (no size_t)

What's not type-safe about `const char*`?

...
2. a std::string_view that guarantees a null at the end (with size_t)

That one is useful for a Boost.SQLite wrapper.

But not with a no-embedded-null requirement, as SQLite is just fine storing those.

Yes, its string-based built-in SQL functions would stop at the first NUL (embedded or "final"), but that's a different issue.

So, if table data can contain null characters in the middle, how can I retrieve full strings in Boost.SQLite? Do I need to use a blob? Regards, &rzej;

...

And stored text values in SQLite are supposed to be in UTF-8 (or UTF-16), but that's not enforced!

The difference between blob and text is very small: both can store arbitrary content. It's just that text can have custom collations and might be displayed differently by programs, including the sqlite3 CLI, but fundamentally both always know their size and can store arbitrary content.

Some places in the SQLite API do require a null-terminated string, with no option to explicitly give a size, but in most places where it matters, the size is explicit. --DD

Peter Dimov

9:15 a.m.

Klemens Morgenstern wrote:

...

Sure, but that argument cuts both ways. That is, it's cheap enough for the constructor, but it's also cheap enough for the `size` function depending on the application.

You need strlen for more than size(). It's also needed in end(), operator[], at(), remove_prefix, substr (both overloads), the find functions taking a start offset, back(), actually everything taking a start offset, ends_with, and probably others as well.

Richard Hodges

26 Sep

7:30 a.m.

On Thu, 25 Sept 2025 at 17:15, Peter Dimov via Boost <boost@lists.boost.org> wrote:

...

You need strlen for more than size(). It's also needed in end(), operator[], at(), remove_prefix, substr (both overloads), the find functions taking a start offset, back(), actually everything taking a start offset, ends_with, and probably others as well.

If the purpose of the class is to provide the functionality of a C string, then the only required methods are those available on a C string:

- data
- array indexing
- length

There is no need for any string manipulation. In C++, the correct class for string manipulation operations is std::string.

There is a reasonable argument that you may want begin() and end() iterators, maybe. But it's reasonable to write end() in terms of strlen(), because it's unlikely that a sane program will need to call end() more than once, and also unlikely that it will need to call both end() and size(), when dealing with a C string.

Computing the length of a string with strlen() is very fast for anything other than an extremely long string.

I don't think this thing needs to be over-engineered. I am of the view that providing a full std::string interface is an utter waste of time and effort.

R


Andrzej Krzemienski

8:17 a.m.

On Fri, 26 Sep 2025 at 09:32, Richard Hodges via Boost <boost@lists.boost.org> wrote:

...

On Thu, 25 Sept 2025 at 17:15, Peter Dimov via Boost < boost@lists.boost.org> wrote:

...
You need strlen for more than size(). It's also needed in end(), operator[], at(), remove_prefix, substr (both overloads), the find functions taking a start offset, back(), actually everything taking a start offset, ends_with, and probably others as well.

If the purpose of the class is to provide the functionality of a C string, then the only required methods are those available on a C string:

- data
- array indexing
- length

There is no need for any string manipulation. In C++, the correct class for string manipulation operations is std::string.

There is a reasonable argument that you may want begin() and end() iterators, maybe. But it's reasonable to write end() in terms of strlen(), because it's unlikely that a sane program will need to call end() more than once, and also unlikely that it will need to call both end() and size(), when dealing with a c string.

Computing the length of a string with strlen() is very high performance for anything other than an extremely long string.

I don't think this thing needs to be over-engineered. I am of the view that providing a full std::string interface is an utter waste of time and effort

Agreed about the full interface (I don't think I even know the full interface of std::string). Considering the subset of SQLite use cases where a user has a `const char*` or a std::string, passes it to the library, and the library ultimately uses it to call a C library and pass a `const char *`, I have the following expectations:

E1: Distinguish (in the type) the situation where `const char*` is used to represent a null-terminated character sequence from the situation where `const char*` is used as a pointer to an object which happens to be of type `char`:

```
const char* delimited = "delimited_data.dat";
constexpr char regular_delimiter = ';';
constexpr char special_delimiter = 0;
auto delimiter = use_regular ? &regular_delimiter : use_special ? &special_delimiter : nullptr;
// ...
fopen(delimiter, "r"); // I want compiler error here
```

E2: Shorter function call syntax: when I have a std::string, I would not like to spell `.c_str()` unnecessarily.

E3: I need operator== (and cousins) to have the intuitive meaning of comparing character sequences rather than addresses.

I also note, although it is not crucial for me privately, that having the length of the string stored directly allows the class invariant to be executable, checkable and enforceable at runtime. This means that the program owner can compile it in "assertions enabled" mode, have the string's invariant checked at runtime, and potentially detect a bug.

I also note that Boost.Sqlite has other use cases, where a text column from a table, potentially containing null characters in the middle, is returned to the user. This is news to me, so I am not even sure yet what I would expect there.

Another observation is that the key selling point of std::string_view was that any substring can be computed and returned without any memory management issues. This property is fundamentally incompatible with the null-terminator guarantee.

One last observation, although it is probably just academic, is that range iteration could be provided without an explicit size, by using a "null-terminator-detecting sentinel". But this would no longer be a contiguous range.

Regards,
&rzej;
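One possible reading of expectations E1 to E3, sketched as a parameter type. The name `zstring_arg` and the choice to make the `const char*` constructor explicit are assumptions for illustration, not something the message prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical parameter type for functions that forward to a C API.
class zstring_arg {
    const char* ptr_;
public:
    // E2: a std::string converts implicitly, so callers need not spell .c_str().
    zstring_arg(const std::string& s) : ptr_(s.c_str()) {}

    // String literals (char arrays) also convert implicitly.
    template <std::size_t N>
    zstring_arg(const char (&lit)[N]) : ptr_(lit) {}

    // E1 (assumed design): a bare const char* needs an explicit conversion,
    // so a pointer to a single char cannot slip in silently.
    explicit zstring_arg(const char* p) : ptr_(p) {}

    const char* c_str() const { return ptr_; }

    // E3: comparison looks at the characters, not at the addresses.
    friend bool operator==(zstring_arg a, zstring_arg b) {
        return std::strcmp(a.ptr_, b.ptr_) == 0;
    }
    friend bool operator!=(zstring_arg a, zstring_arg b) { return !(a == b); }
};

static_assert(std::is_convertible<std::string, zstring_arg>::value, "E2");
static_assert(!std::is_convertible<const char*, zstring_arg>::value, "E1");
static_assert(std::is_constructible<zstring_arg, const char*>::value,
              "explicit opt-in remains possible");
```

With such a parameter type, passing the `delimiter` pointer from the E1 snippet would be rejected at compile time unless the caller wraps it explicitly.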

Peter Dimov

10:40 a.m.

Andrzej Krzemienski wrote:

...

I also note, although it is not crucial for me privately, that having the length of the string stored directly allows the class invariant to be executable, checkable and enforceable at runtime. This means that the program owner can compile it in "assertions enabled" mode, have the string's invariant checked at runtime, and potentially detect a bug.

It should be crucial for everyone. In 2025, all our preconditions need to be "hardened preconditions", meaning that (a) they should be checkable and (b) they should be enabled in release builds as well unless there's unacceptable performance cost.

There's work being done on that front:

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3566r2.pdf
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3711r1.pdf

The basic idea here is that if you do

char x[ 128 ];
// fill (part of) x with characters
std::string_view sv( x );

the constructor can actually see the size of the array (even though it currently doesn't) and therefore the strlen call inside is bounded and cannot lead to arbitrary memory reads, as is the case with the normal `char const*` constructor if it's passed something not null-terminated.
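A minimal sketch of the bounded scan described above, assuming a hypothetical `zstring_view` with an array constructor. It uses `std::memchr` limited to the array extent and aborts in all build modes when no terminator is found; it is not how `std::string_view` behaves today.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical illustration of a hardened precondition on construction.
class zstring_view {
    const char* data_;
    std::size_t size_;
public:
    template <std::size_t N>
    zstring_view(const char (&buf)[N]) : data_(buf), size_(0) {
        // The search is bounded by N, so it never reads outside the array.
        const void* nul = std::memchr(buf, '\0', N);
        // Checked in release builds too: no terminator means a broken precondition.
        if (nul == nullptr) std::abort();
        size_ = static_cast<std::size_t>(static_cast<const char*>(nul) - buf);
    }

    std::size_t size() const { return size_; }
    const char* c_str() const { return data_; }
};
```

For a partially filled buffer such as `char x[128]`, the resulting size is the position of the first null character, not 127.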

...

One last observation, although it is probably just academic, is that the range iteration could be provided without an explicit size: by using a "null-terminator-detecting sentinel". But this would no longer be a contiguous range.

It actually can be. That's the textbook example of a contiguous range that is not common (i.e. that is bounded by a sentinel) and not even sized (doesn't have O(1) size().)

Peter Dimov

10:31 a.m.

Richard Hodges wrote:

...

On Thu, 25 Sept 2025 at 17:15, Peter Dimov via Boost <boost@lists.boost.org> wrote:

...
You need strlen for more than size(). It's also needed in end(), operator[], at(), remove_prefix, substr (both overloads), the find functions taking a start offset, back(), actually everything taking a start offset, ends_with, and probably others as well.

If the purpose of the class is to provide the functionality of a C string, then the only required methods are those available on a C string:

- data
- array indexing
- length

There is no need for any string manipulation. In C++, the correct class for string manipulation operations is std::string.

There is a reasonable argument that you may want begin() and end() iterators, maybe. But it's reasonable to write end() in terms of strlen(), because it's unlikely that a sane program will need to call end() more than once, and also unlikely that it will need to call both end() and size(), when dealing with a c string.

Computing the length of a string with strlen() is very high performance for anything other than an extremely long string.

I don't think this thing needs to be over-engineered. I am of the view that providing a full std::string interface is an utter waste of time and effort.

All the cstring_view/zstring_view implementations, and all the standardization proposals thereof, are converging on it being exactly like string_view, with a null terminator invariant. That's kind of the point of the class: to allow you to use the string_view interface instead of <cstring> functions (as is the case when char const* is used).

The idea that we're somehow delivering enormous value by making it clear that the parameter is not a pointer to a single `char` is misguided; pointers to a single `char` are so rare that a `char const*` is essentially an idiomatic way to denote a null-terminated char sequence. Its downside is not lack of type safety.

Julien Blanc

12:06 p.m.

On 2025-09-26 12:31, Peter Dimov via Boost wrote:

...

The idea that we're somehow delivering enormous value by making it clear that the parameter is not a pointer to a single `char` is misguided; pointers to a single `char` are so rare that a `char const*` is essentially an idiomatic way to denote a null terminated char sequence. Its downside is not lack of type safety.

Alas `char const*` is also an idiomatic way to point at a bunch of raw bytes, whose size is given elsewhere, and clearly are not null terminated. So the idea that a char const* is not necessarily a c string is IMHO not so much misguided. Regards, Julien

Peter Dimov

12:24 p.m.

Julien Blanc wrote:

...

On 2025-09-26 12:31, Peter Dimov via Boost wrote:

...
The idea that we're somehow delivering enormous value by making it clear that the parameter is not a pointer to a single `char` is misguided; pointers to a single `char` are so rare that a `char const*` is essentially an idiomatic way to denote a null terminated char sequence. Its downside is not lack of type safety.

Alas `char const*` is also an idiomatic way to point at a bunch of raw bytes, whose size is given elsewhere, and clearly are not null terminated.

Common (unfortunately), maybe. Idiomatic... maybe not so much. :-)

Andrey Semashev

1:25 p.m.

On 26 Sep 2025 15:24, Peter Dimov via Boost wrote:

...

Julien Blanc wrote:

...
On 2025-09-26 12:31, Peter Dimov via Boost wrote:

...
The idea that we're somehow delivering enormous value by making it clear that the parameter is not a pointer to a single `char` is misguided; pointers to a single `char` are so rare that a `char const*` is essentially an idiomatic way to denote a null terminated char sequence. Its downside is not lack of type safety.

Alas `char const*` is also an idiomatic way to point at a bunch of raw bytes, whose size is given elsewhere, and clearly are not null terminated.

Common (unfortunately), maybe. Idiomatic... maybe not so much. :-)

I'd say using uint8_t or unsigned char for raw bytes is more common and idiomatic. Well, there's also std::byte but that came way too late to the party, so is basically useless.

Christian Mazakas

5:15 p.m.

It's kinda funny, we could probably just literally copy-paste what Chromium is doing and we'd see some massive wins.

I actually think another Boost library _would_ be useful because it'd enable us to start ushering in other Chromium patterns, like their span utilities from the Unsafe Buffers guidelines: https://chromium.googlesource.com/chromium/src/+/main/docs/unsafe_buffers.md

Functions like `byte_span_from_ref` are absolutely fantastic and amazing to use, and honestly they belong in Boost too. Boost.Core is probably the wrong place because we should want to add a whole bunch of stuff like this and it'd be good to have it all in one nice and easy place.

Now, all that being said... we should probably compare the implementations from the std proposals and what Chromium is doing, and we should also take into account what Rust's CStr is doing. These impls have their own divergences that we should compare and contrast.

- Christian

Joaquin M López Muñoz

5:21 p.m.

On 26/09/2025 at 19:15, Christian Mazakas via Boost wrote:

...

[...]

Now, all that being said... we should probably compare the implementations from the std proposals, what chromium is doing and we should also take into account what Rust's CStr is doing as well. These impls have their own divergences we should compare and contrast.

I think this is a very sensible approach. All these people have already done some thinking (and, in the case of Chromium and Rust, trying) that we can take advantage of. Joaquín M López Muñoz

Andrzej Krzemienski

30 Sep

3:40 p.m.

On Fri, 26 Sep 2025 at 19:22, Joaquin M López Muñoz via Boost <boost@lists.boost.org> wrote:

...

On 26/09/2025 at 19:15, Christian Mazakas via Boost wrote:

...
[...]

Now, all that being said... we should probably compare the implementations from the std proposals, what chromium is doing and we should also take into account what Rust's CStr is doing as well. These impls have their own divergences we should compare and contrast.

I think this is a very sensible approach. All these people have already done some thinking (and, in the case of Chromium and Rust, trying) that we can take advantage of.

I would also use Boost.SQLite as a litmus test. If the additional size member is not acceptable, and the library has to roll its own type, this would mean that you cannot have one null-terminated string view that is universally useful. Regards, &rzej;

Christian Mazakas

7:49 p.m.

On Tue, Sep 30, 2025 at 8:41 AM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...

I would also use Boost.SQLite as a litmus test.

Personally, I wouldn't. - Christian

Virgilio Fornazin

8:50 p.m.

As far as I know, the SQLite API receives (str / length) pairs... so why is a C-style string_view required?

On Tue, Sep 30, 2025 at 4:51 PM Christian Mazakas via Boost <boost@lists.boost.org> wrote:

...

On Tue, Sep 30, 2025 at 8:41 AM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...
I would also use Boost.SQLite as a litmus test.

Personally, I wouldn't.

- Christian

Andrzej Krzemienski

9:01 p.m.

On Tue, 30 Sep 2025 at 22:51, Virgilio Fornazin via Boost <boost@lists.boost.org> wrote:

...

as far as I know sqlite api receives (str / length) pairs...

It doesn't. For instance, have a look at sqlite3_open: https://www.sqlite.org/c3ref/open.html It takes `const char*`. Regards, &rzej;

...

why a c-style string_view is required?


Andrzej Krzemienski

9:02 p.m.

On Tue, 30 Sep 2025 at 21:51, Christian Mazakas via Boost <boost@lists.boost.org> wrote:

...

On Tue, Sep 30, 2025 at 8:41 AM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...
I would also use Boost.SQLite as a litmus test.

Personally, I wouldn't.

Why not? - &rzej;

...

- Christian

Vinnie Falco

9:10 p.m.

On Tue, Sep 30, 2025 at 2:07 PM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...

...
...
I would also use Boost.SQLite as a litmus test.

Personally, I wouldn't.

Why not?

I would not either. A view to a null-terminated string as a vocabulary type has enough value that I would want to design it in a way that works for all libraries, not just sqlite3. Thanks

Seth

25 Sep

2:58 p.m.

On Thu, Sep 25, 2025, at 8:33 AM, Mohammad Nejati via Boost wrote:

...

Wouldn't adding an array constructor resolve the problem? template <size_t N> cstring_view(const char (&str)[N]);

Yes, but `cstring_view("Hello\0World!")` would not do the right thing¹. https://godbolt.org/z/GMjzb4eEe ¹ (without metaprogramming hacks which might not always be available. Also do we really want this to be a header that shows up in compilation profiles :))
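One reading of the problem with the array constructor can be sketched as follows. The `cstring_view` class and the `length_seen_by_c` helper here are hypothetical, written only for this illustration, and are not any library's actual type. When the length is taken from the array bound, a literal with an embedded `\0` yields a view whose `size()` disagrees with what a C function sees through `c_str()`.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical minimal view, for illustration only.
class cstring_view {
    const char* ptr_;
    std::size_t len_;
public:
    // Array constructor as suggested in the thread:
    // the length is taken from the array bound, minus the terminator.
    template <std::size_t N>
    constexpr cstring_view(const char (&str)[N]) noexcept
        : ptr_(str), len_(N - 1) {}

    constexpr const char* c_str() const noexcept { return ptr_; }
    constexpr std::size_t size() const noexcept { return len_; }
};

// What a C API would observe when handed c_str(): it stops at the first '\0'.
inline std::size_t length_seen_by_c(cstring_view s) {
    return std::strlen(s.c_str());
}
```

With `cstring_view v("Hello\0World!")`, `v.size()` is 12 while `length_seen_by_c(v)` is 5, so the view and the C callee disagree about where the string ends.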

Andrey Semashev

24 Sep

8:56 p.m.

On 24 Sep 2025 22:20, Andrzej Krzemienski via Boost wrote:

...

Hi Everyone, Would Boost libraries benefit from having a string_view-like type with the guarantee that the underlying character sequence is null-terminated?

I think, this would be a very useful addition. Not only for Boost libraries, but also for Boost users.

...

This could be as yet another Boost library, or as part of Boost.Core. Maybe Boost already has such a type somewhere?

A separate library seems more appropriate. With a review.

Richard Hodges

10:10 p.m.

On Thu, 25 Sept 2025 at 04:57, Andrey Semashev via Boost < boost@lists.boost.org> wrote:

...

A separate library seems more appropriate. With a review.

Shouldn't this go into boost.core?

Andrey Semashev

10:31 p.m.

On 25 Sep 2025 01:10, Richard Hodges via Boost wrote:

...

On Thu, 25 Sept 2025 at 04:57, Andrey Semashev via Boost < boost@lists.boost.org> wrote:

...
A separate library seems more appropriate. With a review.

Shouldn't this go into boost.core?

Boost.Core has strict requirements on the allowed dependencies, which might not be fitting for the new library (now or in the future). And we probably shouldn't inflate Boost.Core too much. I think, this component deserves a separate library.

Ion Gaztañaga

25 Sep

3:38 p.m.

On 25/09/2025 at 0:31, Andrey Semashev via Boost wrote:

...

On 25 Sep 2025 01:10, Richard Hodges via Boost wrote:

...
On Thu, 25 Sept 2025 at 04:57, Andrey Semashev via Boost < boost@lists.boost.org> wrote:

...
A separate library seems more appropriate. With a review.

Shouldn't this go into boost.core?

Boost.Core has strict requirements on the allowed dependencies, which might not be fitting for the new library (now or in the future). And we probably shouldn't inflate Boost.Core too much. I think, this component deserves a separate library.

I politely disagree. Having a library just for one utility is an unnecessary overhead and dependency, I would not like to repeat Boost.Assert and Boost.StaticAssert cases.... Best, Ion

Andrey Semashev

3:56 p.m.

On 25 Sep 2025 18:38, Ion Gaztañaga wrote:

...

On 25/09/2025 at 0:31, Andrey Semashev via Boost wrote:

...
On 25 Sep 2025 01:10, Richard Hodges via Boost wrote:

...
On Thu, 25 Sept 2025 at 04:57, Andrey Semashev via Boost < boost@lists.boost.org> wrote:

...
A separate library seems more appropriate. With a review.

Shouldn't this go into boost.core?

Boost.Core has strict requirements on the allowed dependencies, which might not be fitting for the new library (now or in the future). And we probably shouldn't inflate Boost.Core too much. I think, this component deserves a separate library.

I politely disagree. Having a library just for one utility is an unnecessary overhead and dependency, I would not like to repeat Boost.Assert and Boost.StaticAssert cases....

The proposed component will be more heavyweight than Boost.Assert and Boost.StaticAssert. I expect it to be in the same order as Boost.Any, Boost.Optional, Boost.Variant, Boost.Variant2. And judging by the ongoing discussion, the proposed library may end up providing more than one vocabulary type.

Claudio DeSouza

26 Sep

1:21 a.m.

Hey,

A type like this is extremely important, especially because `string_view` easily becomes a footgun when one doesn't realise that `data()` is not guaranteed to be null-terminated, or that `size()` does not necessarily mark the position of the actual null terminator. Additionally, a cstring view type helps avoid unnecessary conversions to string made only to guarantee that a string is null-terminated. There's also a lot to be said about raw `const char*` being a source of problems, so you want to treat null-terminated strings as a contiguous range, similar to span, with no surprises about the length, but at the same time with all the goodies that come with a string-specific type.

Chromium actually offers a type like this, and there are even extensions to Chromium's span type to handle conversion to byte span that should or should not include the null-terminator. For reference, link below:

https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstrin...

Claudio.

On Wed, Sep 24, 2025 at 8:21 PM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:

...

Hi Everyone, Would Boost libraries benefit from having a string_view-like type with the guarantee that the underlying character sequence is null-terminated?

This could be as yet another Boost library, or as part of Boost.Core. Maybe Boost already has such a type somewhere?

The need for such a type surfaces time and again. When a library needs to be passed both const char * and std::string cheaply, only to pass it down to low-level C-libraries later. We have seen one recently during the review of Boost.Sqlite. One is again proposed for standardization: wg21.link/p3655.

Regards, &rzej;
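The `string_view` footgun described in the message above can be illustrated with a small sketch. `c_api_length` is a hypothetical stand-in for any C function that expects a null-terminated string; the other two helpers are likewise made up for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C function that expects a null-terminated string.
inline std::size_t c_api_length(const char* s) { return std::strlen(s); }

// Footgun: data() points into the original buffer, so the C side keeps
// reading past the end of the view until it finds some terminator.
inline std::size_t pass_data_directly(std::string_view sv) {
    return c_api_length(sv.data());
}

// Common workaround: copy into a std::string only to obtain a terminator.
inline std::size_t pass_via_copy(std::string_view sv) {
    return c_api_length(std::string(sv).c_str());
}
```

For a view of the first three characters of the literal `"key=value"`, `pass_data_directly` reports 9 while `pass_via_copy` reports 3. In this case the over-read stays inside a terminated literal; with a buffer that has no terminator at all it would be undefined behaviour.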

Vinnie Falco

2:01 a.m.

On Thu, Sep 25, 2025 at 6:23 PM Claudio DeSouza via Boost < boost@lists.boost.org> wrote:

...

https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstrin...

Thank you for finding an example of established practice in the wild. I note that this class has `char const*` and `std::size_t` (for the length) here: https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstrin... Regards
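A pointer-plus-length layout of the kind noted above might look roughly like this. It is a hypothetical sketch, not Chromium's class or the proposed standard type, and all member names are assumptions. It shows how both `const char*` and `std::string` can be accepted without copying while `size()` stays constant-time.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch of a pointer-plus-length layout; names are illustrative.
class cstring_view {
    const char* ptr_ = "";
    std::size_t len_ = 0;
public:
    cstring_view() = default;

    // From a C string: pay for strlen once, at construction.
    cstring_view(const char* s) : ptr_(s), len_(std::strlen(s)) {}

    // From std::string: c_str() is terminated and size() is already known.
    cstring_view(const std::string& s) noexcept
        : ptr_(s.c_str()), len_(s.size()) {}

    const char* c_str() const noexcept { return ptr_; }
    std::size_t size() const noexcept { return len_; }
    bool empty() const noexcept { return len_ == 0; }

    // The invariant the type is meant to guarantee:
    // the character one past the last one is the terminator.
    bool terminated() const noexcept { return ptr_[len_] == '\0'; }
};
```

Storing the length answers the earlier question in the thread about whether such a view can return its size; the trade-off is one extra word per view compared with a pointer-only design.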

Klemens Morgenstern

3:42 a.m.

On Fri, Sep 26, 2025 at 9:23 AM Claudio DeSouza via Boost < boost@lists.boost.org> wrote:

...

Hey,

A type like this is extremely important, especially because `string_view` easily becomes a footgun when one doesn't realise that `data()` is not guaranteed to be null-terminated, or that `size()` does not necessarily mark the position of the actual null terminator. Additionally, a cstring view type helps avoid unnecessary conversions to string made only to guarantee that a string is null-terminated. There's also a lot to be said about raw `const char*` being a source of problems, so you want to treat null-terminated strings as a contiguous range, similar to span, with no surprises about the length, but at the same time with all the goodies that come with a string-specific type.

I did check the generated code and if one just wraps a C-API like so:

    void posix_foo(const char * p);
    void foo(cstring_view cs) {posix_foo(cs.c_str());}

And then calls it with a static string

    foo("bar");

The optimizer removes the strlen calculation if foo can be inlined. I have only very few places in my code where a cstring_ref gets accepted by a compiled function and those few could easily be modified. That is to say: the strlen overhead doesn't exist in the generated code when inlined. So calculating the size has no downside for my use-case.
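A self-contained version of that wrapper could be sketched like this. The `cstring_view` below is a hypothetical type that computes the length eagerly, and `posix_foo` is given a stand-in body that only records its argument, since the real function is not specified. Whether the unused `strlen` is actually elided depends on the compiler and optimization level; the sketch only shows the shape of code the observation applies to.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical view that computes the length eagerly at construction.
class cstring_view {
    const char* ptr_;
    std::size_t len_;
public:
    cstring_view(const char* s) : ptr_(s), len_(std::strlen(s)) {}
    const char* c_str() const noexcept { return ptr_; }
    std::size_t size() const noexcept { return len_; }
};

// Stand-in for the C API named in the post; it only remembers what it received.
inline const char*& last_arg() {
    static const char* p = nullptr;
    return p;
}
inline void posix_foo(const char* p) { last_arg() = p; }

// The wrapper from the post: it forwards the pointer and never reads the length,
// so an inlining optimizer is free to drop the strlen result as unused.
inline void foo(cstring_view cs) { posix_foo(cs.c_str()); }
```

Calling `foo("bar")` hands the literal's pointer straight to `posix_foo`; the stored length is never consulted on this path.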

...

Chromium actually offers a type like this, and there are even extensions to Chromium's span type to handle conversion to byte span that should or should not include the null-terminator. For reference, link below:

https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstrin...

I would like an extended signature of this https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstrin... from `same_as<basic_string<char>>` to `string_like` :

    template<typename T>
    concept string_like = requires (const T & t)
    {
        {t.size()} -> std::convertible_to<std::size_t>;
        {t.data()} -> std::same_as<const typename T::value_type *>;
        {t.c_str()} -> std::same_as<const typename T::value_type *>;
    };

Which would mean that `foo(cstring_view)` can be called with a `boost::urls::url` for example. In my mind this would reflect the behaviour of std::string_view that accepts anything with a data() && size() function.
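The effect of such a constraint can be sketched without concepts support by spelling the same three requirements with the C++17 detection idiom. Everything below is hypothetical: `is_string_like`, the minimal `cstring_view`, the `url_like` type standing in for something like `boost::urls::url`, and the `take` function.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// C++17 detection-idiom spelling of the string_like requirements from the post.
template <class T, class = void>
struct is_string_like : std::false_type {};

template <class T>
struct is_string_like<T, std::void_t<
    typename T::value_type,
    decltype(std::declval<const T&>().size()),
    decltype(std::declval<const T&>().data()),
    decltype(std::declval<const T&>().c_str())>>
    : std::bool_constant<
          std::is_convertible_v<decltype(std::declval<const T&>().size()), std::size_t> &&
          std::is_same_v<decltype(std::declval<const T&>().data()), const typename T::value_type*> &&
          std::is_same_v<decltype(std::declval<const T&>().c_str()), const typename T::value_type*>> {};

// Hypothetical view whose converting constructor accepts any string-like type of char.
class cstring_view {
    const char* ptr_;
    std::size_t len_;
public:
    template <class S,
              std::enable_if_t<is_string_like<S>::value &&
                               std::is_same_v<typename S::value_type, char>, int> = 0>
    cstring_view(const S& s) noexcept : ptr_(s.c_str()), len_(s.size()) {}

    const char* c_str() const noexcept { return ptr_; }
    std::size_t size() const noexcept { return len_; }
};

// Hypothetical user type exposing size(), data() and c_str().
struct url_like {
    using value_type = char;
    std::string buf;
    std::size_t size() const noexcept { return buf.size(); }
    const char* data() const noexcept { return buf.data(); }
    const char* c_str() const noexcept { return buf.c_str(); }
};

static_assert(is_string_like<std::string>::value, "std::string qualifies");
static_assert(is_string_like<url_like>::value, "the user type qualifies");
static_assert(!is_string_like<std::string_view>::value, "no c_str(), so rejected");

// A function taking the view can now be called with either type.
inline std::size_t take(cstring_view v) { return v.size(); }
```

Requiring `c_str()` in addition to `data()` and `size()` is what keeps `std::string_view` itself out, since it offers no termination guarantee.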

Peter Dimov

10:23 a.m.

Klemens Morgenstern wrote:

...

https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstrin...

...
I would like an extended signature of this https://source.chromium.org/chromium/chromium/src/+/main:base/strings/cstring_view.h;l=121

from `same_as<basic_string<char>>` to `string_like` :

    template<typename T>
    concept string_like = requires (const T & t)
    {
        {t.size()} -> std::convertible_to<std::size_t>;
        {t.data()} -> std::same_as<const typename T::value_type *>;
        {t.c_str()} -> std::same_as<const typename T::value_type *>;
    };

Which would mean that `foo(cstring_view)` can be called with a `boost::urls::url` for example. In my mind this would reflect the behaviour of std::string_view that accepts anything with a data() && size() function.

It doesn't quite. In C++23, it accepts any contiguous sized range (meaning that it also needs begin and end returning a contiguous iterator, not just data and size). This was a controversial (and wrong, in my opinion) addition that was pushed through and later had to be corrected because it caused obvious issues.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2516r0.html

Claudio DeSouza

7 Oct

4:06 a.m.

I just noticed that the Beman Project has an implementation of a cstring_view based on this proposal: https://wg21.link/P3655R2

https://github.com/bemanproject/cstring_view

Claudio.

Peter Dimov

11:06 a.m.

Claudio DeSouza wrote:

...

I just noticed that the Beman Project has an implementation of a cstring_view based on this proposal: https://wg21.link/P3655R2

https://github.com/bemanproject/cstring_view

It's P3655R3 now. They renamed it back to cstring_view.

40 comments

16 participants

participants (16)

Andrey Semashev
Andrzej Krzemienski
Christian Mazakas
Claudio DeSouza
Dominique Devienne
Emil Dotchevski
Ion Gaztañaga
Joaquin M López Muñoz
Julien Blanc
Klemens Morgenstern
Mohammad Nejati
Peter Dimov
Richard Hodges
Seth
Vinnie Falco
Virgilio Fornazin
