Boost logo

Boost Users :

From: Emil Dotchevski (emildotchevski_at_[hidden])
Date: 2020-06-03 06:22:29


On Tue, Jun 2, 2020 at 7:44 PM Glen Fernandes via Boost-users <
boost-users_at_[hidden]> wrote:
>
> The Boost formal review of Zach Laine's Unicode library, Text,
> commences on June 11th and concludes June 20th. We welcome the
> participation of Boost developers and users, and the C++ community.

Here is my review of Text in the form of a list of notes/questions, in no
particular order.

Reading through the documentation, everything makes sense. The motivation
for each type and operation is easy to understand.

I welcome the use of signed int as index/size type, but I wonder if this
would interact in some negative way when mixed with similar types from the
standard library. I don't think so, but who knows.

I don't like the multi-page format of the documentation. I'd prefer a
single HTML page for easy searching.

What is the purpose of the throw_on_encoding_error type here?
https://github.com/tzlaine/text/blob/b8bb1a1101e7cf53b460671a3ead19516e7636fe/include/boost/text/transcode_iterator.hpp#L35

I searched and couldn't find any place in the documentation that explains
how ErrorHandler is used. Looking at the source code, it's just a function,
but its interface should be documented.

Generally I dislike interfaces with configurable error handling but it
appears that in the case of the conversion iterators this is warranted:
both replacement character and throwing are useful. I wonder though if it
would be better to delete this template parameter and always
output 0xfffd, and in addition provide an iterator adaptor type which can
be used to wrap e.g. an output iterator, and which throws if it sees 0xfffd
being output.

Both .c_str() and .data() should be added to string for compatibility.
Also, it is not true that s.begin() is equivalent to s.c_str(), because in
debug iterators should not be pointers. Perhaps &*s.begin() is equivalent,
but this should not be valid on an empty string (assert), null termination
notwithstanding.

The note about passing ropes by value, I think, is redundant. Logically,
copying a rope gets you an independent copy, the fact that internally the
copies share state is an implementation detail. Therefore, the
documentation can just specify that a rope is as thread-safe as an int.

Speaking of thread-safety, where are the thread-safety guarantees of rope
documented? Where are the relevant tests?

Speaking of tests, I suggest separating the generated test sources in a
subdirectory so things like test/string.cpp are easy to find. Not to
mention github.com truncates the test directory to 1000 files.

The grammar seems wrong here; 'I don't know if you've ever written an undo
system that can do, undo, or redo any command you can think of, in less
than 40 lines of code, but there one is."

"Every text editor you've ever used that handles large files gracefully has
done something to break the file into chunks" What about the buffer gap
algorithm? I think Borland editors used this way back in the day. :)

In conclusion:

I spent a couple of hours studying the documentation and parts of the
source code.

I vote to ACCEPT Text into Boost.



Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net