|
Boost : |
Subject: Re: [boost] [locale] review part 2.2: source
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-15 10:04:31
>
> shared/date_time.cpp:
>
> line 79: If it's an invariant that tz_ is
> the same as the timezone of impl, then
> this is not exception safe if clone throws.
> You'll have to use copy and swap idiom.
>
Noted
> line 161: This is a bit odd, don't you think? Normally
> if assignment is strongly exception safe without
> doing any extra work, there's no reason to go out
> of your way to clear the old state on failure.
>
Yes, you are right.
> line 350: should you check that v is in range?
> Regardless, the valid range should be documented.
>
The valid range is actually depends on specific calendar
and its implementation. Sometimes it depends on actual
calendar or for example in Hebrew calendar it
can't be year less then ~5800 now as it is
when the world was born (according to the calendar)
So if the range is not valid the underlying implementation
should throw.
> shared/format.cpp:
>
> line 53: it seems that this doesn't restore the
> float field, base field etc.
Ok... I'll check this.
>
> shared/formatting.cpp:
>
> line 42: normally copy and swap is the preferred
> way to implement assignment. The main
> advantage is that it guarantees that
> if the operation fails, the state of
> the lhs is unchanged.
>
Noted... This is something across all the code.
> shared/generator.hpp:
>
> line 41: From the way domains is used, would a std::set
> be more appropriate?
>
There should be at lease one that is default domain.
It is the first one so in any case I would need to mark
something as default.
>
> shared/ids.cpp:
>
> missing copyright/license.
> (Issues like this will be caught by
> the inspect tool, so not a big deal now.)
>
Yes you are right.
> shared/ios_prop.hpp:
>
> line 30 and others,: casting to and from void* should use
> static_cast.
Noted.
>
> line 56: can you use invalid instead of hard-coding
> this constant?
Opps, you are right.
>
> line 96: I don't think this is guaranteed to
> be distinct from any valid pointer. Can
> you make invalid the address of a unique
> global object instead?
>
This method is used quite widely, for example mmap
returns (void *)(-1) in case of failure, and it can't be anything
invalid.
On the other hand there is enough problems with unique
global objects there.
> shared/localization_backend.cpp:
>
> lines 200, 206: This is not guaranteed to be thread-safe
> in C++03. You must use call_once for thread-safe
> static inititalization. Note however that this is
> probably safe if you assume that no additional
> threads are started before main.
>
This mutexes are explicitly globally initialized
later.
So all such things should be solved on library start.
So I think that is would be good enough and call_once
would just inject additional dependency on Boost.Thread
(currently Boost.Thread used in headers only variant
when ICU is not used)
> shared/message.cpp:
>
> line 123: I take it that the format guarantees that
> hash_size_ > 2? Please validate this requirement
> when loading.
Good point
>
> line 132: Use null_pair?
>
Noted
> line 152: You require that off is in range, that
> off <= off + len, and that off + len is in range.
> Please check this.
>
Good point.
> line 204: You really should check that you actually read
> 4 bytes. Otherwise if the file is too short, you get
> undefined behavior from reading uninitialized memory.
>
Actually variable magic is initialized (0) and if it was
not read correctly its value wouldn't be valid.
That is why I don't check the returned value.
> line 311: How do you know that there's a null-terminator?
>
Promised by the file format.
> line 326: suppose that form is 0 and the null-terminator
> is missing...
The different plural forms are separated by 0.
>
> line 407: This isn't guaranteed to work. The
> standard doesn't guarantee that the letters
> are contiguous. (Does anything use EBCDIC
> these days?)
I'm not sure that something better and simple
enough can be done.
I assume we are ASCII at least.
>
> lines 440, 442: You're assuming that the string is
> null terminated.
> What if the file is incorrectly
> formatted? It may be possible to read past
> the end of the allocated block. I haven't
> actually worked through the exact code path
> to trigger this condition, but it seems suspicious.
>
It is the requirement of the format, it is designed to
be loaded as is to memory and be useful.
So I don't check every possible string as it allows
to load files much faster.
In any case when I load file I put 1 extra 0 at the end
so at some point it will be terminated
> line 475: You're assuming that the separator
> actually appears.
std::string::npos is maximal value of size_t
so substr just get bigger sub string. May be
not 100% clear but this code does what it should.
>
> line 500: I think this can cause alignment problems.
> Does the file format make any alignment guarantees?
> Even if it does, you'll need to handle invalid
> files gracefully. Of course, it'll only be
> inefficient on platforms that allow unaligned
> loads, but still, the standard doesn't guarantee
> that it'll work.
Actually there two types of catalogs:
1. Directly loaded - when the locale's 8 bit charset and
the catalog's one is the same like UTF-8.
2. In memory when the base character is not char (wchar_t or charXX_t)
or the catalog had to be converted to other character set.
So yes, these pointers are going to be aligned.
>
> shared/mo_hash.hpp:
>
> shared/mo_lambda.cpp:
>
> line 239: There's no 'c' in tokenizer.
>
Ooops. I'll fix this.
> Everything except compile can be in the unnamed namespace.
>
Yes you are right.
> shared/mo_lambda.hpp:
>
> std/all_generater.hpp:
>
> std/codecvt.cpp:
>
> std/collate.cpp:
>
> std/converter.cpp:
>
> std/numeric.cpp:
>
> std/std_backend.cpp:
>
> std/std_backend.hpp:
>
> util/codecvt_converter.cpp:
>
> util/default_locale.cpp:
>
> util/iconv.hpp:
>
> Is this header intended to be used in multiple
> translation units? If so you probably should
> make the functions inline and not use an
> unnamed namespace.
Yes you are right. I don't know why had I done it
this way.
>
> util/info.hpp:
>
> util/locale_data.hpp:
>
> util/locale_data.cpp:
>
> util/numeric.cpp:
>
> util/timezone.hpp:
>
> win32/all_generator.hpp:
>
> win32/api.hpp:
>
> line 49: Don't most Windows functions use DWORD,
> rather than int?
>
Yes. You are right.
> win32/collate.hpp:
>
> line 48: I think I just saw code looking just
> like this in another file.
>
There is no win32/collate.hpp?
> win32/converter.cpp:
>
> win32/lcid.cpp:
>
> line 51: Check whether this succeeds and uses
> the entire string?
Ok
>
> line 89: I thought double checked locking was
> considered unsafe?
Not really
I had a discussion with a professor an expert in
this area, no problems when mutexes are full
memory barriers and they are.
> Anyway you should use
> call_once for initialization like this.
>
> win32/lcid.hpp:
>
> win32/numeric.cpp:
>
> line 32: I think I saw another copy of these functions.
> Get rid of the duplication?
AFAIR, they had different semantics.
>
> win32/win_backend.cpp:
>
> win32/win_backend.hpp:
>
> build/Jamfile.v2:
>
> line 30: I don't think you need a separate
> object file. It isn't used anywhere else.
It does because it is used in different targets
AFAIR is solved this problem.
>
> line 126: A simpler way to reference another
> library is /boost//thread
Ok
>
> line 211: I would tend to use
> icu-sources = [ glob icu/*.cpp ] ;
> result += <source>$(icu-sources)
> I see you're using something like
> this later on.
>
The problem comes when you want to put something
temporary there out of project.
Never liked glob
> line 278: Boost.Thread ought to set this flag
> automatically. I'm assuming from the fact
> that you have it, that it doesn't.
> Looks
> like a bug in the thread Jamfile. (Similarly,
> you should add <define>BOOST_LOCALE_NO_LIB
> in the usage requirements of boost_locale)
I explain, I use boost.thread's mutexes but I don't
want to link with this library. So in many cases
I use header-only boost.thread so I prefer specify
explicitly that I don't need Boost.Thread
(Boost.Thread is linked only for ICU backend where TLS is needed)
>
> test/Jamfile.v2: You don't actually need
> test-suite. test-suite is exactly the
> same as alias.
>
What do you mean?
> In Christ,
> Steven Watanabe
Thank you very much, it was really very
good code review.
Artyom
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk