Boost logo

Boost :

Subject: Re: [boost] [locale] review part 2.2: source
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-04-15 10:04:31


>

> shared/date_time.cpp:
>
> line 79: If it's an invariant that tz_ is
> the same as the timezone of impl, then
> this is not exception safe if clone throws.
> You'll have to use copy and swap idiom.
>

Noted

> line 161: This is a bit odd, don't you think? Normally
> if assignment is strongly exception safe without
> doing any extra work, there's no reason to go out
> of your way to clear the old state on failure.
>

Yes, you are right.

> line 350: should you check that v is in range?
> Regardless, the valid range should be documented.
>

The valid range is actually depends on specific calendar
and its implementation. Sometimes it depends on actual
calendar or for example in Hebrew calendar it
can't be year less then ~5800 now as it is
when the world was born (according to the calendar)

So if the range is not valid the underlying implementation
should throw.

> shared/format.cpp:
>
> line 53: it seems that this doesn't restore the
> float field, base field etc.

Ok... I'll check this.

>
> shared/formatting.cpp:
>
> line 42: normally copy and swap is the preferred
> way to implement assignment. The main
> advantage is that it guarantees that
> if the operation fails, the state of
> the lhs is unchanged.
>

Noted... This is something across all the code.

> shared/generator.hpp:
>
> line 41: From the way domains is used, would a std::set
> be more appropriate?
>

There should be at lease one that is default domain.
It is the first one so in any case I would need to mark
something as default.

>
> shared/ids.cpp:
>
> missing copyright/license.
> (Issues like this will be caught by
> the inspect tool, so not a big deal now.)
>

Yes you are right.

> shared/ios_prop.hpp:
>
> line 30 and others,: casting to and from void* should use
> static_cast.

Noted.

>
> line 56: can you use invalid instead of hard-coding
> this constant?

Opps, you are right.

>
> line 96: I don't think this is guaranteed to
> be distinct from any valid pointer. Can
> you make invalid the address of a unique
> global object instead?
>

This method is used quite widely, for example mmap
returns (void *)(-1) in case of failure, and it can't be anything
invalid.

On the other hand there is enough problems with unique
global objects there.

> shared/localization_backend.cpp:
>
> lines 200, 206: This is not guaranteed to be thread-safe
> in C++03. You must use call_once for thread-safe
> static inititalization. Note however that this is
> probably safe if you assume that no additional
> threads are started before main.
>

This mutexes are explicitly globally initialized
later.

So all such things should be solved on library start.

So I think that is would be good enough and call_once
would just inject additional dependency on Boost.Thread
(currently Boost.Thread used in headers only variant
when ICU is not used)

> shared/message.cpp:
>
> line 123: I take it that the format guarantees that
> hash_size_ > 2? Please validate this requirement
> when loading.

Good point
>
> line 132: Use null_pair?
>

Noted

> line 152: You require that off is in range, that
> off <= off + len, and that off + len is in range.
> Please check this.
>

Good point.

> line 204: You really should check that you actually read
> 4 bytes. Otherwise if the file is too short, you get
> undefined behavior from reading uninitialized memory.
>
Actually variable magic is initialized (0) and if it was
not read correctly its value wouldn't be valid.

That is why I don't check the returned value.

> line 311: How do you know that there's a null-terminator?
>

Promised by the file format.

> line 326: suppose that form is 0 and the null-terminator
> is missing...

The different plural forms are separated by 0.

>
> line 407: This isn't guaranteed to work. The
> standard doesn't guarantee that the letters
> are contiguous. (Does anything use EBCDIC
> these days?)

I'm not sure that something better and simple
enough can be done.

I assume we are ASCII at least.

>
> lines 440, 442: You're assuming that the string is
> null terminated.
> What if the file is incorrectly
> formatted? It may be possible to read past
> the end of the allocated block. I haven't
> actually worked through the exact code path
> to trigger this condition, but it seems suspicious.
>

It is the requirement of the format, it is designed to
be loaded as is to memory and be useful.

So I don't check every possible string as it allows
to load files much faster.

In any case when I load file I put 1 extra 0 at the end
so at some point it will be terminated

> line 475: You're assuming that the separator
> actually appears.

std::string::npos is maximal value of size_t
so substr just get bigger sub string. May be
not 100% clear but this code does what it should.

>
> line 500: I think this can cause alignment problems.
> Does the file format make any alignment guarantees?
> Even if it does, you'll need to handle invalid
> files gracefully. Of course, it'll only be
> inefficient on platforms that allow unaligned
> loads, but still, the standard doesn't guarantee
> that it'll work.

Actually there two types of catalogs:

1. Directly loaded - when the locale's 8 bit charset and
   the catalog's one is the same like UTF-8.
2. In memory when the base character is not char (wchar_t or charXX_t)
   or the catalog had to be converted to other character set.

So yes, these pointers are going to be aligned.

>
> shared/mo_hash.hpp:
>
> shared/mo_lambda.cpp:
>
> line 239: There's no 'c' in tokenizer.
>

Ooops. I'll fix this.

> Everything except compile can be in the unnamed namespace.
>

Yes you are right.

> shared/mo_lambda.hpp:
>
> std/all_generater.hpp:
>
> std/codecvt.cpp:
>
> std/collate.cpp:
>
> std/converter.cpp:
>
> std/numeric.cpp:
>
> std/std_backend.cpp:
>
> std/std_backend.hpp:
>
> util/codecvt_converter.cpp:
>
> util/default_locale.cpp:
>
> util/iconv.hpp:
>
> Is this header intended to be used in multiple
> translation units? If so you probably should
> make the functions inline and not use an
> unnamed namespace.
Yes you are right. I don't know why had I done it
this way.

>
> util/info.hpp:
>
> util/locale_data.hpp:
>
> util/locale_data.cpp:
>
> util/numeric.cpp:
>
> util/timezone.hpp:
>
> win32/all_generator.hpp:
>
> win32/api.hpp:
>
> line 49: Don't most Windows functions use DWORD,
> rather than int?
>

Yes. You are right.

> win32/collate.hpp:
>
> line 48: I think I just saw code looking just
> like this in another file.
>

There is no win32/collate.hpp?

> win32/converter.cpp:
>
> win32/lcid.cpp:
>
> line 51: Check whether this succeeds and uses
> the entire string?

Ok

>
> line 89: I thought double checked locking was
> considered unsafe?

Not really

I had a discussion with a professor an expert in
this area, no problems when mutexes are full
memory barriers and they are.

> Anyway you should use
> call_once for initialization like this.
>
> win32/lcid.hpp:
>
> win32/numeric.cpp:
>
> line 32: I think I saw another copy of these functions.
> Get rid of the duplication?

AFAIR, they had different semantics.

>
> win32/win_backend.cpp:
>
> win32/win_backend.hpp:
>
> build/Jamfile.v2:
>
> line 30: I don't think you need a separate
> object file. It isn't used anywhere else.

It does because it is used in different targets
AFAIR is solved this problem.

>
> line 126: A simpler way to reference another
> library is /boost//thread

Ok

>
> line 211: I would tend to use
> icu-sources = [ glob icu/*.cpp ] ;
> result += <source>$(icu-sources)
> I see you're using something like
> this later on.
>

The problem comes when you want to put something
temporary there out of project.

Never liked glob

> line 278: Boost.Thread ought to set this flag
> automatically. I'm assuming from the fact
> that you have it, that it doesn't.
> Looks
> like a bug in the thread Jamfile. (Similarly,
> you should add <define>BOOST_LOCALE_NO_LIB
> in the usage requirements of boost_locale)

I explain, I use boost.thread's mutexes but I don't
want to link with this library. So in many cases
I use header-only boost.thread so I prefer specify
explicitly that I don't need Boost.Thread

(Boost.Thread is linked only for ICU backend where TLS is needed)

>
> test/Jamfile.v2: You don't actually need
> test-suite. test-suite is exactly the
> same as alias.
>

What do you mean?

> In Christ,
> Steven Watanabe

Thank you very much, it was really very
good code review.

Artyom


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk