Boost logo

Boost :

Subject: Re: [boost] [locale] Review
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2011-04-19 15:23:33


On 19/04/2011 20:30, Artyom wrote:

> As you know, the API is a subject to change
> in Boost, and I would say that it is
> changed toooo freely and toooo often.

Right, that's because the API is what makes Boost libraries good.
If it's not good enough, it needs to change. And to avoid APIs from
changing too much in released libraries, the best solution is to make
sure the library has the best API *when it is accepted into Boost*.

And I think there is still some room for improvement with the
Boost.Locale API.

>
> I can accept that some operations may be
> better to work on arbitrary streams but
> most of them just don't need it.
>
> For example collation... When exactly
> do you need to compare arbitrary
> text streams?

Because the data does not exist in memory, may be computed on the fly,
or whatever really.
A possible application is simply to chain operations in a pipeline, i.e.
without having to apply one operation completely, then the other on that
result, etc (and do the intermediate temporary buffer allocations).

> So this is a quick glance to the
> future that currently only on
> planning stage and is not a part
> of the review.
>
> I thought to provide stream API
> for charset conversion but put it
> on hold as it is not really a
> central part, especially when
> codecvt it there.

I believe it *is* the very central part of any text processing system.

> Such things should appear in code review.
>
> I'm really welcome to learn how to do things better.
> The question is whether such changes are critical or not.

All those comments at the end of my original message only affect the
internals, so I don't care so much.

It's mostly a style thing.

> Take a deeper look to the section.
>
> It is different from backend selection.

If I want to add a backend, I only want to add a new repository with the
implementation for that backend. I do not want to have to hack all
shared files by adding some additional ifdefs.

>>>> I'm also a bit concerned about the generalized
>>>> usage of virtual functions anywhere;
>>>> it seems unnecessary in quite a few places,
>>>> please prefer using templates when dynamic binding is not required.
>>>
>>> More specific points?
>>
>> Couldn't boost::locale::util::base_converter be a concept rather than an
>> abstract base class?
>>
>
> Because there is no need to duplicate
> a complex code via template metaprogamming
> if a simple function call can be made.

This sentence doesn't make any sense to me.

Template meta-programming is not a mean to duplicate code.
Nor is normal template usage, which is what I suggested instead of
virtual functions, template meta-programming.

> Don't overrate template metaprogramming
> and don't underestimate virtual functions.

You could remove the reference to the base class by a template parameter
and get rid of the virtualness of the member functions.

The virtualness is already in codecvt, why would you want to put it
twice in a row?

>
>>
>>>> A lot of new and vectors too, I'd prefer if ideally
>>>> the library never allocated anything.
>>>
>>> I'm sorry but it is just something
>>> that can't and would never happen.
>>>
>>> This request has no reasonable base especially
>>> for such complex topic.
>>
>> Usage of templates instead of inclusion polymorphism
>> would allow to avoid newing the object and using a smart
>> pointer, for example.
>>
>
> I'm not sure what exact location bothers use but
> anywhere (unless I miss something)
> there are minimal data copying, and I relate heavily
> on RVO.

I didn't say copying, I said allocation and usage of new.
grep -r -F "new " * should give you the exact locations.

>
> If you see some not-required copying tell me.
>
>> Plus the instances of allocation in the
>> boundary stuff (when you build an index
>> vector and when you copy into a new basic_string)
>> appears to be unnecessary.
>>
>
> More specific location? I don't remember
> such thing, I just need better pointers
> to answer.

I've been very precise. You unnecessarily allocate a new string and copy
the contents in the operator* of token_iterator.
I've also given you a trivial fix for that particular problem, though it
doesn't solve the underlying problem of generating (and allocating) the
index vector first.

I don't see how I can be more precise than this.

>>
>> Passing an arbitrary range to the functions is convenient, forcing
>> the user to copy it to a vector and pass the begin and end pointers
>> before it can call the functions isn't.
>>
>
> Where exactly, because I need better context as above.

The interface of your functions is something like
std::string foo(char* begin, char* end),

Let's say I've got a filter_range<std::vector<char> > as input.

To use your function, I need to do
std::vector<char> v;
std::copy(boost::begin(input), boost::end(input), std::back_inserter(v));
std::string s = foo(v.data(), v.data() + v.size());

A more convenient interface would directly allow
std::string s = foo(input);


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk