Subject: Re: [boost] GSoC Proposal Preparation For Encoding Awared String
From: Soares Chen (crf_at_[hidden])
Date: 2011-03-23 23:48:19


Hi Andrew,

> Welcome!
>
> This is quite possibly the most comprehensive summary of a Boost
> discussion by a prospective GSoC student I've ever seen. Be sure to
> include this in your proposal as part of the background research (or a
> summary thereof).

Thanks for the praise! But I think you can probably take that back now,
as several other excellent proposals have been posted since GSoC's
mentor organization list was announced. :)

> I think that this project may end up being a minefield. Everybody has
> their favorite string characteristics, and whatever string you
> eventually implement will eventually fail somebody's requirements :)

Yes, indeed, this can be quite a dangerous project. Hopefully it will be
better with the new proposal that I've just posted.

> Was there any consensus on why std::string could or could not be
> parameterized with UTF-specific character traits? That seems, on the
> surface, like a possible solution. I was following the discussions but
> not as closely as others :)

I think the problem is that character traits operate on code units,
but a code unit != a code point. Also, because a UTF-8 code point has
variable length (one to four code units), it makes no sense to compare
two individual code units for equality. Character traits also cannot
prevent invalid code points from getting into the string.
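
As a rough illustration of the code unit vs. code point distinction,
here is a minimal sketch. The helper names count_code_points and
looks_like_utf8 are hypothetical, made up for this example; they are
not part of std::string, Boost, or my proposed adapter. The validity
check is structural only and does not reject overlong forms or
surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by counting
// every byte that is NOT a continuation byte (10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical helper: structural check only. Each lead byte must be
// followed by the right number of continuation bytes. It does not
// reject overlong encodings or surrogate code points.
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len = c < 0x80           ? 1
                        : (c & 0xE0) == 0xC0 ? 2
                        : (c & 0xF0) == 0xE0 ? 3
                        : (c & 0xF8) == 0xF0 ? 4
                                             : 0;  // stray continuation or invalid lead byte
        if (len == 0 || i + len > s.size()) return false;
        for (std::size_t k = 1; k < len; ++k) {
            if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return false;
        }
        i += len;
    }
    return true;
}
```

For the UTF-8 encoding of "naïve", size() reports six code units while
the string holds five code points, and a std::string will happily
store a lone 0xC3 byte even though it is not a complete sequence.
Nothing in the character traits gets a chance to object to that.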

> I think that one thing you should consider in your proposal is how you
> actually want to use your library. Consider trying to design your
> interface from a user perspective. I think that there is sometimes a
> tendency to focus on the technical aspects of a library, and it's easy
> to forget the end goal of writing a library: so somebody else can use
> it.

Thanks for pointing that out. Yes, I will include different use cases
and test cases in the project to make sure that the code can fulfill
real-world needs. I think that I might also fork some Boost projects
and make some minor modifications to their APIs to show the benefits of
using my unicode_string_adapter over plain old std::string.

> Please don't let a lack of communication dissuade you from submitting
> a proposal. This list can be high traffic and it's easy to miss good
> posts.

Thanks for the tips. I think I'll try to make my posts shorter next
time so that more people have time to read and reply to them. :)


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk