Subject: Re: [boost] GIL io_new review
From: Domagoj Saric (domagoj.saric_at_[hidden])
Date: 2010-12-14 11:06:18
"Lubomir Bourdev" <lbourdev_at_[hidden]> wrote in message
Let me for now try to at least clear some confusion/misunderstanding...
> First of all, I don't believe your tests represent real use scenarios. If
> you loop over the same image, the image gets into the OS cache and therefore
> the I/O access bottleneck is no longer there. It is not typically the case
> that the image is in the OS cache before you start reading it. A better test
> case is to have many images and do a single loop over them. So I cannot
> trust the 30% number that you report. My prediction is that if you do it for
> fresh images the differences will be far smaller than 30%.
Yes, the test I used is purely synthetic and hardly represents any real use
case. However, it is good for measuring the overall overhead/'fat' that each
of the libraries induces, which is important to consider for any library (in
the light of the 'global picture' argument I made earlier)...
Regardless of the above, the argument of 'shadowing by expensive
IO/OS/backend calls/operations' is still fallacious, as previously explained...
ps. again, the number was ~37%...the 30% was for the code-size
overhead/difference (for which it is also worth noting that it
included the mutual 'starting' size of the CRT plus the two backend
libs...if those were excluded, the difference would probably be an
order of magnitude larger)...
> The second point is that if the code has never been profiled, there are
> usually lots of easy ways to speed it up without major changes.
Of course, but:
- why wait for a profiler to tell me that a needless string or vector is, in
fact, needless?
- it might turn out that for some 'repairs' major changes (maybe even to
the interface) will be required after all...
Plus, in practice if not in theory, it is better to catch, discuss and
correct these issues at review time (even if some of them could be corrected
later without breaking changes) because, in a review process, we have a
critical mass of people who can create the significant incentive required to
actually make the changes...
Take for example these very objections that we are discussing now...little
heed was given to them months ago when I first brought them up but now it is
obviously a different story ;)
> These are some strong statements here, deserving a separate thread that I
> am sure many Boosters will have a lot to say about. Here are my two cents:
> 1. Using STL, or templates in general does not always lead to template
> bloat. It simply makes it easier to generate bloated code, sometimes
> inadvertently, sometimes on purpose.
Neither did I claim that it _always_ leads to bloat...
To clear any confusion, at least when templates are concerned, in no way do
I think of templates in any 'bad' way...ask my boss...I maximally torture
our compilers and make him unhappy on a daily basis with 'extra extra' long
build times ;)
I was speaking of a 'school of coding' that fosters _injudicious_ use of
certain STL constructs (as if they were 'free'), and placed io_new (and io)
in that context: they sometimes use STL containers (and redundant copying)
where these are not necessary (for non-virtual targets), as well as
templates where these are not necessary (again, for non-virtual targets)...
Considering that this redundant usage is combined (the STL containers and
the copying are used precisely in the mentioned function templates), and
considering that STL containers (just like any unwindable objects) create EH
states (and thus hidden unwind funclets) for each of the instantiated
function templates, one does not need a profiler to see that this will add
some 'cholesterol' to your binary...
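To make the pattern concrete, here is a minimal hypothetical sketch (not code taken from io_new; the function names and the Pixel parameter are made up for illustration) of a templated read path that drags a `std::vector` and its unwind machinery into every instantiation, next to a leaner split where the type-independent I/O is compiled only once:

```cpp
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical illustration of the criticized pattern. Each
// instantiation of read_row_bloated<Pixel> gets its own copy of the
// vector allocation, the redundant memcpy, and the EH state / unwind
// funclet required to destroy the vector on an exception.
template <typename Pixel>
void read_row_bloated(std::FILE* file, Pixel* out, std::size_t n)
{
    std::vector<unsigned char> buffer(n * sizeof(Pixel)); // per-instantiation EH state
    std::fread(buffer.data(), 1, buffer.size(), file);
    std::memcpy(out, buffer.data(), buffer.size());       // redundant copy
}

// Leaner alternative: the type-independent I/O lives in one ordinary
// function; the template shell merely forwards to it, so nothing heavy
// is replicated per Pixel type.
inline void read_bytes(std::FILE* file, unsigned char* out, std::size_t size)
{
    std::fread(out, 1, size, file);
}

template <typename Pixel>
void read_row(std::FILE* file, Pixel* out, std::size_t n)
{
    read_bytes(file, reinterpret_cast<unsigned char*>(out), n * sizeof(Pixel));
}
```

Both versions read the same bytes; the difference only shows up in the generated binary, once several Pixel types are instantiated.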
This is also one example showing that the 'nineties'/'new-throw-virtual'
style of C++ coding (out of which STL came) does not mix so well with more
modern approaches (characterized more by keywords like 'template' and
'meta')...
> 2. Template bloat does not always mean that the code will run slowly. In
> fact, it often means the code will run faster because certain choices are
> done at compile time instead of at run time.
Of course, but then this does not actually constitute bloat (as making the
choice at compile time will actually remove the code path(s) not
taken)...unless, of course, this 'choice' is irrelevant and/or small compared
to the rest of the templated function body (which then gets
template-replicated, causing bloat)...which would be an anti-pattern, or an
example of injudicious use of templates...which I sometimes found to be the
case in io and io_new (not using a non-template function for non-virtual
targets)...
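The anti-pattern being distinguished here can be sketched as follows (hypothetical code; the `Format` parameter and function names are invented for illustration): a large body templated on a parameter that only affects one line, versus hoisting the shared body into a single non-template function:

```cpp
#include <cstddef>

// Hypothetical anti-pattern: the whole body is replicated for every
// Format value, although only one statement depends on Format.
template <int Format>
void decode_bloated(const unsigned char* in, unsigned char* out, std::size_t n)
{
    // ...imagine many lines of format-independent work here...
    for (std::size_t i = 0; i != n; ++i)
        out[i] = static_cast<unsigned char>(in[i] + 1);
    if (Format == 1)            // the only compile-time choice
        out[0] ^= 0x80;
}

// Judicious version: the shared body is compiled exactly once, and the
// template shell keeps only the part that genuinely varies.
inline void decode_common(const unsigned char* in, unsigned char* out, std::size_t n)
{
    for (std::size_t i = 0; i != n; ++i)
        out[i] = static_cast<unsigned char>(in[i] + 1);
}

template <int Format>
void decode(const unsigned char* in, unsigned char* out, std::size_t n)
{
    decode_common(in, out, n);
    if (Format == 1)
        out[0] ^= 0x80;
}
```

In the second form the compile-time branch is still eliminated per instantiation, but only a few bytes of code vary per `Format`, which is the 'judicious' outcome both sides of the discussion seem to agree on.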
> 3. STL is a very well designed library. That said, there are some STL
> implementations that are really very bad, so you have to be careful which
> one you use. In addition, it is very easy to shoot yourself in the foot
> with the STL.
A few things might speak against awarding the 'very well designed' mark too
easily, e.g. the required improvements and refinements that it got (or, in
some cases, still has not got) for later (re)discovered deficiencies,
many/most of which were first provided by Boost (e.g. array, intrusive and
ptr containers, ...)
> And this is what Christian has done. If your input is not a stream, the code
> doesn't use streams; it operates straight to FILE*. But if you start with a
> stream, then it uses a stream. Did I understand the code correctly,
> Christian? If so, does this address your objection then, Domagoj?
The 'forced streams' issue exists with in-memory images, not with other
kinds of input...
-- "What Huxley teaches is that in the age of advanced technology, spiritual devastation is more likely to come from an enemy with a smiling face than from one whose countenance exudes suspicion and hate." Neil Postman
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk