From: Edward Diener (eddielee_at_[hidden])
Date: 2002-10-24 10:38:24


"Andrei Alexandrescu" <andrewalex_at_[hidden]> wrote in message
news:ap82ge$57d$1_at_main.gmane.org...
> "Edward Diener" <eddielee_at_[hidden]> wrote in message
> news:ap7uqk$ndn$1_at_main.gmane.org...
> > I would suggest that you ask Dr. John Maddock who wrote it further about
> > its design.
>
> Well it's implied that questions on this forum are directed to everybody.

Of course. It's just that the designer of the library knows it best even if
others have looked at it also.

>
> > I am not the designer although I have used it successfully in my own
> > Regular Expression Component Library (
> > http://www.tropicsoft.com/Components/RegularExpression ). Much of the
> > underlying Regex++ functionality exists in a library because instantiating
> > such code via templates each and every time would really lead to code bloat.
> > At least a library allows one to fold common code together.
>
> I'm not sure I understand. What would "each and every time" mean? Wouldn't
> the linker eliminate duplicate functions, if any?

Yes, it should. I was referring to the fact that a design might include
non-templated common code which all instantiations might need to use. Having
to create templated code to support such common code does not seem to me a
good design strategy just because C++ templates are considered the thing to
do for all occasions. So common code gets put in a library. In the case of a
shared library ( DLL under Windows ) that shared library gets distributed
with the module. It's possible that this is the wrong design for a
particular situation and a fully-templated design is better but that is a
matter to be decided on for each situation.
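
To make the idea concrete, here is a minimal sketch of the split between common code and templates. It is not taken from Regex++; the names `count_matching_elements` and `count_char` are hypothetical, and the byte-wise comparison is only an assumption chosen to keep the example short. The non-templated function stands in for code that would be compiled once into a library, and the template is a thin wrapper that forwards to it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Non-templated common code (hypothetical name). In a real project this
// would live in a .cpp file and be compiled once into a static or shared
// library. It counts how many fixed-size elements in a raw buffer are
// byte-for-byte equal to a target element.
std::size_t count_matching_elements(const void* data, std::size_t count,
                                    std::size_t elem_size, const void* target)
{
    assert(target != nullptr && elem_size > 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::size_t matches = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (std::memcmp(bytes + i * elem_size, target, elem_size) == 0)
            ++matches;
    }
    return matches;
}

// Thin templated wrapper (hypothetical name). Each instantiation adds only
// this forwarding call; the loop above exists exactly once in the program.
template <typename CharT>
std::size_t count_char(const std::basic_string<CharT>& text, CharT ch)
{
    return count_matching_elements(text.data(), text.size(),
                                   sizeof(CharT), &ch);
}
```

Calling `count_char` on both a `std::string` and a `std::wstring` instantiates the wrapper twice, but both instantiations share the single compiled loop. Whether that trade is worth the loss of type-specific optimization is, as said above, something to decide per situation.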

>
> > > cvs.exe: 239 KB
> > > efaxview.exe (fax viewer): 337 KB
> > > kazaa.exe: 2,258 KB
> > > msmsgs.exe (MS Messenger): 1,477 KB
> > > stocks.exe (Medved QuoteTracker): 4,642 KB
> > > wincvs.exe: 733 KB
> > >
> > > I believe I compressed some or all of these programs with upx, but if you
> > > double the numbers, my point stays the same.
> > >
> > > Consider a 3 MB executable, would that be enough? If what my program does is
> > > ***all about*** regular expressions, ok, I grudgingly agree on having a
> > > regex engine add 17% size to it.
> > >
> > > If that program has only incidental use for regular expressions (as might be
> > > the case with either of the above programs), such as validating some user
> > > input or parsing some protocol or parsing some file format, then no way
> > > we're talking about "pretty small". No way.
> >
> > I guess I don't understand the issue regarding size per se. Unless one is
> > programming embedded systems nowadays it practically doesn't seem to matter.
> > Of course I understand designing something well so that extraneous code or
> > code duplication doesn't occur, and I enjoy as much as the next programmer
> > designing something elegant, creative, and beautiful. But just as in a more
> > well known area, size matters much less than people think and creativity is
> > what counts.
>
> Nowadays speed depends on size because of the increasing gap between
> processor and memory and the crucial importance of cache. Second, download
> duration is a factor with many applications.

I don't agree with your first assumption. Speed and size are often
tradeoffs, where one can get something to be faster but larger or slower but
smaller. Cache matters for code that is actually executing, not for code that
sits in a DLL or module on disk. While download duration is a factor in
network applications, most Internet-type applications rely on thick
servers and thin clients. For other networked applications it is often the
data transportation that is the factor and not code transportation. Most
object-based network mechanisms ( DCOM, CORBA ) as well as procedure-based
ones like RPC are pretty good at transporting just the
objects/routines/parameters that are needed rather than entire modules.

>
> I have worked very recently on systems where size was a BIG issue (and still
> is). They are commonly used desktop applications that you might have
> installed (and perhaps even running) on your machine right now. Those
> systems could have used regexps on occasion. Adding a 500 KB package for
> doing that would be risible.
>
> Indeed, size is not of crucial importance, however 500 KB today is big
> enough to be non-negligible. The regular expression engines must lose one
> order of magnitude in size to be compelling. I understand that that's
> entirely doable.

A size of 500 KB is non-negligible but given the multi-megabyte and 700K or
so modules you quote above, I think you are overreacting to size in and of
itself. Look at the size of language vendor shared libraries nowadays. They
are easily 500K on my system ( W2K ) and often into the multi-megabyte area.
Distributions with rich functionality often run to 3, 4, or 5 megabytes if not
more. This to me is common.

I think size as a matter of inelegant and/or irrelevant code and bad design
is important, but not size as a matter in and of itself. So your perception
of Regex++ or Greta may be entirely correct that they are too big but I
think you have to look at the design and determine from that what you feel
is extraneous or code bloat and could be improved. Of course if you have a
better design, I am sure you will go for it.

>
> By the way, I think regexps are an essential addition to the standard. It's
> practically what forces many to work in Perl rather than in C++. The current
> perception is that when it comes to string manipulation and I/O, C++ is
> primitive and arcane.

The current perception is wrong but it has been largely fueled, in my
estimation, by the fact that the most used C++ compiler in the corporate
world, as well as some other highly popular C++ compilers, have problems
with advanced C++ implementations because they don't support the C++
language adequately. When that changes, as it appears it will be doing
relatively soon, then the perception will change because then the number of
people using advanced C++ libraries like your own Loki, Boost, Blitz et al.
will increase greatly and such idioms will become the mainstream.

