Subject: Re: [boost] CGI / FastCGI library update
From: Darren Garvey (darren.garvey_at_[hidden])
Date: 2010-05-20 13:41:54


Hi Artyom,

On 20 May 2010 08:04, Artyom <artyomtnk_at_[hidden]> wrote:

> > >
> > > - Unix domain sockets - MUST
> >
> > As noted in the docs, this is a future goal. What makes
> > them such a necessity?
>
> Because they are much more efficient than TCP/IP ones,
> so if you deploy an application on Unix you will want to use them.
>

I'm going to add this support soon so I can take some performance metrics.
I'd be surprised if the performance difference was huge, as TCP sockets are
pretty efficient these days.
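
For reference, a minimal POSIX sketch of what listening on a Unix domain socket involves, as opposed to a TCP port. This is not the library's API: the helper names make_unix_address and listen_unix are hypothetical, and error handling is reduced to return codes.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Hypothetical helper: fill a sockaddr_un for a filesystem path.
// Returns false if the path is empty or does not fit in sun_path.
inline bool make_unix_address(const std::string& path, sockaddr_un& addr)
{
    std::memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    if (path.empty() || path.size() >= sizeof(addr.sun_path))
        return false;
    std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);
    return true;
}

// Hypothetical helper: create a listening Unix domain socket.
// Returns the descriptor, or -1 on failure.
inline int listen_unix(const std::string& path, int backlog = 16)
{
    sockaddr_un addr;
    if (!make_unix_address(path, addr))
        return -1;
    int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    ::unlink(path.c_str()); // remove a stale socket file, if any
    if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        ::listen(fd, backlog) < 0)
    {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

The address is a filesystem path rather than a host and port, which is why two applications cannot collide on a port number by accident.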

> They also have several other advantages, like not colliding with other
> applications when accidentally listening on the same port.
>

IIUC mod_fcgid will only try to use a free port and carry on regardless if
it finds one in its working range which is in use. The library will simply
use the port (or ports) assigned to it unless you explicitly bind to a
particular port.

I've considered adding support to the library for accepting on a range of
ports but it slipped my mind! Boost.Range might be fit for this purpose, so
I'll get back on it.
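
The fallback logic for accepting on a range of ports can be sketched independently of any socket API. The function name first_usable_port and the try_bind callback are hypothetical; in practice the callback would attempt a bind() (or open an acceptor) on the given port and report whether it succeeded.

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch: walk the inclusive range [first, last] and return the
// first port for which try_bind succeeds, or -1 if every port is taken.
inline int first_usable_port(unsigned short first, unsigned short last,
                             const std::function<bool(unsigned short)>& try_bind)
{
    // A wider loop variable avoids wrap-around when last == 65535.
    for (unsigned int port = first; port <= last; ++port)
    {
        if (try_bind(static_cast<unsigned short>(port)))
            return static_cast<int>(port);
    }
    return -1;
}
```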

> > As I see it, recompiling just to use Unix sockets instead of TCP is
> > unacceptable,
>
> I agree that there is no reason to do this for FastCGI and CGI
> (even the official libfcgi supports it), but once you
> implement SCGI you will want to be able to switch between FCGI/SCGI
> without recompilation.
>

I don't think there's any foolproof way of doing this interrogation
automatically?

If it's a manual, runtime configuration option anyway, then the library user
would be able to use the sort of selecting I mentioned in my previous post.
Again, there is non-trivial overhead in supporting both SCGI and FastCGI
under-the-hood so library users shouldn't have to pay for this unless they
want it.

> > Indeed. I started support for SCGI in the past but have removed those
> > parts for now since they aren't complete. Adding support is just a matter
> > of finding the time as the design of the library should support an
> > implementation.
> >
>
> I fully understand this, I was just suggesting it. Note that the SCGI
> protocol is so simple that it can be implemented in several hours (at
> least that is what it took for CppCMS).
>

Ok.
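
To give a sense of how small the SCGI protocol is: a request is a netstring ("<len>:<data>,") holding NUL-terminated header name/value pairs, followed by the body. The following is a minimal parsing sketch under that reading of the SCGI specification. The function name parse_scgi_headers is hypothetical, and this is neither CppCMS's code nor part of this library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Parse the header block of an SCGI request: a netstring "<len>:<data>,"
// whose data is a sequence of NUL-terminated name/value pairs.
// On return, body_offset is the index of the first body byte.
inline std::map<std::string, std::string>
parse_scgi_headers(const std::string& request, std::size_t& body_offset)
{
    const std::size_t colon = request.find(':');
    if (colon == std::string::npos || colon == 0)
        throw std::runtime_error("missing netstring length");

    std::size_t len = 0;
    for (std::size_t i = 0; i < colon; ++i)
    {
        const char c = request[i];
        if (c < '0' || c > '9')
            throw std::runtime_error("bad netstring length");
        len = len * 10 + static_cast<std::size_t>(c - '0');
        if (len > request.size())
            throw std::runtime_error("netstring length exceeds input");
    }

    const std::size_t begin = colon + 1;
    const std::size_t end = begin + len; // index of the closing ','
    if (end >= request.size() || request[end] != ',')
        throw std::runtime_error("truncated netstring");

    std::map<std::string, std::string> headers;
    std::size_t pos = begin;
    while (pos < end)
    {
        const std::size_t name_end = request.find('\0', pos);
        if (name_end == std::string::npos || name_end >= end)
            throw std::runtime_error("unterminated header name");
        const std::size_t value_end = request.find('\0', name_end + 1);
        if (value_end == std::string::npos || value_end >= end)
            throw std::runtime_error("unterminated header value");
        headers[request.substr(pos, name_end - pos)] =
            request.substr(name_end + 1, value_end - name_end - 1);
        pos = value_end + 1;
    }
    body_offset = end + 1;
    return headers;
}
```

A real implementation would also check that CONTENT_LENGTH comes first and that SCGI is "1", and would read incrementally from a socket rather than from a complete string.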

> Once again, don't live under the illusion that supporting wide
> characters would give you any advantage in Unicode support.
>

Oh I'm not.

> See: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html#myths
>

Looks interesting. You should get this to review.

> > > Session management: I would strongly recommend against
> > > using boost::serialization for this purpose. Performance
> > > is terrible (in my experience).
> >
> > What would you suggest as an alternative?
>
> Use key-value pairs, as they are the most popular data stored in sessions,
> and allow a value to be a serializable object for complex data structures.
>

Sounds like reinventing Boost.Serialization, but with fewer features. I'd
rather support things that people already know & use when there is something
available. FWIW, adding a SessionManager that uses Boost.Interprocess is on
my TODO list.

> > I think supporting lexical conversion is a requirement of any CGI library,
> > so the as<> and pick<> data access functions are provided for this use. In
> > the absence of a better alternative, if these two functions were documented
> > with a caveat about lexical_cast, would that be sufficient?
>
> You can always use std::stringstream (which is what lexical_cast actually
> uses), but you must imbue the stream with a std::locale, like this:
>
> template<typename Number>
> Number to_number(std::string const &s)
> {
>     std::stringstream ss;
>     ss.imbue(std::locale::classic());
>     ss.str(s);
>     Number r;
>     ss >> r;
>     if(!ss) throw ..
>     return r;
> }
>

If library users want this they can use it like so:

cgi::request req;
int first_way = req.get.as<int>("some-get-variable");
int second_way = atoi(req.get["some-get-variable"].c_str());
int third_way = to_number<int>(req.get["some-get-variable"]);
assert(first_way == second_way && second_way == third_way);
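
For completeness, a self-contained version of the quoted to_number helper. The exception type is an assumption (the quoted code elides what is thrown), and plain std::string stands in for the request data types.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Locale-independent conversion, following the quoted to_number idea.
// Throwing std::invalid_argument is an assumption; the original elides it.
template <typename Number>
Number to_number(const std::string& s)
{
    std::istringstream ss(s);
    ss.imbue(std::locale::classic()); // ignore whatever the global locale is
    Number r;
    ss >> r;
    if (!ss)
        throw std::invalid_argument("not a number: " + s);
    return r;
}
```

Like the quoted version, this accepts trailing characters after a valid number ("42abc" converts to 42). Checking that the stream reached end-of-input would make it stricter.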

All of the request data is available in types that are implicitly
convertible to a string and have a c_str() function that returns a const
char_type*.

For example:

cgi::request req;
// You can get full access to the request data in its "native" type.
cgi::form_part& data = req.uploads["some-file"];
// All "native" request data types are implicitly convertible to a string.
// In the case of CGI, the string_type is std::string so this works too:
std::string upload_filename = req.uploads["some-file"];

Does that give you what you need?

> > Indeed, I am testing with lighttpd and nginx at the moment. I have found
> > an experimental plugin for nginx that supports "proper" multiplexing too,
> > which looks promising.
> >
>
>
> A few words about multiplexing:
>
> 1. There is not a single web server that implements multiplexing,
> as it is much simpler and more efficient to just open another socket.
>

Zeus does and so does the nginx implementation I mentioned.

> 2. More than that, the only web server I have ever seen using Keep-Alive
> was Cherokee (and I think IIS over pipes, because pipes are not sockets).
>

What does this mean?

> 3. Even the official FastCGI library does not support multiplexing.
>

I know, but that's no reason not to support it.

> 4. There is always a way to tell the web server whether the application
> supports multiplexing or not (one of the commands of fcgi).
>

Again, this is just a workaround.
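
For context, the mechanism referred to in point 4 is, per the FastCGI specification, the FCGI_GET_VALUES management query. The application answers with name-value pairs, and FCGI_MPXS_CONNS is "0" when it does not multiplex connections and "1" when it does. A minimal sketch of that name-value encoding follows; the function names are hypothetical and not part of this library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append one length in the FastCGI name-value encoding: 1 byte if < 128,
// otherwise 4 bytes big-endian with the top bit of the first byte set.
inline void append_fcgi_length(std::string& out, std::size_t n)
{
    if (n < 128)
    {
        out.push_back(static_cast<char>(n));
    }
    else
    {
        out.push_back(static_cast<char>(((n >> 24) & 0x7f) | 0x80));
        out.push_back(static_cast<char>((n >> 16) & 0xff));
        out.push_back(static_cast<char>((n >> 8) & 0xff));
        out.push_back(static_cast<char>(n & 0xff));
    }
}

// Encode a single name-value pair as carried in FCGI_GET_VALUES_RESULT:
// name length, value length, name bytes, value bytes.
inline std::string encode_fcgi_pair(const std::string& name, const std::string& value)
{
    std::string out;
    append_fcgi_length(out, name.size());
    append_fcgi_length(out, value.size());
    out += name;
    out += value;
    return out;
}
```

An application that does not multiplex would answer with encode_fcgi_pair("FCGI_MPXS_CONNS", "0") as the record content.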

> 5. There is a deep problem with multiplexing, as there is no way
> to tell a FastCGI application that it can't send data for the time being.
>
> For example, you have two clients downloading a big CSV file
> of 1G, and one has a connection two times slower than the other.
>
> So if they share a multiplexed connection, then either both clients
> will receive data at the lowest speed or the web server will have to store
> about 0.5G in its internal buffers.
>
> So multiplexing is generally a bad idea.
>

I've thought long and hard about this and I don't think your assertion is
correct. The library doesn't support restartable form parsing right now
because of the problem you're talking about. With careful design and some
clear caveats for users I believe this can be made a non-issue.
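
The "about 0.5G" figure in the quoted example follows from simple arithmetic, assuming the application writes both responses at the faster client's rate and the web server buffers whatever the slower client has not yet drained. A small sketch of that calculation, with a hypothetical function name:

```cpp
#include <cassert>

// Peak bytes the web server must buffer for the slower client if the
// application writes both responses at the faster client's rate.
// Rates are in any common unit; 0 < slow_rate <= fast_rate is assumed.
inline double peak_buffered_bytes(double file_bytes, double fast_rate, double slow_rate)
{
    // When the fast client finishes, the slow one has received only the
    // fraction slow_rate / fast_rate of the file; the rest sits in buffers.
    return file_bytes * (1.0 - slow_rate / fast_rate);
}
```

With a 1G file and a 2:1 speed ratio this gives 0.5G, matching the quoted estimate.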

> My suggestion - don't waste your time on it. It is a useless feature
> that theoretically could be useful but nobody uses it.
>
> Also, the FastCGI specification allows you as a library developer not to
> support multiplexing (actually I haven't seen any FastCGI client that
> implements multiplexing).
>

Peter Simons' libfastcgi does support multiplexing, but isn't a complete
client library, just a protocol driver. This library "potentially" supports
multiplexing. There are a couple of things that need to change internally to
support it but as soon as I get my hands on a server I think it's doable.

> > CgiCC is not compatible with the BSL so I can't use it. It also does not
> > support lazy loading of requests or multiplexed FastCGI, which have always
> > been goals of this library.
>
> As I mentioned before multiplexed FastCGI exists only on paper.
> Don't waste your time.
>

I assure you it's real, but I'm not going to defend it much until there are
some performance metrics supporting it.

The goal of supporting multiplexing is less about pure speed than about
resources. Having 1 connection per request means you can only support N
simultaneous requests, where N is not really that huge of a number on most
machines.
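
At the protocol level, multiplexing is possible because every FastCGI record starts with an 8-byte header carrying a 16-bit request id, so records belonging to different requests can be interleaved on one connection. A minimal decoding sketch based on the FastCGI specification's header layout; the struct and function names are hypothetical and not this library's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The 8-byte FastCGI record header, decoded into host integers.
struct fcgi_header
{
    std::uint8_t  version;
    std::uint8_t  type;
    std::uint16_t request_id;     // identifies which request a record belongs to
    std::uint16_t content_length;
    std::uint8_t  padding_length;
};

// Decode a header from 8 raw bytes (multi-byte fields are big-endian).
inline fcgi_header decode_fcgi_header(const unsigned char (&raw)[8])
{
    fcgi_header h;
    h.version        = raw[0];
    h.type           = raw[1];
    h.request_id     = static_cast<std::uint16_t>((raw[2] << 8) | raw[3]);
    h.content_length = static_cast<std::uint16_t>((raw[4] << 8) | raw[5]);
    h.padding_length = raw[6];
    return h; // raw[7] is reserved
}
```

A multiplexing implementation would dispatch each record to per-request state keyed on request_id, rather than assuming one request per connection.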

Cheers,
Darren

