Subject: Re: [boost] CGI / FastCGI library update
From: Darren Garvey (darren.garvey_at_[hidden])
Date: 2010-05-19 19:28:33


Hi Artyom,

Thanks for taking a look. I'm aware of CppCMS so I appreciate the comments.

On 19 May 2010 10:14, Artyom <artyomtnk_at_[hidden]> wrote:

>
> Project
> -------
>
> Please describe what is your final target.
>

Noted.

> Protocols:
> ----------
>
> - Unix domain sockets - MUST
>

As noted in the docs, this is a future goal. What makes them such a
necessity?

> - Make request abstract class rather then concept. Unless you
> want to recompile your application to work with each type of connection.
> I you only what to change configuration if you work over Unix sockets,
> TCP sockets or you work with scgi instead of fcgi.
>
> Beleive me I've been there I know what I'm talking about.
>

As I see it, recompiling just to use Unix sockets instead of TCP is
unacceptable, but recompiling to use a different protocol is ok.

The way the library is currently designed makes the Protocol a strictly
compile-time choice. A library user can still choose between protocols at
runtime using, for example:

fcgi::service service;
fcgi::acceptor acceptor(service);
if (acceptor.is_cgi()) {
  // handle a cgi::request
} else {
  // handle a fcgi::request (or more than one)
}

A single, templated request handler can work transparently with cgi::request
or fcgi::request so the above code is about all that is needed to support
both.
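
To make the "single, templated request handler" idea concrete, here is a minimal sketch. The two request structs are hypothetical stand-ins, not the library's cgi::request and fcgi::request, and the member names (get, protocol) are assumptions chosen for illustration only.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for cgi::request and fcgi::request. The real
// classes have a much richer interface than this.
struct mock_cgi_request {
    std::map<std::string, std::string> get;  // assumed query-string data
    std::string protocol() const { return "cgi"; }
};

struct mock_fcgi_request {
    std::map<std::string, std::string> get;  // assumed query-string data
    std::string protocol() const { return "fcgi"; }
};

// One handler template serves both request types. The protocol is fixed
// per instantiation at compile time, so no virtual dispatch is involved.
template <typename Request>
std::string handle_request(const Request& req) {
    auto it = req.get.find("name");
    const std::string name = (it != req.get.end()) ? it->second : "world";
    return "Hello, " + name + " (via " + req.protocol() + ")";
}
```

The handler body is written once; the runtime branch shown earlier only decides which instantiation gets called.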

CGI and FastCGI are so different there is non-trivial overhead in supporting
both transparently. I would rather a user explicitly do something like the
above than the library impose this overhead on them... That said, I could be
convinced that there is a valid use-case for needing to support both
transparently.

As an example, FastCGI doesn't require stdio at all, so simple FastCGI
applications tend to be smaller than equivalent CGI ones, since they avoid
the overhead of pulling in <stdio>. Complex FastCGI applications will tend to
be very different from CGI ones because of FastCGI's different abilities,
which may well include expensive state information.

> The way asio works fine for work for general cases of implementing
> specific protocols but very bad for applications that may work
> with different sources of data
>
> So if I would be doing formal review I would say that this is no-go
> solution.
>
> - I'd recommend you also implement SCGI as it is very simple
> and very similar to fcgi in abilities.

Indeed. I started support for SCGI in the past but have removed those parts
for now since they aren't complete. Adding support is just a matter of
finding the time as the design of the library should support an
implementation.

> - You need to handle signals: http://www.fastcgi.com/docs/faq.html#Signals
> This is how web server would shutdown your application:
>

For now users have to handle signals, although I agree that adding some
handling of signals and allowing a graceful shutdown would be useful.
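
Until the library handles this itself, user code could do something along these lines. This is only a sketch using the standard <csignal> facilities; the function names are hypothetical, and it assumes the web server requests shutdown with SIGTERM (check the FAQ linked above for what your server actually sends).

```cpp
#include <csignal>

// Set from the signal handler. volatile std::sig_atomic_t is the type the
// standard guarantees can be written safely from a handler.
static volatile std::sig_atomic_t g_shutdown_requested = 0;

// The handler only records the request; the real cleanup happens later,
// outside signal context.
extern "C" void on_shutdown_signal(int) {
    g_shutdown_requested = 1;
}

// Hypothetical helper: install the handler for the signals we assume the
// server (or a user at a terminal) will send.
void install_shutdown_handlers() {
    std::signal(SIGTERM, on_shutdown_signal);
    std::signal(SIGINT, on_shutdown_signal);
}

// Hypothetical helper: polled by the accept loop between requests.
bool shutdown_requested() {
    return g_shutdown_requested != 0;
}
```

An accept loop would check shutdown_requested() between requests and stop accepting once it returns true. A blocking accept may also need to be interrupted, which this sketch does not cover.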

> Unicode
> -------
>
> - What is basic_cookie<wchar_t> ? Have you seen anything like that in RFC?
> How would you convert octets cookie to wide? What encoding? What library
> do you use for code-page conversion?
>
> Just use plain string.
>

Fair point. I'll remove this.

> Don't try to push wchar_t to your application as you'll get in deep
> troubles. Want Unicode? Use UTF-8 and stay away from "wide characters"
> they would may your life much harder and would not bring you a single
> advantage over UTF-8.
>
> I've been there too...
>

I agree that I18N support is a must, but unfortunately I'm pretty ignorant
with respect to Unicode. Every char and string in the library derives from
the traits of the protocol being used, so this is configurable throughout
the library, but I have not done anything more to support it. I think
basic_char<wchar_t> is the only typedef that uses wchar_t and it isn't used
anywhere, fwiw. Presumably different algorithms are required for decoding /
encoding wide chars too, and these are not yet included.

Patches are welcome!

> Acceptors
> ----------
>
> You need acceptors to be configurable to work with:
>
> - Unix domain sockets from arbitrary external socket and **stdin**
> - TCP/IP sockets from arbitrary port and **stdin**
>

The acceptors currently work on one type of connection, based on the
Protocol. Support for Unix domain sockets is pending, as mentioned in the
docs. TCP sockets are supported from an arbitrary port on Linux using:

fcgi::service service;
fcgi::acceptor acceptor(service, 8008); // port 8008
// ...

This was the default behaviour on both Windows and Linux until recently,
when I swapped out TCP for anonymous pipes on Windows to make getting
started easier. The above is still possible on Windows, but only by defining
a custom protocol that uses TCP, which is a bit OTT. I will add an example
to the documentation that shows how to do this for now, but supporting
multiple transport types would be nicer.

> Cookies:
> --------
>
> - Use max age rather then expiration date, it is much more reliable
> accorss various system with unsichronized clocks.
> - You must not URL-Decode cookies (see apropriate RFC)
>

I've found conflicting resources on this, e.g.
http://cephas.net/blog/2005/04/01/asp-java-cookies-and-urlencode/

Rereading the RFC, there is no mention of decoding, so I'll stop that by
default (but might make it configurable at compile-time).
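
A rough sketch of what "no decoding by default, configurable at compile-time" could look like. The function and the UrlDecode flag are hypothetical and not part of the library; the parser is deliberately simple and does not handle quoted values or other edge cases.

```cpp
#include <cctype>
#include <cstdlib>
#include <map>
#include <string>

// Decode %XX escapes; malformed escapes are copied through unchanged.
inline std::string percent_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size() &&
            std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(
                std::strtol(in.substr(i + 1, 2).c_str(), nullptr, 16));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}

// Strip leading and trailing spaces and tabs.
inline std::string trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Split a Cookie header value on ';' into name/value pairs. Values are
// left exactly as received unless UrlDecode is set at compile time.
template <bool UrlDecode = false>
std::map<std::string, std::string> parse_cookie_header(const std::string& header) {
    std::map<std::string, std::string> cookies;
    std::size_t pos = 0;
    while (pos <= header.size()) {
        std::size_t end = header.find(';', pos);
        if (end == std::string::npos) end = header.size();
        const std::string pair = header.substr(pos, end - pos);
        const std::size_t eq = pair.find('=');
        if (eq != std::string::npos) {
            const std::string name = trim(pair.substr(0, eq));
            std::string value = trim(pair.substr(eq + 1));
            if (UrlDecode) value = percent_decode(value);
            if (!name.empty()) cookies[name] = value;
        }
        pos = end + 1;
    }
    return cookies;
}
```

Calling parse_cookie_header(h) keeps values as sent, while parse_cookie_header<true>(h) opts back in to decoding for applications that rely on it.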

> - You must parse cookies inside quotes as well.
> Cookie as foo="שלום" is actually valid cookie.
>

Interesting, I wasn't aware this was valid for receiving cookies on a
server. Do you have a reference for this?

> Sessions:
> --------
>
> - Not thread safe. The code you written is no-go. You need to do
> some hard work to make it safe.
>

The default session support is very basic. The session support has been
specifically included in a configurable way so a library user needs only
define a session type (the base class for session) and a session manager
type, which provides three public functions in order to work. The idea is
that real-world apps might want to plug in their own database libraries in
place of the built-in (optional) session support.

This needs to be documented; I've added it to my TODO.
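
As an illustration of the shape described above (a session type plus a session manager with three public functions), here is a self-contained sketch. The names start, load and save and both class names are assumptions made for this example, not the library's documented interface. A mutex is included because thread safety was the concern raised.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical session type: an id plus arbitrary key/value data.
struct simple_session {
    std::string id;
    std::map<std::string, std::string> data;
};

// Hypothetical in-memory session manager. Every public function takes the
// same mutex, so concurrent requests cannot corrupt the store.
class in_memory_session_manager {
public:
    // Create a new, empty session under the given id.
    void start(simple_session& s, const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        s.id = id;
        s.data.clear();
        store_[id] = s.data;
    }

    // Fill in the session from the store; returns false if the id is unknown.
    bool load(simple_session& s, const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = store_.find(id);
        if (it == store_.end()) return false;
        s.id = id;
        s.data = it->second;
        return true;
    }

    // Persist the session's current data under its id.
    void save(const simple_session& s) {
        std::lock_guard<std::mutex> lock(mutex_);
        store_[s.id] = s.data;
    }

private:
    std::mutex mutex_;
    std::map<std::string, std::map<std::string, std::string>> store_;
};
```

A database-backed manager would keep the same three entry points and replace the std::map with calls into the database library.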

> - Session managements I would strongly not recommend you
> using boost::serialization for this purpose. Performance
> is terrible (from my experience).
>

What would you suggest as an alternative?

> But this is up to you.
>
> General Notes:
> --------------
>
> - Don't use boost::lexical_cast for conversion between numbers and
> string - you may be surprised what happens with it
> when you start using localization... (bad things)
>

I think supporting lexical conversion is a requirement of any CGI library,
so the as<> and pick<> data access functions are provided for this use. In
the absence of a better alternative, if these two functions were documented
with a caveat about lexical_cast, would that be sufficient?
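
For reference, one possible alternative is to do the conversion through a stream that is explicitly imbued with the classic "C" locale, so the global locale cannot change the result. This is a sketch with hypothetical function names; it is not how as<> or pick<> are implemented.

```cpp
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Parse a number using the classic "C" locale regardless of the global
// locale. Throws std::invalid_argument on bad or trailing input.
template <typename T>
T parse_number_classic(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    T value;
    if (!(in >> value)) {
        throw std::invalid_argument("not a number: " + text);
    }
    char extra;
    if (in >> extra) {  // anything other than trailing whitespace is an error
        throw std::invalid_argument("trailing characters: " + text);
    }
    return value;
}

// Format a number using the classic "C" locale, so no locale-specific
// digit grouping or decimal separator is applied.
template <typename T>
std::string format_number_classic(const T& value) {
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << value;
    return out.str();
}
```

The trade-off is that the output is never localized, which is usually what is wanted for protocol-level data such as headers and form values.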

> Web Servers:
> ------------
>
> - Test with more then one web server: test with lighttpd and nginx
> as you find some things different.
>

Indeed, I am testing with lighttpd and nginx at the moment. I have found an
experimental plugin for nginx that supports "proper" multiplexing too,
which looks promising.

> - I can say from the begging you will have problem with IIS. I don't
> think IIS FastCGI connector is good enough today. You will probably
> need to work over pipes.
>

FastCGI works with IIS on Windows as anonymous pipes are used on that
platform. I see the part of the documentation that is misleading you so
I'll fix that. The FastCGI connector in IIS certainly seems tailored to PHP
use as it doesn't reuse connections particularly well. This may be an issue
with my configuration though.

> Now few additional unrelated points:
> ------------------------------------
>
> Take a look on two following projects:
>
> - CppCMS (the web framework I develop)
> http://art-blog.no-ip.info/wikipp/en/page/main
>
> The wiki is build on CppCMS.
>

Your library is higher-level than the one I am proposing and somewhat
orthogonal. This library aims to provide a means of developing several
different high-level web development frameworks.

> - CgiCC is a cgi library (that supports fcgi as well)
> and supports about almost everything you have in your code.
> (maybe no sessions, but I don't think your session implementation
> is not safe at least at this point)
>

CgiCC is not compatible with the BSL so I can't use it. It also does not
support lazy loading of requests or multiplexed FastCGI, which have always
been goals of this library.

Thanks for the feedback.

Cheers,
Darren

