Boost Users :

From: Ben Hutchings (ben.hutchings_at_[hidden])
Date: 2003-04-16 09:10:43


Vladimir Prus wrote:
> cl_corba_at wrote:
<snip>
> >> Frankly, I've missed the WinMain case, and not sure what to do.
> >> The problem is that single string might erase important
> >> information --- what happens if you have command line argument
> >> with embedded space.
> >
> > Embedded spaces should be placed in quotation marks which should be
> > removed by the tokenizer. Arguments could be separated by one or more
> > spaces.
>
> Do you mean that if I type
>
> program "a b c" "C:/Program files"
>
> then the program will receive this string, with quotes there?
> (The linux shell will strip quotes completely).

That is correct. Windows programs are given a single string of
arguments to parse, whereas Unix programs receive a vector which can
be passed to main unchanged by the startup code.

<snip>
> Is this solution OK? (The only problem is that adjacent spaces are
> mishandled, but that's fixable).

A full solution will need to be a little bit more complicated. There
isn't a specification of how Windows command-lines should be generated
or interpreted from an array of strings, but it seems to be sensible to
interpret them in the same way as Microsoft's run-time library does it:

* Outside a quoted section, a double-quote begins a quoted section.
* Outside a quoted section, a string of one or more spaces is a
  separator if there are non-space characters both before and after it;
  otherwise it's just padding.
* Inside a quoted section, a double-quote preceded by an odd number of
  backslashes (call this n) represents (n-1)/2 backslashes followed by
  a double-quote.
* Inside a quoted section, a double-quote preceded by an even number
  of backslashes (possibly zero; call this n) represents n/2 backslashes
  and ends the quoted section.
* All other characters represent themselves.

Note that double-quotes are not separators and backslashes are not
usually escape characters. Also note that a quoted section does not
have to be terminated.

The tokenizer library is probably not up to this job. You could probably use
regular expressions, but a custom state machine might be the best
solution.
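
To make the state-machine idea concrete, here is a minimal sketch that
follows the rules exactly as listed above. It is not a verified
reimplementation of Microsoft's run-time library, and the function name
split_command_line is made up for illustration. It treats only the space
character as a separator, since that is all the rules above mention.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits a Windows-style command line into arguments
// using the quoting rules listed in this message (an assumption, not the
// documented behaviour of any particular run-time library).
std::vector<std::string> split_command_line(const std::string& cmd)
{
    std::vector<std::string> args;
    std::string current;
    bool in_token = false;   // an argument has been started (may be empty, e.g. "")
    bool in_quotes = false;  // currently inside a quoted section

    std::size_t i = 0;
    while (i < cmd.size()) {
        const char c = cmd[i];
        if (!in_quotes) {
            if (c == ' ') {
                // A run of spaces ends the current argument, if there is one;
                // leading, trailing and repeated spaces are just padding.
                if (in_token) {
                    args.push_back(current);
                    current.clear();
                    in_token = false;
                }
                ++i;
            } else if (c == '"') {
                // A double-quote begins a quoted section and is not copied.
                in_quotes = true;
                in_token = true;
                ++i;
            } else {
                // Everything else, including backslashes, represents itself.
                current += c;
                in_token = true;
                ++i;
            }
        } else if (c == '\\') {
            // Count the run of backslashes and look at what follows it.
            std::size_t n = 0;
            while (i + n < cmd.size() && cmd[i + n] == '\\') ++n;
            if (i + n < cmd.size() && cmd[i + n] == '"') {
                current.append(n / 2, '\\');    // n/2, or (n-1)/2 when n is odd
                if (n % 2 == 1) current += '"'; // odd: literal double-quote
                else in_quotes = false;         // even: quoted section ends
                i += n + 1;
            } else {
                current.append(n, '\\');        // not before a quote: literal
                i += n;
            }
        } else if (c == '"') {
            in_quotes = false;                  // zero backslashes: section ends
            ++i;
        } else {
            current += c;
            ++i;
        }
    }
    // A quoted section does not have to be terminated; flush what is left.
    if (in_token) args.push_back(current);
    return args;
}
```

With this sketch, the command line

  program "a b c" "C:/Program files"

splits into three arguments: program, a b c, and C:/Program files.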


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net