From: Ben Hutchings (ben.hutchings_at_[hidden])
Date: 2004-10-04 08:55:37


Aaron W. LaFramboise wrote:
> David Abrahams wrote:
>
>>"Tony Juricic" <tonygeek_at_[hidden]> writes:
>>
>>>Even MS IDE has problems in some cases (very inconsistent) when space
>>>characters are present but I thought that in most Wintel and also UNIX
>>>command tools these issues are easily solved with adding quotes?
>>
>>You'd be surprised how much complication that adds for some people.
>
>
> I'd just like to point out that in the command-line Windows world,
> spaces in a single argument of any kind, including filenames, are doomed
> to fail eventually, because native Windows command lines are simply a
> single linear string of characters. There is no array of strings as in
> Unix, no matter what argv says. There is also no particular standard or
> agreement on how one might group arguments or indicate that several
> whitespace-delimited tokens should be considered as one, although various
> quote characters sometimes work.

There is a standard which is implemented by CommandLineToArgvW (Windows
NT only) and by MSVCRT. This is:

- spaces are treated as literal and not as argument separators if they
   appear between consecutive non-literal double-quotes
- double-quotes are treated as literal if they are prefixed by an odd
   number of backslashes
- backslashes are generally treated as literal, but in a sequence of
   backslashes immediately preceding a double-quote only every second
   backslash is literal
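
Read in the other direction, these three rules amount to a small parser. The following is only a sketch of the rules as listed above, not the actual CommandLineToArgvW or MSVCRT code, which may have further special cases. The names line_to_argv and check_line_to_argv are made up for illustration, and treating tabs as separators and "" as an empty argument are assumptions of the sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a command line into arguments using only the three rules above.
// Hypothetical helper; not the real CommandLineToArgvW/MSVCRT source.
std::vector<std::string> line_to_argv(const std::string& line)
{
    std::vector<std::string> argv;
    std::string current;
    bool in_quotes = false;  // inside a pair of non-literal double-quotes
    bool have_arg = false;   // lets "" produce an empty argument
    std::size_t i = 0;
    while (i < line.size()) {
        char c = line[i];
        if (c == '\\') {
            // Count the run of backslashes, then look at what follows it.
            std::size_t run = 0;
            while (i < line.size() && line[i] == '\\') { ++run; ++i; }
            have_arg = true;
            if (i < line.size() && line[i] == '"') {
                // Before a quote, only every second backslash is literal.
                current.append(run / 2, '\\');
                if (run % 2 == 1)
                    current.push_back('"');   // odd run: literal quote
                else
                    in_quotes = !in_quotes;   // even run: quote toggles grouping
                ++i;
            } else {
                current.append(run, '\\');    // elsewhere backslashes are literal
            }
        } else if (c == '"') {
            in_quotes = !in_quotes;
            have_arg = true;
            ++i;
        } else if ((c == ' ' || c == '\t') && !in_quotes) {
            // Unquoted whitespace ends the current argument, if there is one.
            if (have_arg) {
                argv.push_back(current);
                current.clear();
                have_arg = false;
            }
            ++i;
        } else {
            current.push_back(c);
            have_arg = true;
            ++i;
        }
    }
    if (have_arg) argv.push_back(current);
    return argv;
}

// Small usage check: quoted spaces stay inside one argument.
void check_line_to_argv()
{
    assert(line_to_argv(R"(foo "abc def" bar)").size() == 3);
    assert(line_to_argv(R"("abc def\\\"")")[0] == R"(abc def\")");
}
```
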

So when generating a command-line you can do something like:

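#include <cstddef>
#include <string>
#include <vector>

using std::size_t;
using std::string;
using std::vector;

// Build a command line that CommandLineToArgvW/MSVCRT will split back
// into the elements of argv.
string argv_to_line(const vector<string> & argv)
{
     string line;
     for (vector<string>::const_iterator it = argv.begin(),
                                         end = argv.end();
          it != end;
          ++it) {
         // Separate arguments with a space, but add none after the last
         if (it != argv.begin())
             line.append(1, ' ');
         // Quote only if empty or containing a separator or a quote
         if (!it->empty() && it->find_first_of(" \t\"") == string::npos)
             line.append(*it);
         else {
             line.append(1, '"');
             size_t backslashes = 0; // length of the pending backslash run
             for (size_t i = 0; i != it->size(); ++i)
                 if ((*it)[i] == '\\')
                     ++backslashes;
                 else if ((*it)[i] == '"') {
                     // Double the run, then escape the quote itself
                     line.append(backslashes * 2 + 1, '\\');
                     line.append(1, '"');
                     backslashes = 0;
                 } else {
                     // Not followed by a quote: the run is literal
                     line.append(backslashes, '\\');
                     line.append(1, (*it)[i]);
                     backslashes = 0;
                 }
             // A run before the closing quote must be doubled
             line.append(backslashes * 2, '\\');
             line.append(1, '"');
         }
     }
     return line;
}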

Examples:

Unquoted     Quoted
--------------------------
abc def      "abc def"
abc \def     "abc \def"
abc def"     "abc def\""
abc def\     "abc def\\"
abc def\"    "abc def\\\""
abc\\ def    "abc\\ def"
abc def\\    "abc def\\\\"
> Writers of command line programs can work around this on a case-by-case
> basis, such as adopting the convention to use the double-quote as a
> grouping character. However, in nontrivial systems where command line
> programs are passing data to other command line programs, it is the
> usual case that some component or other inappropriately quotes for a
> subprogram, or forgets to quote filenames at all.

Yes, this is true. In fact the same goes for Unix if there are any
shell scripts involved, since variables whose values contain spaces are
silently split into multiple arguments if you forget to quote them.

> For this reason, spaces should never be used in filenames of projects
> intended to be robust on Windows. On the other hand, writers of command
> line programs that accept filenames as a parameter should always test
> their programs to ensure that filenames with spaces will be accepted
> correctly.

I would have thought that was the responsibility of those writing
startup code.

Ben.

