Boost logo

Boost-Commit :

From: jurko.gospodnetic_at_[hidden]
Date: 2008-05-05 18:37:20


Author: jurko
Date: 2008-05-05 18:37:19 EDT (Mon, 05 May 2008)
New Revision: 45158
URL: http://svn.boost.org/trac/boost/changeset/45158

Log:
  Patch for the NORMALIZE_PATH builtin Boost Jam rule as well as an appropriate update for the path.jam Boost Build module where that rule was being used to implement path.join and related operations.

  As it was written before the rule had 'random' behavior in some borderline cases such as: not passing it a parameter, passing it a folder whose path starts with one or two backslashes (as opposed to slashes) or passing it an invalid rooted path with enough '..' path elements to take it 'before the root path'. In those cases it would cause an access violation, 'incorrectly' un-root the path (i.e. remove the leading slash) or simply remove a 'random' path modification respectively. Also the rule is now more tiny bit more efficient and much better documented.

  Invalid rooted paths with enough '..' path elements to take them 'before the root path' are now recognized and an empty list is returned.

  Due to this rule having such 'messy' behavior the path.join rule and its user make-NT rule had some twisted logic in them to work around all the problems this caused. This patch invalidates the logic in question and replaces it with a much simpler one (detailed comments added).

  Other NORMALIZE_PATH callers should not be affected since both the old and the new version work the same on 'regular' paths (i.e. those not mentioned above).

  The new functionality for recognizing Boost Jam versions has been used to make Boost Build scripts use the old path functionality when using Boost Jam older than 3.1.17 and use the new functionality otherwise. As consequence, now anyone using the trunk version of Boost Build and an older 3.1.17 version of Boost.Jam will need to recompile their Boost Jam executable.

  The patch does not cause any Boost Build or Boost Jam tests to fail.

  Added a related NORMALIZE_PATH Boost Jam test. Note that this test causes Boost Jam versions built prior to this patch to crash (access violation).

  Added additional internal Boost Build tests for the path.jam module testing how it handles some invalid Windows paths.

Added:
   trunk/tools/jam/test/builtin_normalize_path.jam (contents, props changed)
Text files modified:
   trunk/tools/build/v2/util/path.jam | 85 ++++++++++++++++++++++-----
   trunk/tools/jam/src/builtins.c | 120 ++++++++++++++++++++++-----------------
   trunk/tools/jam/test/test.jam | 1
   3 files changed, 137 insertions(+), 69 deletions(-)

Modified: trunk/tools/build/v2/util/path.jam
==============================================================================
--- trunk/tools/build/v2/util/path.jam (original)
+++ trunk/tools/build/v2/util/path.jam 2008-05-05 18:37:19 EDT (Mon, 05 May 2008)
@@ -22,6 +22,7 @@
 import regex ;
 import sequence ;
 import set ;
+import version ;
 
 
 os = [ modules.peek : OS ] ;
@@ -151,14 +152,7 @@
 
 
 # Concatenates the passed path elements. Generates an error if any element other
-# than the first one is rooted.
-#
-# Boost Jam has problems with unclear NORMALIZE_PATH builtin rule behavior in
-# case you pass it a leading backslash instead of a slash or in some cases when
-# you send it an empty initial path element. At least some of those cases are
-# being hit and relied upon by the path.make-NT rule. One thing this
-# implementation does not support correctly is converting a native Windows path
-# with multiple leading backslashes into the internal path format.
+# than the first one is rooted. Skips any empty or undefined path elements.
 #
 rule join ( elements + )
 {
@@ -175,13 +169,26 @@
                 errors.error only the first element may be rooted ;
             }
         }
- if ! $(elements[1]) && $(elements[2])
+ if [ version.check-jam-version 3 1 17 ]
         {
- return [ NORMALIZE_PATH "/" "$(elements[2-])" ] ;
+ return [ NORMALIZE_PATH "$(elements)" ] ;
         }
         else
         {
- return [ NORMALIZE_PATH "$(elements)" ] ;
+ # Boost Jam prior to version 3.1.17 had problems with its
+ # NORMALIZE_PATH rule in case you passed it a leading backslash
+ # instead of a slash, in some cases when you sent it an empty
+ # initial path element and possibly some others. At least some of
+ # those cases were being hit and relied upon when calling this rule
+ # from the path.make-NT rule.
+ if ! $(elements[1]) && $(elements[2])
+ {
+ return [ NORMALIZE_PATH "/" "$(elements[2-])" ] ;
+ }
+ else
+ {
+ return [ NORMALIZE_PATH "$(elements)" ] ;
+ }
         }
     }
 }
@@ -447,17 +454,29 @@
 
 
 # Converts native Windows paths into our internal canonic path representation.
+# Supports 'invalid' paths containing multiple successive path separator
+# characters.
 #
 # TODO: Check and if needed add support for Windows 'X:file' path format where
 # the file is located in the current folder on drive X.
 #
 rule make-NT ( native )
 {
- # This implementation does not support native Windows paths containing
- # multiple successive path separator characters. Some cases might work but
- # some do not. All this is due to a not so clear way NORMALIZE_PATH rule
- # works.
- local result = [ path.join [ regex.split $(native) "[/\\]" ] ] ;
+ local result ;
+
+ if [ version.check-jam-version 3 1 17 ]
+ {
+ result = [ NORMALIZE_PATH $(native) ] ;
+ }
+ else
+ {
+ # This old implementation is really fragile due to a not so clear way
+ # NORMALIZE_PATH rule worked in Boost.Jam versions prior to 3.1.17. E.g.
+ # path.join would mostly ignore empty path elements but would root the
+ # joined path in case the initial two path elements were empty or some
+ # similar accidental wierdness.
+ result = [ path.join [ regex.split $(native) "[/\\]" ] ] ;
+ }
 
     # We need to add an extra '/' in front in case this is a rooted Windows path
     # starting with a drive letter and not a path separator character since the
@@ -605,7 +624,7 @@
         if ! [ MATCH (\\.) : $(file) ]
         {
             #
- # Always add "." to end of non-extension file
+ # Always add "." to end of non-extension file.
             #
             file = $(file). ;
         }
@@ -776,6 +795,38 @@
     assert.result "/foo" : make "\\foo" ;
     assert.result "/D:/My Documents" : make "D:\\My Documents" ;
     assert.result "/c:/boost/tools/build/new/project.jam" : make "c:\\boost\\tools\\build\\test\\..\\new\\project.jam" ;
+
+ # Test processing 'invalid' paths containing multiple successive path
+ # separators.
+ assert.result "foo" : make "foo//" ;
+ assert.result "foo" : make "foo///" ;
+ assert.result "foo" : make "foo\\\\" ;
+ assert.result "foo" : make "foo\\\\\\" ;
+ assert.result "/foo" : make "//foo" ;
+ assert.result "/foo" : make "///foo" ;
+ assert.result "/foo" : make "\\\\foo" ;
+ assert.result "/foo" : make "\\\\\\foo" ;
+ assert.result "/foo" : make "\\/\\/foo" ;
+ assert.result "foo/bar" : make "foo//\\//\\\\bar//\\//\\\\\\//\\//\\\\" ;
+ assert.result "foo/bar" : make "foo//\\//\\\\bar//\\//\\\\\\//\\//\\\\" ;
+ assert.result "foo" : make "foo/bar//.." ;
+ assert.result "foo/bar" : make "foo/bar/giz//.." ;
+ assert.result "foo/giz" : make "foo//\\//\\\\bar///\\\\//\\\\////\\/..///giz\\//\\\\\\//\\//\\\\" ;
+ assert.result "../../../foo" : make "..///.//..///.//..////foo///" ;
+
+ # Test processing 'invalid' rooted paths with too many '..' path elements
+ # that would place them before the root.
+ assert.result "" : make "/.." ;
+ assert.result "" : make "/../" ;
+ assert.result "" : make "/../." ;
+ assert.result "" : make "/.././" ;
+ assert.result "" : make "/foo/../bar/giz/.././././../../." ;
+ assert.result "" : make "/foo/../bar/giz/.././././../.././" ;
+ assert.result "" : make "//foo/../bar/giz/.././././../../." ;
+ assert.result "" : make "//foo/../bar/giz/.././././../.././" ;
+ assert.result "" : make "\\\\foo/../bar/giz/.././././../../." ;
+ assert.result "" : make "\\\\foo/../bar/giz/.././././../.././" ;
+ assert.result "" : make "/..///.//..///.//..////foo///" ;
 
     assert.result "foo\\bar\\giz" : native "foo/bar/giz" ;
     assert.result "foo" : native "foo" ;

Modified: trunk/tools/jam/src/builtins.c
==============================================================================
--- trunk/tools/jam/src/builtins.c (original)
+++ trunk/tools/jam/src/builtins.c 2008-05-05 18:37:19 EDT (Mon, 05 May 2008)
@@ -1338,94 +1338,110 @@
        pass, removing all the entered '\1' characters.
     */
 
- string in[1], out[1], tmp[1];
- char* end; /* Last character of the part of string still to be processed. */
- char* current; /* Working pointer. */
- int dotdots = 0; /* Number of '..' elements seen and not processed yet. */
- int rooted = arg->string[0] == '/';
- char* result;
+ string in[1];
+ string out[1];
+ char * end; /* Last character of the part of string still to be processed. */
+ char * current; /* Working pointer. */
+ int dotdots = 0; /* Number of '..' elements seen and not processed yet. */
+ int rooted = 0;
+ char * result = 0;
+
+ /* Make a copy of input: we should not change it. Prepend a '/' before it as
+ a guard for the algorithm later on and remember whether it was originally
+ rooted or not. */
 
- /* Make a copy of input: we should not change it. */
     string_new(in);
- if (!rooted)
- string_push_back(in, '/');
- while (arg)
- {
- string_append(in, arg->string);
- arg = list_next(arg);
- if (arg)
- string_append(in, "/");
+ string_push_back(in, '/');
+ for (; arg; arg = list_next(arg) )
+ {
+ if (arg->string[0] != '\0')
+ {
+ if (in->size == 1)
+ rooted = ( (arg->string[0] == '/' ) || (arg->string[0] == '\\') );
+ else
+ string_append(in, "/");
+ string_append(in, arg->string);
+ }
     }
 
- /* Convert \ into /. On windows, paths using / and \ are equivalent,
- and we want this function to obtain canonic representation. */
+ /* Convert \ into /. On Windows, paths using / and \ are equivalent, and we
+ want this function to obtain a canonic representation. */
+
     for (current = in->value, end = in->value + in->size;
- current < end; ++current)
+ current < end; ++current)
         if (*current == '\\')
             *current = '/';
 
+ /* Now we remove any extra path elements by overwriting them with '\1'
+ characters and cound how many more unused '..' path elements there are
+ remaining. Note that each remaining path element with always starts with
+ a '/' character. */
 
- end = in->value + in->size - 1;
- current = end;
-
- for(;end >= in->value;)
+ for (end = in->value + in->size - 1; end >= in->value; )
     {
         /* Set 'current' to the next occurence of '/', which always exists. */
- for(current = end; *current != '/'; --current)
+ for (current = end; *current != '/'; --current)
             ;
 
- if (current == end && current != in->value) {
- /* Found a trailing slash. Remove it. */
- *current = '\1';
- } else if (current == end && *(current+1) == '/') {
- /* Found duplicated slash. Remove it. */
+ if (current == end)
+ {
+ /* Found a trailing or duplicate '/'. Remove it. */
             *current = '\1';
- } else if (end - current == 1 && strncmp(current, "/.", 2) == 0) {
- /* Found '/.'. Drop them all. */
+ }
+ else if (end - current == 1 && *(current+1) == '.')
+ {
+ /* Found '/.'. Remove them all. */
             *current = '\1';
             *(current+1) = '\1';
- } else if (end - current == 2 && strncmp(current, "/..", 3) == 0) {
- /* Found '/..' */
+ }
+ else if (end - current == 2 && *(current+1) == '.' && *(current+2) == '.')
+ {
+ /* Found '/..'. Remove them all. */
             *current = '\1';
             *(current+1) = '\1';
             *(current+2) = '\1';
             ++dotdots;
- } else if (dotdots) {
- char* p = current;
+ }
+ else if (dotdots)
+ {
             memset(current, '\1', end-current+1);
             --dotdots;
         }
         end = current-1;
     }
 
+ string_new(out);
 
- string_new(tmp);
- while(dotdots--)
- string_append(tmp, "/..");
- string_append(tmp, in->value);
- string_copy(in, tmp->value);
- string_free(tmp);
-
+ /* Now we know that we need to add exactly dotdots '..' path elements to the
+ front and that our string is either empty or has a '/' as its first
+ significant character. If we have any dotdots remaining then the passed
+ path must not have been rooted or else it is invalid we return an empty
+ list. */
 
- string_new(out);
- /* The resulting path is either empty or has '/' as the first significant
- element. If the original path was not rooted, we need to drop first '/'.
- If the original path was rooted, and we've got empty path, need to add '/'
- */
- if (!rooted) {
- current = strchr(in->value, '/');
- if (current)
- *current = '\1';
+ if (dotdots)
+ {
+ if (rooted) return L0;
+ do
+ string_append(out, "/..");
+ while (--dotdots);
     }
 
+ /* Now we actually remove all the path characters marked for removal. */
+
     for (current = in->value; *current; ++current)
         if (*current != '\1')
             string_push_back(out, *current);
 
+ /* Here we know that our string contains no '\1' characters and is either
+ empty or has a '/' as its initial character. If the original path was not
+ rooted and we have a non-empty path we need to drop the initial '/'. If
+ the original path was rooted and we have an empty path we need to add
+ back the '/'. */
+
+ result = newstr( out->size ? out->value + !rooted : (rooted ? "/" : "."));
 
- result = newstr(out->size ? out->value : (rooted ? "/" : "."));
- string_free(in);
     string_free(out);
+ string_free(in);
 
     return list_new(0, result);
 }

Added: trunk/tools/jam/test/builtin_normalize_path.jam
==============================================================================
--- (empty file)
+++ trunk/tools/jam/test/builtin_normalize_path.jam 2008-05-05 18:37:19 EDT (Mon, 05 May 2008)
@@ -0,0 +1,60 @@
+#~ Copyright 2008 Jurko Gospodnetic.
+#~ Distributed under the Boost Software License, Version 1.0.
+#~ (See accompanying file LICENSE_1_0.txt or http://www.boost.org/LICENSE_1_0.txt)
+
+ECHO --- Testing NORMALIZE_PATH builtin... ;
+
+assert "." : (==) : [ NORMALIZE_PATH ] ;
+assert "." : (==) : [ NORMALIZE_PATH "" ] ;
+assert "." : (==) : [ NORMALIZE_PATH "." ] ;
+assert ".." : (==) : [ NORMALIZE_PATH ".." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "/" ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "\\" ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "//" ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "\\\\" ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "//\\\\//\\\\" ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "/." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "/./" ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "\\\\///.///\\\\\\" ] ;
+assert "." : (==) : [ NORMALIZE_PATH "./././././." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "/./././././." ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo" ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo/" ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo\\" ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo\\\\/////" ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo\\\\/////././." ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo\\\\/////./././" ] ;
+assert "." : (==) : [ NORMALIZE_PATH "foo/.." ] ;
+assert "." : (==) : [ NORMALIZE_PATH "foo////.." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "///foo/\\\\/.." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "\\\\\\foo\\//\\.." ] ;
+assert "." : (==) : [ NORMALIZE_PATH "foo/./.." ] ;
+assert "." : (==) : [ NORMALIZE_PATH "foo/././././.." ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo/./././bar/./././.././././baz/./././.." ] ;
+assert "/foo" : (==) : [ NORMALIZE_PATH "/foo/./././bar/./././.././././baz/./././.." ] ;
+assert "foo" : (==) : [ NORMALIZE_PATH "foo/./././bar/./././////.././././baz/./././.." ] ;
+assert "/foo" : (==) : [ NORMALIZE_PATH "/foo/./././bar/./././////.././././baz/./././.." ] ;
+assert ".." : (==) : [ NORMALIZE_PATH "./.." ] ;
+assert ".." : (==) : [ NORMALIZE_PATH "././././.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "../.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "./../.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "././././../.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "./.././././.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "././././.././././.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "..//\\\\\\//.." ] ;
+assert "../.." : (==) : [ NORMALIZE_PATH "../..\\\\/\\\\" ] ;
+assert "." : (==) : [ NORMALIZE_PATH "foo/../bar/../baz/.." ] ;
+assert "." : (==) : [ NORMALIZE_PATH "foo////..////bar////.//////.////../baz/.." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "/foo/../bar/../baz/.." ] ;
+assert "/" : (==) : [ NORMALIZE_PATH "/foo////..////bar////.//////.////../baz/.." ] ;
+
+# Invalid rooted paths with leading dotdots.
+assert : (==) : [ NORMALIZE_PATH "/.." ] ;
+assert : (==) : [ NORMALIZE_PATH "/../" ] ;
+assert : (==) : [ NORMALIZE_PATH "//\\\\//\\\\/.." ] ;
+assert : (==) : [ NORMALIZE_PATH "\\\\//\\\\//\\.." ] ;
+assert : (==) : [ NORMALIZE_PATH "/../.." ] ;
+assert : (==) : [ NORMALIZE_PATH "/../../.." ] ;
+assert : (==) : [ NORMALIZE_PATH "/foo/bar/../baz/../../.." ] ;
+assert : (==) : [ NORMALIZE_PATH "/../for/././../././bar/././../././.." ] ;
+assert : (==) : [ NORMALIZE_PATH "/../foo/bar" ] ;

Modified: trunk/tools/jam/test/test.jam
==============================================================================
--- trunk/tools/jam/test/test.jam (original)
+++ trunk/tools/jam/test/test.jam 2008-05-05 18:37:19 EDT (Mon, 05 May 2008)
@@ -51,6 +51,7 @@
 
 include action_status.jam ;
 include actions_quietly.jam ;
+include builtin_normalize_path.jam ;
 include builtin_shell.jam ;
 include builtin_w32_getregnames.jam ;
 include option_d2.jam ;


Boost-Commit list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk