Problem in normalize_path
Hello. There is a new bug in &Dancer::FileUtils::normalize_path at 1.3072 :( Sample test: use Dancer::FileUtils 'path'; use Test::More 'no_plan'; is path('a/../b') => 'b'; is path('a/b/../../c') => 'c'; is path('a/b/c/../../../d') => 'd'; Output: ok 1 ok 2 not ok 3 # Failed test at t/q line 8. # got: 'a/b/d' # expected: 'd' There is wrong usage of regex with /g modifier. Simple patch is below: --- Dancer/FileUtils.pm.orig 2011-09-06 01:11:42.000000000 +0400 +++ Dancer/FileUtils.pm 2011-09-06 01:11:49.000000000 +0400 @@ -102,8 +102,7 @@ }x; $path =~ s{/\./}{/}g; - $path =~ s{$seqregex}{}g; - $path =~ s{$seqregex}{}g; + while ( $path =~ s{$seqregex}{} ) {} return $path; } But, imho, it's a bad idea to do any kind of 'universal path normalization'. What's the point? Only desire for aesthetics? Underlying FS is able to do it more reliable any case. POD on &File::Spec::canonpath talks us: "Note that this does *not* collapse x/../y sections into y. This is by design. If /foo on your system is a symlink to /bar/baz, then /foo/../quux is actually /bar/quux, not /quux as a naive ../−removal would give you. If you want to do this kind of processing, you probably want "Cwd"’s "realpath()" function to actually traverse the filesystem cleaning up paths like this." How do you suppose to handle hierarchy with symlinks? And &Cwd::realpath works like (3) realpath: "All components of file_name must exist when realpath() is called." Perhaps the best way is to drop path normalization code from Dancer::FileUtils completely? -- Cheers, Oleg A. Mamontov mailto: oleg@mamontov.net jabber: lonerr@charla.mamontov.net icq uin: 79-521-617 cell: +7-903-798-1352
On Tue, Sep 6, 2011 at 12:30 AM, Oleg A. Mamontov <oleg@mamontov.net> wrote:
Hello.
Hey Oleg,
There is a new bug in &Dancer::FileUtils::normalize_path at 1.3072 :(
thanks for the report and patch!
Sample test: use Dancer::FileUtils 'path'; use Test::More 'no_plan';
is path('a/../b') => 'b'; is path('a/b/../../c') => 'c'; is path('a/b/c/../../../d') => 'd';
I've added this to our normalize path test.
There is wrong usage of regex with /g modifier. Simple patch is below:
--- Dancer/FileUtils.pm.orig 2011-09-06 01:11:42.000000000 +0400 +++ Dancer/FileUtils.pm 2011-09-06 01:11:49.000000000 +0400 @@ -102,8 +102,7 @@ }x;
$path =~ s{/\./}{/}g; - $path =~ s{$seqregex}{}g; - $path =~ s{$seqregex}{}g; + while ( $path =~ s{$seqregex}{} ) {}
return $path; }
Patch was merged! Thanks!
But, imho, it's a bad idea to do any kind of 'universal path normalization'. What's the point? Only desire for aesthetics?
Aesthetics is important. :) And &Cwd::realpath works like (3) realpath:
"All components of file_name must exist when realpath() is called."
If you look at the history of FileUtils.pm, You'll see that we've used realpath() originally. The problem was the interoperability with Windows and Cygwin. realpath() is originally a BSD C function and that implementation exists on Linux and BSD, that have C compilers. On Windows (or where you do not have the XS version available) it will use a Pure-Perl implementation that dies on errors instead of returning empty. We've had wrappers on those functions and then noticed more interesting possible path types (UNC) in Windows and Cygwin which added to even more logics. It ended up as long and disgusting code that was hard to follow, use, understand and refactor. We've just simplified the operation.
Perhaps the best way is to drop path normalization code from Dancer::FileUtils completely?
Perhaps that is the best solution. We should get Sukria's attention on this. There is still path_no_verify which can be used, and indeed we try to separate it to use path_no_verify internally where symlinks might be used, and path() elsewhere. Perhaps it would be better to remove them entirely. Have a great day, and thanks for the contribution! Sawyer.
participants (2)
-
Oleg A. Mamontov -
sawyer x