On Tue, Sep 6, 2011 at 12:30 AM, Oleg A. Mamontov <oleg@mamontov.net> wrote:
Hello.

Hey Oleg,
 

There is a new bug in &Dancer::FileUtils::normalize_path at 1.3072 :(

thanks for the report and patch!
 

Sample test:
   use Dancer::FileUtils 'path';
   use Test::More 'no_plan';

   is path('a/../b')           => 'b';
   is path('a/b/../../c')      => 'c';
   is path('a/b/c/../../../d') => 'd';

I've added this to our normalize path test.
 
There is wrong usage of regex with /g modifier. Simple patch is below:

--- Dancer/FileUtils.pm.orig    2011-09-06 01:11:42.000000000 +0400
+++ Dancer/FileUtils.pm 2011-09-06 01:11:49.000000000 +0400
@@ -102,8 +102,7 @@
    }x;

    $path =~ s{/\./}{/}g;
-       $path =~ s{$seqregex}{}g;
-       $path =~ s{$seqregex}{}g;
+       while ( $path =~ s{$seqregex}{} ) {}

    return $path;
 }

Patch was merged! Thanks!

 
But, imho, it's a bad idea to do any kind of 'universal path normalization'.
What's the point? Only desire for aesthetics?

Aesthetics is important. :)
 

And &Cwd::realpath works like (3) realpath:
"All components of file_name must exist when realpath() is called."

If you look at the history of FileUtils.pm, You'll see that we've used realpath() originally. The problem was the interoperability with Windows and Cygwin. realpath() is originally a BSD C function and that implementation exists on Linux and BSD, that have C compilers. On Windows (or where you do not have the XS version available) it will use a Pure-Perl implementation that dies on errors instead of returning empty.

We've had wrappers on those functions and then noticed more interesting possible path types (UNC) in Windows and Cygwin which added to even more logics. It ended up as long and disgusting code that was hard to follow, use, understand and refactor.

We've just simplified the operation.
 
Perhaps the best way is to drop path normalization code from
Dancer::FileUtils completely?

Perhaps that is the best solution. We should get Sukria's attention on this.
 
There is still path_no_verify which can be used, and indeed we try to separate it to use path_no_verify internally where symlinks might be used, and path() elsewhere. Perhaps it would be better to remove them entirely.

Have a great day, and thanks for the contribution!
Sawyer.