hiiamboris commented on May 26, 2024

After trying to fix it myself, I eventually came to agree with @x8x that the string version, while about 4x faster, is too messy: it has so many special cases to consider that it's almost impossible to get right.

In contrast, the split+join version is simple and straightforward (I removed /dir from it to isolate concerns, since we have dirize for that).

Correctness, IMO, is the priority here. Anyone who needs to run clean-path over whole filesystems daily may be better off writing a routine.

The clean version I'm proposing:

clean-path: function [
	[no-trace]
	"Cleans up '.' and '..' in path; returns the cleaned path"
	path [file! url! string!]
	/only "Do not prepend current directory"
][
	abs?: any [#"/" = first path url? path]				;-- treat all urls as absolute
	unless any [only abs?] [abs?: path: as path append what-dir path]
	out:  clear copy parts: reverse split as string! path #"/"
	root: if abs? [take/last parts]						;-- if absolute, preserve the part until the first slash
	parse parts [(up: 0) collect after out any [
		".." (up: up + 1)
	|	"."
	|	skip if (up > 0) (up: up - 1)
	|	keep skip
	]]
	either abs? [
		append out root									;-- if absolute, restore the prefix
		if single? out [insert out ""]					;-- ensure it always has a slash
	][
		append/dup out ".." up							;-- if relative, keep extra up-levels
	]
	as path join reverse out #"/"
]

With tests:

#assert [
	backed-up: :what-dir
	set 'what-dir does [copy %/level1/level2/]					;-- use predictable path for testing
	
	%/level1/level2          == clean-path      %.
	%/level1/level2/         == clean-path      %./
	%/level1/level2/         == clean-path      %""				;-- treat empty path as current directory, not root
	%""                      == clean-path/only %""
	%/level1/level2/a        == clean-path      %a
	%/level1/level2/a        == clean-path      %a/.
	%/level1/level2/a/       == clean-path      %a/./
	%/level1/level2          == clean-path      %a/..
	%/level1/level2/x        == clean-path      %a/../x
	%/level1/level2/x/       == clean-path      %a/../x/
	%/a                      == clean-path      %/a/.
	%/a/                     == clean-path      %/a/./
	%/                       == clean-path      %/a/..			;-- should this return /a/ on Windows and clean-path be platform-specific?
	%/x                      == clean-path      %/a/../x
	%/x/                     == clean-path      %/a/../x/
	%/a/b/file               == clean-path      %/a/b/./file
	%/a/file                 == clean-path      %/a/b/../file
	%/file                   == clean-path      %/a/b/../../file
	%/file                   == clean-path      %/a/b/../../../file
	%/file/                  == clean-path      %/a/b/../../../file/
	%/level1/level2/a/b/file == clean-path      %a/b/./file
	%/level1/level2/a/file   == clean-path      %a/b/../file
	%/level1/level2/file     == clean-path      %a/b/../../file
	%/level1/file            == clean-path      %a/b/../../../file
	%/file                   == clean-path      %a/b/../../../../file
	%/file                   == clean-path      %a/b/../../../../../file
	%/file                   == clean-path      %a/b/../../../../../../file
	%/file/                  == clean-path      %a/b/../../../../../../file/
	%/                       == clean-path      %a/b/../../../../../../..		;-- root path always ends with a slash
	%/                       == clean-path      %a/b/../../../../../../../
	%a/b/file                == clean-path/only %a/b/./file
	%a/file                  == clean-path/only %a/b/../file
	%file                    == clean-path/only %a/b/../../file
	%../file                 == clean-path/only %a/b/../../../file
	%../../file              == clean-path/only %a/b/../../../../file
	%../../../file           == clean-path/only %a/b/../../../../../file
	%../../../file/          == clean-path/only %a/b/../../../../../file/
	
	;; technically, // is an empty path segment, which is allowed in URLs
	;; so preserving scheme:// is not something we can or should do generally
	;; see e.g. https://en.wikipedia.org/wiki/File_URI_scheme#Number_of_slash_characters
	ftp:/                    == clean-path      ftp://..
	ftp:/                    == clean-path      ftp://../
	ftp:/                    == clean-path      ftp://../..		;-- should not consume the root slash
	ftp:/                    == clean-path      ftp://../../
	ftp://site/file          == clean-path      ftp://site/./file
	ftp://file               == clean-path      ftp://site/../file
	ftp:/file                == clean-path      ftp://site/../../file
	ftp:/file                == clean-path      ftp://site/../../../file
	ftp:/file                == clean-path      ftp://site/../../../../file
	ftp:/file/               == clean-path      ftp://site/../../../../file/
	
	%/file                   == clean-path append append/dup copy %/a/b/c/d/ %../ 16000 %file	;-- stress test (recursive algos fail it)
	
	set 'what-dir :backed-up
]
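
For readers who don't read Red parse rules, here is a rough Python port of the same reverse-scan idea. It is only a sketch under stated assumptions, not the proposed implementation: `clean_path`, its `cwd` parameter (standing in for `what-dir`; passing `None` plays the role of `/only`), and the regex used in place of Red's `url?` type test are all made up for illustration.

```python
import re
from typing import Optional

# Assumption: a "scheme:/" prefix marks a URL. Red checks the datatype instead.
_URL_PREFIX = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:/")

def clean_path(path: str, cwd: Optional[str] = None) -> str:
    """Resolve '.' and '..' segments; hypothetical port of the Red function above."""
    absolute = path.startswith("/") or _URL_PREFIX.match(path) is not None
    if not absolute and cwd is not None:
        path = cwd + path              # cwd is expected to end with a slash
        absolute = True
    parts = path.split("/")[::-1]      # scan segments from last to first
    root = parts.pop() if absolute else None   # text before the first slash
    out = []
    up = 0                             # pending '..' levels still to consume
    for seg in parts:
        if seg == "..":
            up += 1
        elif seg == ".":
            continue
        elif up > 0:
            up -= 1                    # this segment is cancelled by a later '..'
        else:
            out.append(seg)
    if absolute:
        out.append(root)               # restore the prefix
        if len(out) == 1:
            out.insert(0, "")          # make sure the root keeps its slash
    else:
        out.extend([".."] * up)        # relative: keep the unresolved up-levels
    return "/".join(reversed(out))

print(clean_path("a/b/../../../file"))                  # relative, like /only
print(clean_path("a/../x/", cwd="/level1/level2/"))     # resolved against cwd
print(clean_path("ftp://site/../file"))                 # URL, always absolute
```

Scanning in reverse means a single counter is enough: each `..` raises it, and the next real segment met on the way toward the root is dropped instead of kept. There is no recursion, which is why an input with thousands of `../` segments is not a problem for this approach.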

hiiamboris commented on May 26, 2024

#5496 is somewhat related.