Comments (6)
Going through the source with the call stack as a hint, I found at least one of the reasons: evaluating the "executable" part of the permissions is quite expensive, and most of the time the permissions of a `directory_entry` are not used.
I think I'll defer the X part of the flags until the permissions are actually used, and additionally try to optimize their evaluation.
I'm on a trip right now (browsing code on the phone), but I think a test of that should be on a branch sometime tomorrow. Not sure about the impact yet.
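The idea of deferring an expensive evaluation until it is needed can be sketched as follows. This is only an illustration of the general technique, not the code on the branch: `LazyPerms`, `get()` and `evaluations()` are hypothetical names, and the real permission check inside the library is replaced by an arbitrary callable.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <optional>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical helper (not part of ghc::filesystem): computes a permission
// value on first use and caches it for all later calls.
class LazyPerms {
public:
    explicit LazyPerms(std::function<fs::perms()> compute)
        : compute_(std::move(compute)) {
        assert(compute_ && "a callable is required");
    }

    // Runs the (possibly expensive) computation only on the first call.
    fs::perms get() const {
        if (!cached_) {
            cached_ = compute_();
            ++evaluations_;
        }
        return *cached_;
    }

    // How often the computation actually ran; only here for illustration.
    int evaluations() const { return evaluations_; }

private:
    std::function<fs::perms()> compute_;
    mutable std::optional<fs::perms> cached_;
    mutable int evaluations_ = 0;
};
```

If `get()` is never called, the callable never runs, which is the saving hoped for when most callers ignore the permissions of an entry.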
from filesystem.
My work is on branch: feature-73-performance-optimization
Okay, let me start by saying that I was not able to reproduce the huge gap from your measurements. My tests were on my dev laptop with an SSD (I have no Windows system with an HDD available), using Visual Studio 2019 16.7.4. My baseline measurements are:
Recursive iteration over C:\Windows (nested directories, 324681 entries, 247597 files) with an additional `fs::status()` call, as in the given code:
Implementation | Time | Relative |
---|---|---|
`std::filesystem` | 47.2s | 100% |
`ghc::filesystem` | 51.5s | +9% |
`ghc::filesystem` with reduced path overhead | 49.7s | +5% |
Not that impressive, but still useful, with the difference in internal implementation in mind. I want to point out that the additional `fs::status(entry_path)` call is the main culprit for the time taken. The `fs::directory_entry` from the iterator already has a status, so using `fs::file_status entry_status = entry_path.status()` inside the loop instead leads to:
Implementation | Time | Relative |
---|---|---|
`std::filesystem` | 10.5s | 100% |
`ghc::filesystem` | 13.1s | +25% |
`ghc::filesystem` with reduced path overhead | 11.8s | +12% |
So my optimizations halved the overhead of the different internal representations, I'm happy with that.
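The two loop variants compared above can be sketched like this. It is a minimal sketch under assumptions, not the code from the original report: `Counts` and `count_tree` are made-up names, `skip_permission_denied` is my choice, and whether the entry really answers from cached data depends on the implementation and platform.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

struct Counts {
    std::size_t entries = 0;  // everything the iterator visited
    std::size_t files = 0;    // regular files only
};

// Walks `root` recursively. With `requery == true` the status is fetched
// again through the path (fs::status); otherwise the directory_entry is
// asked, which may be able to answer from data it already holds.
Counts count_tree(const fs::path& root, bool requery) {
    Counts c;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    while (!ec && it != end) {
        std::error_code status_ec;
        const fs::file_status st = requery ? fs::status(it->path(), status_ec)
                                           : it->status(status_ec);
        ++c.entries;
        if (fs::is_regular_file(st)) ++c.files;
        it.increment(ec);  // stop quietly on the first iteration error
    }
    assert(c.files <= c.entries);
    return c;
}
```

Calling `count_tree(root, true)` and `count_tree(root, false)` on the same tree should give the same counts; only the time taken is expected to differ.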
I then also created a single test directory with 20k files, to have a comparison without the impact of the many `directory_iterator` creations during the work of the `recursive_directory_iterator` used on the C:\Windows folder:
Implementation | Time |
---|---|
`std::filesystem` with `fs::status(entry_path)` | 1480ms |
`ghc::filesystem` with `fs::status(entry_path)` | 1570ms |
`ghc::filesystem` with `fs::status(entry_path)` and path optimizations | 1500ms |
`std::filesystem` with `entry_path.status()` | 47ms |
`ghc::filesystem` with `entry_path.status()` | 66ms |
`ghc::filesystem` with `entry_path.status()` and path optimizations | 44ms |
So in the best case (no additional status call, just flat iteration), the optimizations make it faster than `std::filesystem` on average. I guess there is some non-optimal code in there as well, as it should be faster with native storage of the path.
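A flat-iteration measurement of this kind could be set up roughly as below. This is a sketch, not the harness used for the numbers above: `count_flat` and `timed` are hypothetical helpers, and a single run with `steady_clock` says little without repetition and a warm file system cache.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <filesystem>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Counts the entries of a single directory without descending into it.
std::size_t count_flat(const fs::path& dir) {
    std::size_t n = 0;
    std::error_code ec;
    fs::directory_iterator it(dir, ec);
    const fs::directory_iterator end;
    while (!ec && it != end) {
        ++n;
        it.increment(ec);
    }
    return n;
}

// Runs `work` once and returns its result together with the elapsed time.
template <typename Work>
auto timed(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    auto result = std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock never goes backwards
    return std::make_pair(
        std::move(result),
        std::chrono::duration_cast<std::chrono::milliseconds>(stop - start));
}
```

Usage would look like `auto [n, ms] = timed([&] { return count_flat(dir); });`, repeated several times and averaged.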
I hope this helps with your performance issues as well, even if I couldn't replicate your numbers.
from filesystem.
Yeah, that is more than I expected. There is an overhead in `ghc::filesystem::path` on Windows because it uses the generic representation as its internal representation instead of the native one, but I guess this might be more the result of how `directory_iterator` / `directory_entry` work.
I'll do some tests and try to optimize it. I'm quite busy currently, but I plan to work on this and #70, the other Windows issue, in the next few days, hopefully tomorrow.
from filesystem.
Sorry, there is some delay with the availability of a branch on this.
My test was with `recursive_directory_iterator`, as I had no large enough single directory, and there were differences between the results of iterating a huge tree with `ghc::filesystem` and MS `std::filesystem` in the number of regular files and the sum of their sizes. So I took quite some time to analyze this to find a possible hidden bug, and it seems I found an issue in the `std::filesystem` implementation that I'm going to report over there.
I hope to give it another shot after work today, but I wanted to report back why there is no branch yet.
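A cross-check like the one described (number of regular files and sum of their sizes over a tree) might look like the sketch below. `TreeSummary` and `summarize_tree` are hypothetical names, and the `skip_permission_denied` option is my assumption; it can itself change the totals, so both implementations would need the same options for a fair comparison.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <system_error>

// Assumption: to compare implementations, point this alias at
// ghc::filesystem (from <ghc/filesystem.hpp>) and rebuild.
namespace fs = std::filesystem;

struct TreeSummary {
    std::uintmax_t regular_files = 0;
    std::uintmax_t total_bytes = 0;
};

// Counts regular files below `root` and adds up the sizes that can be read.
TreeSummary summarize_tree(const fs::path& root) {
    TreeSummary s;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    while (!ec && it != end) {
        std::error_code entry_ec;
        if (it->is_regular_file(entry_ec) && !entry_ec) {
            ++s.regular_files;
            const std::uintmax_t size = it->file_size(entry_ec);
            if (!entry_ec) s.total_bytes += size;  // skip unreadable sizes
        }
        it.increment(ec);
    }
    assert(s.regular_files > 0 || s.total_bytes == 0);
    return s;
}
```

Running it once per implementation on the same tree and comparing both fields is enough to notice a mismatch, though not to explain it.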
from filesystem.
OK, thanks for the information! I'm ready to test.
from filesystem.