Comments (3)
Your upload might well finish before I'm done implementing this! ;)
Multi-threading in Python usually gives questionable results because of the GIL, but in this case, I think most of the time is spent waiting for SmugMug to reply, so there could be a large gain from using multi-threading. I'll have a look, it could be fun to implement.
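To illustrate the point about waiting on the network, here is a minimal sketch. It is not SmugCLI code: `fake_request` is a hypothetical stand-in that simply sleeps, the way a thread would while waiting for a server reply. Sleeping releases the GIL, so the threaded version should finish in roughly the time of one request rather than the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(i, delay=0.05):
    """Hypothetical stand-in for a network call; sleeping releases the GIL."""
    time.sleep(delay)
    return i


def run_sequential(n):
    # One request after another: total time is about n * delay.
    return [fake_request(i) for i in range(n)]


def run_threaded(n, workers=8):
    # Requests overlap while each thread waits; map() preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_request, range(n)))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(run_sequential, 8)
    _, t_thr = timed(run_threaded, 8)
    print("sequential: %.2fs, threaded: %.2fs" % (t_seq, t_thr))
```

CPU-bound work (hashing or resizing images, for example) would not see the same gain, since only one thread can run Python bytecode at a time.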
from smugcli.
The sync command now runs multi-threaded! I'm sorry that took a while to implement. I needed to completely rewrite the sync algorithm and change lots of surrounding code, but before I could do that, I had to write thorough unit tests to make sure I wouldn't break anything.
Let me know how you like it, whether it runs faster for you or if you find any issues.
Because files are now processed in parallel, providing meaningful feedback on the command line posed its own challenge. I implemented a terminal-based floating text rendering (where each status update overwrites the previous printout). This allows users to follow the progress in real time without cluttering the terminal output. I tested that on Windows, Cygwin, Linux and Mac and it all worked fine (you need to pip install colorama for this to work on Windows). Let me know if that works for you.
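For readers curious how in-place status updates of this kind can work, here is a minimal sketch using standard ANSI escape codes. It is an illustration under assumptions, not the rendering code SmugCLI actually uses: the `StatusBoard` class and its `render` method are hypothetical names. On Windows, calling `colorama.init()` first is one way to get these escape codes interpreted.

```python
import sys
import threading


class StatusBoard:
    """Redraws a block of status lines in place (hypothetical helper)."""

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stdout
        self._drawn = 0                 # lines printed by the previous render
        self._lock = threading.Lock()   # worker threads may render concurrently

    def render(self, lines):
        with self._lock:
            out = []
            if self._drawn:
                # CSI n A: move the cursor up to the top of the previous block.
                out.append("\x1b[%dA" % self._drawn)
            # Return to column 0, then erase from the cursor to end of screen.
            out.append("\r\x1b[J")
            for line in lines:
                out.append(line + "\n")
            self.stream.write("".join(out))
            self.stream.flush()
            self._drawn = len(lines)


if __name__ == "__main__":
    import time

    board = StatusBoard()
    for pct in range(0, 101, 20):
        board.render(["a.jpg: %d%%" % pct, "b.jpg: %d%%" % (pct // 2)])
        time.sleep(0.2)
```

The sketch assumes no status line is wider than the terminal; a wrapped line would occupy two rows and throw off the cursor arithmetic.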
I implemented three levels of parallelism: folder, file and upload. You can override the default parallelism by doing:
smugcli.py sync ... --folder_threads=4 --file_threads=16 --upload_threads=3
When you are happy with a certain config, you can save it to be used as default next time by doing:
smugcli.py sync --set_defaults --folder_threads=4 --file_threads=16 --upload_threads=3
The --folder_threads parameter specifies the number of folders to process in parallel. --file_threads corresponds to the number of files to read in parallel and compare to the server-side version. If files need to be uploaded, --upload_threads controls the number of files uploaded in parallel to the SmugMug servers. Keep in mind that with a large --file_threads value, more files have to be held in memory simultaneously. If you upload many large video files, this can add up to a very large amount of RAM. Let me know if that's a problem.
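As a rough picture of how three independently sized levels can fit together, here is a minimal sketch using one thread pool per level. It is an assumption-laden illustration, not SmugCLI's actual sync algorithm: `sync_tree`, `needs_upload` and `upload` are hypothetical names, the folder structure is a flat dict, and the default sizes simply mirror the example command above rather than the tool's real defaults.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def sync_tree(folders, needs_upload, upload,
              folder_threads=4, file_threads=16, upload_threads=3):
    """folders: dict of folder name -> list of file names (hypothetical shape)."""
    uploaded = []
    lock = threading.Lock()

    with ThreadPoolExecutor(folder_threads) as folder_pool, \
         ThreadPoolExecutor(file_threads) as file_pool, \
         ThreadPoolExecutor(upload_threads) as upload_pool:

        def do_upload(path):
            upload(path)
            with lock:
                uploaded.append(path)

        def process_file(path):
            # Stand-in for reading the file and comparing it to the server copy.
            if needs_upload(path):
                # The file worker waits here, so at most upload_threads
                # uploads are ever in flight at once.
                upload_pool.submit(do_upload, path).result()

        def process_folder(item):
            name, files = item
            futures = [file_pool.submit(process_file, "%s/%s" % (name, f))
                       for f in files]
            for fut in futures:
                fut.result()  # re-raises any exception from a file worker

        # Consume the iterator so folder-level exceptions surface here.
        list(folder_pool.map(process_folder, folders.items()))

    return sorted(uploaded)
```

Because work only flows from the folder pool to the file pool to the upload pool, a task never waits on its own pool, which keeps this particular layout free of deadlocks. Handling nested subfolders recursively would need more care.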
Because many threads are now working in parallel on the same folder hierarchy, I had to implement caching to make sure that each thread won't fetch the same nodes multiple times from the server. I then had to implement a garbage collection policy to keep SmugCLI from storing the whole SmugMug folder hierarchy in cache (in your case, the JSON metadata for 6 million photos and their parent folders). I tested this on my photo library, but it's a fairly small dataset. Please keep an eye on memory usage and let me know how this works out with your image collection.
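The comment does not say which eviction policy was chosen, so the following is only a sketch of the general idea with a least-recently-used policy swapped in as an assumption. `NodeCache`, `fetch` and `max_entries` are hypothetical names, not part of SmugCLI. Each entry is a future, so threads asking for the same node at the same time share a single fetch.

```python
import threading
from collections import OrderedDict
from concurrent.futures import Future


class NodeCache:
    """Thread-safe cache with LRU eviction (assumed policy, for illustration)."""

    def __init__(self, fetch, max_entries=1000):
        self._fetch = fetch              # callable: path -> node metadata
        self._max = max_entries
        self._lock = threading.Lock()
        self._entries = OrderedDict()    # path -> Future holding the node

    def get(self, path):
        with self._lock:
            fut = self._entries.get(path)
            if fut is not None:
                self._entries.move_to_end(path)   # mark as recently used
                owner = False
            else:
                fut = Future()
                self._entries[path] = fut
                owner = True
                # Evict the least recently used entries beyond the limit.
                while len(self._entries) > self._max:
                    self._entries.popitem(last=False)
        if owner:
            # Only the thread that created the entry talks to the server.
            try:
                fut.set_result(self._fetch(path))
            except Exception as exc:
                with self._lock:
                    if self._entries.get(path) is fut:
                        del self._entries[path]   # do not cache failures
                fut.set_exception(exc)
        # Other threads block here until the owner has filled the future.
        return fut.result()
```

A smaller `max_entries` trades extra server round trips for lower memory use, which is the same trade-off described above for very large libraries.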
Sorry for the late reply. I have not been on GitHub very often recently.
I just updated my local clone of your repo this morning and WOW!! It is definitely faster.
I will watch memory usage and let you know if there are any problems.
I have been using smugcli continuously for three months and so far it has uploaded 4 million files successfully. Thanks so much for this awesome tool!