Comments (11)
I have some code for doing parallel uploads to S3 in an internal repo and will see if I can port it into deb-s3.
from deb-s3.
Thanks for the response! If you're talking about parallel uploads within deb-s3 that'd be cool, although not quite what I meant by this issue. The use-case I'm hitting is running multiple instances of deb-s3 --bucket foo --codename bar simultaneously on many boxes.
Parallelizing the S3 transfers would help to a point, but would still put more work into the single serialized track than the solution I'm experimenting with. As long as the parallel-safe /pool transfers are lumped in with the unsafe /dists alterations, I'm not sure it could be as fast as splitting them apart. Totally understand if that's more complexity than you want to incorporate though. I don't mind maintaining our own fork.
Regarding parallel uploads, I can say that I saw a 90% speedup after hooking up https://github.com/grosser/parallel to verify with 20 threads. According to this article, S3 bandwidth should scale up nicely with upload as well, so that'd be a nice move too.
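A minimal sketch of the idea, using only Ruby's standard library rather than the parallel gem mentioned above. The helper name `parallel_each` and the fake "upload" block are assumptions for illustration and are not part of deb-s3:

```ruby
# Hypothetical helper (not deb-s3 code): run a block over items with a
# fixed-size pool of threads, which suits I/O-bound work such as uploads.
def parallel_each(items, threads: 20, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once closed and drained, Queue#pop returns nil

  results = Queue.new
  workers = Array.new([threads, items.size].min) do
    Thread.new do
      # Each worker pulls items until the queue is empty.
      while (item = queue.pop)
        results << [item, block.call(item)]
      end
    end
  end
  workers.each(&:join)

  # Collect [item, result] pairs into a hash keyed by item.
  Array.new(results.size) { results.pop }.to_h
end

# Stand-in for an upload: just computes the key a package would get.
packages = %w[a_1.0_amd64.deb b_2.1_amd64.deb c_0.3_all.deb]
uploaded = parallel_each(packages, threads: 2) { |pkg| "pool/#{pkg}" }
```

With the parallel gem installed, roughly the same thing can be written as `Parallel.each(packages, in_threads: 20) { |pkg| ... }`.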
from deb-s3.
Simultaneous execution from multiple hosts is completely different and very easy to get wrong. It is a hard problem to actually do it right, because multiple hosts will be writing to the same location on a remote API. You'd be looking at distributed locking/versioning, and ensuring two hosts don't make the call at the same time is hard. I don't plan on trying to enable that because it wasn't a primary use case. You can whip it up for controlled usage, but it is hard to actually get it right. Even with the linked commit with the path, it requires more work to do it across many physical systems vs many processes on a single system.
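To make the failure mode concrete, here is a small in-memory simulation of two hosts doing an unsynchronized read-modify-write on the same manifest. The hash standing in for the remote store, the simplified key `dists/bar/Packages`, and the package names are all assumptions for illustration; none of this is the S3 API or deb-s3 code:

```ruby
# In-memory stand-in for the remote object store (hypothetical).
store = { "dists/bar/Packages" => ["base_1.0"] }

# Both hosts read the manifest before either one writes it back.
host_a_view = store["dists/bar/Packages"].dup
host_b_view = store["dists/bar/Packages"].dup

host_a_view << "app_2.0"   # host A adds its package locally
host_b_view << "tool_3.1"  # host B adds its package locally

store["dists/bar/Packages"] = host_a_view # A writes first
store["dists/bar/Packages"] = host_b_view # B overwrites A's change

final_manifest = store["dists/bar/Packages"]
puts final_manifest.inspect # prints ["base_1.0", "tool_3.1"]
```

Host A's package is silently dropped from the manifest, which is why the manifest update needs some form of serialization when several hosts publish at once.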
from deb-s3.
Oh yeah, totally agree. Didn't even occur to me to have deb-s3 do locking. That'd be crazy hard with S3's eventual consistency.
I just split the parts of deb-s3 that needed to be locked from the parts that didn't, and was happy to let Jenkins manage the mutexing since it does that very easily.
from deb-s3.
@htmldoug #59 can help you implement this.
I have a function upload_package_s3 with an each.
I don't know how to parallelize in Ruby, but it may be easy to do.
from deb-s3.
If implemented, I think it would make sense through a plugin/extension, so maybe adding a way to load a ~/.deb-s3.rb file or something along those lines. Then you could inject something that used something like Redis or memcache to act as a distributed lock.
from deb-s3.
@krobertson
There is no need to do this with my work.
Since I act on manifest / release / packages separately, package uploads happen in a single, separate part.
So this part can be made parallel with an option.
from deb-s3.
@krobertson Adding any kind of locking to deb-s3 seems like a bad idea. My unsolicited advice is that it's better to let this be done externally so users don't ask "does a plugin exist to support my external locking system." I'd have deb-s3 do one thing and do it well.
To that end, all that's needed to support external locking strategies would be to allow deb-s3 to also do "half a thing" and do it well. Namely upload to /pool in one run, and upload to /dists in a second.
My fork has been doing exactly that on our prod application build for a week and it's gone from one 10-minute deb-s3 call to a 1 minute 30 second (multiple-host-safe) /pool upload and a 1 minute (globally synchronized) /dists upload.
For us, that's the difference between one big deb-s3 upload ... and split-up deb-s3 upload *pool only* and deb-s3 upload *dist only* calls...
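The split described above can be sketched as two phases. This is only an in-process simulation under stated assumptions: the function names `upload_pool_only` and `upload_dists_only` are hypothetical, the `Mutex` stands in for an external lock such as the one Jenkins provides, and plain arrays stand in for S3 objects. It is not the fork's actual code:

```ruby
DISTS = []             # stand-in for the shared manifest under /dists
DISTS_LOCK = Mutex.new # stand-in for an external lock (e.g. a CI mutex)

# Phase 1: every package gets its own key under /pool, so builds can
# run this at the same time without coordinating with each other.
def upload_pool_only(packages)
  packages.map { |pkg| "pool/#{pkg}" } # returns the keys that were "written"
end

# Phase 2: a read-modify-write of the shared manifest, so it is
# serialized behind the lock.
def upload_dists_only(packages)
  DISTS_LOCK.synchronize do
    current = DISTS.dup
    DISTS.replace((current + packages).uniq.sort)
  end
end

# Three hypothetical builds publishing at once.
builds = [%w[app_2.0], %w[tool_3.1], %w[lib_0.9]]

pool_keys = builds.map { |pkgs| Thread.new { upload_pool_only(pkgs) } }
                  .flat_map(&:value)
builds.map { |pkgs| Thread.new { upload_dists_only(pkgs) } }.each(&:join)
```

The point of the design is that only the short manifest phase has to wait on the lock, while the slow package transfers run unsynchronized.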
I'm not sure how to explain it any more clearly.
Are you interested in incorporating an approach like this? If so, I'd be happy to clean up my code and contribute a PR for your review. Otherwise, I'll stop trying to explain and leave you to whatever approach you think is best.
Either way, I'd be happy to buy you a beer next time I'm in SF or you're in DC for contributing this to the open source community.
@guilhem Check out parallel. Very simple and great docs. Here's how we're using it. Seeing huge gains.
Cheers!
from deb-s3.
I'll take another look at it this weekend... my plugin suggestion was mostly around loading a Ruby file and letting someone monkey patch, if they really wanted to. But I agree it is not necessarily ideal.
from deb-s3.
I've added locking support which should make the multi-host upload safer. It doesn't do anything in parallel (Ruby is really bad at doing things in parallel anyway unless you use something like active_job or resque). See pull request #89.
from deb-s3.
BTW eventual consistency is not a problem unless you use the US Standard region. See http://shlomoswidler.com/2009/12/read-after-write-consistency-in-amazon.html
from deb-s3.