Comments (15)

mindstorm38 commented on May 31, 2024

The main function and class are implemented! Tomorrow I will work on using this new function and testing its performance.

from portablemc.

mindstorm38 commented on May 31, 2024

I was thinking about the implementation of this. In Python, threads will not solve the problem because of the GIL (global interpreter lock), so I will need to use sub-processes instead.

By default, the number of processes will be the number of processor cores (or processor threads?). To configure that, I was thinking of an argument --download-workers <count> in the main command. This will be implemented in a new download function for bundles of files, and the old single-file download function will be deprecated.
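
A minimal sketch of how the proposed --download-workers option could be parsed, assuming argparse. Only the flag name comes from the comment above; build_parser and the fallback value of 1 are illustrative and not portablemc's actual code. On the "cores or threads?" question: os.cpu_count() reports logical CPUs (hardware threads) and may return None.

```python
import argparse
import os

def build_parser():
    # Hypothetical parser; only the --download-workers flag comes from the thread.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--download-workers",
        type=int,
        metavar="<count>",
        # os.cpu_count() counts logical CPUs and can return None.
        default=os.cpu_count() or 1,
        help="number of download worker processes",
    )
    return parser

args = build_parser().parse_args(["--download-workers", "4"])
print(args.download_workers)
```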

mindstorm38 commented on May 31, 2024

As you can see in the branch feature/improved-http, I'm using http.client. This is the lowest level I can reach, and it's really practical, so I will use that. As for requests, it's way better than the standard library, but I want to avoid external libraries.

I wouldn't be surprised if most of the current overhead is from initiating connections

Me neither! I will explore this solution!
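
A rough sketch of the connection reuse being discussed, assuming http.client and a server that honours HTTP keep-alive. The function name fetch_many is hypothetical, and this is not the code from the feature/improved-http branch.

```python
import http.client

def fetch_many(conn, paths):
    """Fetch several paths over one connection object instead of one per file."""
    results = {}
    for path in paths:
        conn.request("GET", path)
        response = conn.getresponse()
        # The body must be read completely before the next request is sent.
        results[path] = (response.status, response.read())
    return results
```

With a real server this could be called as fetch_many(http.client.HTTPSConnection(host), paths), where host and paths are placeholders. If the server closes the connection anyway, http.client reconnects on the next request, so the loop still works but loses the saving.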

mindstorm38 commented on May 31, 2024

The following benchmarks time the full installation of the game, as well as the download of the appropriate JVM for Minecraft 1.17.
For 1.17, the total download size is 512,096,029 bytes.

Benchmarks:

  • New download system: 3min07s (2.74 MB/s)
  • Old download system (with richer add-on): 8min09s (1.05 MB/s)
  • Old download system (without richer add-on): 6min22s (1.34 MB/s)
  • Official launcher: between 3min10s and 4min00s (from clicking the start button to finalization)
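
The MB/s figures above follow from dividing the total size by the elapsed time, taking 1 MB as 1,000,000 bytes. A small check of that arithmetic (the function name is illustrative):

```python
TOTAL_BYTES = 512_096_029  # total download size for 1.17, as stated above

def throughput_mb_per_s(total_bytes, minutes, seconds):
    """Average throughput in MB/s, with 1 MB = 1,000,000 bytes."""
    return total_bytes / (minutes * 60 + seconds) / 1_000_000

timings = {
    "new": (3, 7),
    "old, with richer": (8, 9),
    "old, without richer": (6, 22),
}
for label, (minutes, seconds) in timings.items():
    print(f"{label}: {throughput_mb_per_s(TOTAL_BYTES, minutes, seconds):.2f} MB/s")
```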

These results are fantastic! I was not expecting this to be more than twice as fast as the old system, or as fast as the official launcher, which runs native code.

The internal code is already prepared for a future improvement using multi-threaded downloading. These results also suggest that the official launcher only uses one process.

Now I need to remove the old system and rework terminal messages.

Ristovski commented on May 31, 2024

Very impressive! Having it be on par with the official launcher is amazing. Good job :)

mindstorm38 commented on May 31, 2024

A dedicated issue will be opened specifically for parallel downloading. Release 1.4 is coming soon.

mindstorm38 commented on May 31, 2024

I don't know how the official launcher downloads assets, but this is a good idea! I'll reserve it for the release after next.

Ristovski commented on May 31, 2024

Perhaps concurrent.futures or multiprocessing can be used to implement a thread worker pool, but as you said, it would need to support limiting the number of threads.

Another thing is making sure the logging output does not arrive out of order, since that could break any progress indicators.
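
One possible shape for both points, sketched with concurrent.futures: max_workers caps the number of threads, and progress is reported only from the calling thread, so lines cannot interleave. The names download_all, download_one and report are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def download_all(urls, download_one, max_workers=4, report=print):
    """Run download_one(url) for each URL on at most max_workers threads."""
    done = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, url): url for url in urls}
        # Only this loop, running in the calling thread, reports progress,
        # so progress lines are never interleaved by the workers.
        for future in as_completed(futures):
            future.result()  # re-raises any exception from the worker
            done += 1
            report(f"[{done}/{len(futures)}] {futures[future]}")
    return done
```

Files may still finish in any order; only the counter is guaranteed to increase monotonically.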

mindstorm38 commented on May 31, 2024

Yes, the logging output concerns me. This will need a little synchronization that is customizable enough to be adapted in the "richer" add-on.

mindstorm38 commented on May 31, 2024

This is harder than I expected. I don't want to make this slower than before for the same number of files. Python is not well designed for multiprocessing, and transferring data to other processes is not simple (especially when it comes to passing strings in shared memory).

Because of that, I will begin with an improvement of the sequential download. The current sequential downloading is not really optimized because I'm opening and closing an HTTP tunnel for every file. I don't know how much time it will save, but this is a first step.

I had in mind a downloading system where all needed files are listed and then downloaded together in the same HTTP tunnel (or several tunnels, one for each CDN domain).
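
A small sketch of the "list everything first, then one tunnel per domain" idea: grouping URLs by host so that each group can share a single connection. The function name group_by_host is an assumption for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def group_by_host(urls):
    """Map each (scheme, host) pair to the request targets to fetch from it."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        groups[(parts.scheme, parts.netloc)].append(target)
    return dict(groups)
```

Each key then corresponds to one connection to open, and each value to the files requested over it in sequence.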

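Ristovski commented on May 31, 2024

Something like this should work, I think:

from multiprocessing.pool import ThreadPool

urls = []  # list of asset URLs
NUM_THREADS = 4  # upper bound on concurrent downloads

def download_asset(url):
    [...]  # download one asset (body elided)

# Iterate over the results so that every download has finished before the
# "with" block exits, because leaving the block terminates the pool.
with ThreadPool(NUM_THREADS) as pool:
    for _ in pool.imap_unordered(download_asset, urls):
        pass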

mindstorm38 commented on May 31, 2024

What I don't like about this solution is that it is not possible to keep state between calls. I would prefer a system where a few processes stay running and process downloads from a shared list or a pipe, keeping the connection open.
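
A minimal sketch of that long-lived worker design. To stay short it uses threads and queue.Queue rather than the processes asked for above; with processes, multiprocessing.Queue would play the same role. open_connection is a hypothetical factory returning a callable that fetches one item, standing in for a real connection that stays open.

```python
import queue
import threading

def worker(tasks, results, open_connection):
    # State that survives between downloads: created once per worker.
    conn = open_connection()
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        results.put((item, conn(item)))

def run_workers(items, open_connection, count=2):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, open_connection))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return [results.get() for _ in range(len(items))]
```

Error handling is left out: if a download raises, that worker stops and its remaining results never arrive, so a real implementation would need to catch and report failures.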

Ristovski commented on May 31, 2024

Then I suppose using requests.session() (or some urllib equivalent for enabling HTTP keep-alive) could perhaps be enough for now? I wouldn't be surprised if most of the current overhead is from initiating connections.

mindstorm38 commented on May 31, 2024

Pre-release soon :)

mindstorm38 commented on May 31, 2024

Please test the feature or fix in the latest pre-release.
Comment on this issue to say whether the problem is fixed. This issue will be closed before the final release 1.1.4 if no problems are discovered.
