
Comments (4)

andreittr commented on September 24, 2024

Taking inspiration from other projects that do similar things (e.g., buildroot) here's a rough design sketch, along with some open questions:

  • A configurable location for cached archives: per-project in .config, falling back to an env var (UKBUILD_DLCACHE or something similar), then to a hardcoded default; it's important that this location is outside the build directory, so it doesn't get trashed by distclean
    • optional: remove the archives from the build dir and extract them directly from the cache
    • or: symlink archives from cache to build dir (maximum compat w/ existing scripts and/or workflows)
  • Feature should be on by default but offer opt-out -- first-time user gets caching OOTB, paranoid users can turn it off (dedicated Kconfig option?)
  • Index packages: by PACKAGE+VERSION or FULL URL?
    • FULL URL is ostensibly safest (and most straight-forward to implement), but might lead to false cache misses when different mirrors are used for download
    • PACKAGE+VERSION requires more effort but might improve cache utilization (IIRC buildroot does this one)
  • Integrity checking: SHA256sum of the archive. We already support this in the build system; should integrate it. Question is, where to get an authoritative hash from:
    1. hash is in the package's Makefile.uk: marginally more dev effort but ensures all users get the correct packages (this is how buildroot does it)
    2. hash gets computed & stored in the cache after download: easier on devs, but no guarantee we got the right archive
    3. prefer (1) but fall back to (2) with a warning? Or just don't cache the package unless it defines a hash in its Makefile.uk. This would help streamline the dev process while keeping the advantages of (1)
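
The location fallback and the two indexing options in the list above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not Unikraft build-system code: `UKBUILD_DLCACHE` is the tentative variable name from the comment, while `resolve_cache_dir`, `key_by_url`, `key_by_package`, the default path, and the example URLs and version are all hypothetical.

```python
import hashlib
import os
from pathlib import Path

# Hypothetical hardcoded default; the thread does not settle on a path.
DEFAULT_CACHE = Path(os.path.expanduser("~/.cache/unikraft/dl"))


def resolve_cache_dir(config_value=None, env=None):
    """Pick the cache directory: project config, then env var, then default."""
    env = os.environ if env is None else env
    if config_value:
        return Path(config_value)
    if env.get("UKBUILD_DLCACHE"):
        return Path(env["UKBUILD_DLCACHE"])
    return DEFAULT_CACHE


def key_by_url(url):
    """Index by full URL: simple, but every mirror yields a different key."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def key_by_package(name, version):
    """Index by package and version: the same key whatever the mirror."""
    return f"{name}-{version}"


if __name__ == "__main__":
    print(resolve_cache_dir(env={"UKBUILD_DLCACHE": "/var/cache/uk"}))
    # Two mirrors of the same archive: distinct URL keys, one package key.
    a = key_by_url("https://mirror-a.example/musl-1.2.3.tar.gz")
    b = key_by_url("https://mirror-b.example/musl-1.2.3.tar.gz")
    print(a != b, key_by_package("musl", "1.2.3"))
```

The URL-based key shows where the false cache misses come from: the same tarball fetched from two mirrors lands under two different keys, whereas the package-based key does not depend on the mirror.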

Since we're reworking fetch to a large degree with this change, I propose also merging fetch and fetch2 to support an arbitrary number of mirrors, with fallback logic and perhaps some clever autoselection of the best mirror. The latter would very much help with GNU packages whose main server is pretty slow to access from outside the US.
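
A minimal sketch of the mirror fallback idea, assuming mirrors are simply tried in the order given (no automatic selection of the best mirror). The names `fetch_with_fallback` and `http_fetch` are hypothetical and do not correspond to the existing fetch/fetch2 rules.

```python
import urllib.request


def http_fetch(url, timeout=30):
    """Download a URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(mirrors, fetch=http_fetch):
    """Try each mirror in order; return (url, data) from the first success."""
    errors = []
    for url in mirrors:
        try:
            return url, fetch(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all mirrors failed:\n" + "\n".join(errors))
```

Passing the download function as a parameter keeps the fallback logic independent of the transport, so it accepts any number of mirrors and can be exercised without network access.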

from unikraft.

kubanrob commented on September 24, 2024

My use case is a bit different, as I am not interested in a cache but in a prepopulated local directory that avoids any fetching at build time. However, I think this fits pretty well with this feature.

So the best case for me would be:

  • the cache is easy to populate without tooling / without invoking the unikraft build system
  • no special support required from the library
    • apart from the current use of the fetch/fetch2/clone rule
  • fetching can be switched off completely
  • support for the git clone rule, if it becomes more widely used in the future

How I would intend to use it, for example with app-helloworld and lib-musl:

  • create the cache directory
  • use app-helloworld and the current lib-musl without any modification
  • put the musl source tarball at a specific location in the cache directory
  • configure the build system to
    • not fetch anything
    • use the cache directory instead


andreittr commented on September 24, 2024

Hi @kubanrob ! I believe the proposed implementation would support your use case out-of-the-box; some comments inline.

> the cache is easy to populate without tooling / without invoking the unikraft build system

As long as (1) the path to a cached blob is stable and well-defined (which IMO it should be) and (2) the file you supply has the expected name and checksum, I don't see why you couldn't use literally any tool (unikraft or otherwise) to populate this cache. We can make sure (1) holds when designing the cache layout.
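
To illustrate the two conditions, here is a rough sketch of a stable path scheme and a checksum test that any external tool could satisfy. The `<cache>/<package>/<filename>` layout and the helper names are assumptions made for this example; the thread does not define the real layout.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_path(cache_dir, package, filename):
    """Hypothetical stable layout: <cache_dir>/<package>/<filename>."""
    return Path(cache_dir) / package / filename


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid_entry(path, expected_sha256):
    """True if the file exists and matches the expected SHA-256 digest."""
    path = Path(path)
    return path.is_file() and sha256_of(path) == expected_sha256.lower()


if __name__ == "__main__":
    # Populate a throwaway cache by hand, as a plain file copy would.
    with tempfile.TemporaryDirectory() as d:
        entry = cache_path(d, "musl", "musl.tar.gz")
        entry.parent.mkdir(parents=True)
        entry.write_bytes(b"placeholder archive contents")
        expected = hashlib.sha256(b"placeholder archive contents").hexdigest()
        print(is_valid_entry(entry, expected))
```

Because the check only looks at the path and the digest, it does not matter whether the file was put there by the build system, by `cp`, or by a packaging script.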

> not special support by the library required

Aside from the possible requirement/recommendation to specify a checksum (which we already do in some libs), there should be no changes whatsoever to individual library makefiles; all changes should be transparently contained in the main build system. Additionally, the new fetch should be as backwards-compatible as possible, unless there are really compelling arguments to the contrary.

> fetching can be switched off completely

While not something I considered, I guess this is pretty easy to implement with a Kconfig option (or, in a pinch, by disabling your internet access).

> configure the buildsystem to do not fetch anything & use the cache directory instead

I think we should always go through the cache by default, regardless of whether fetching is enabled. Only on a cache miss should unikraft try to fetch from the internet, and it should error out if the fetch fails or fetching is disabled.
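
The lookup order described here could look roughly like the sketch below. It is an assumption-laden illustration: `obtain`, `FetchDisabledError`, and the `allow_fetch` switch (standing in for the proposed Kconfig option) are hypothetical names, and integrity checking is left out for brevity.

```python
import tempfile
from pathlib import Path


class FetchDisabledError(RuntimeError):
    """Raised on a cache miss when fetching has been switched off."""


def obtain(cached, url, fetch, allow_fetch=True):
    """Return the cached file, downloading it only on a cache miss."""
    cached = Path(cached)
    if cached.is_file():
        return cached  # cache hit: the network is never touched
    if not allow_fetch:
        raise FetchDisabledError(f"{cached.name} not in cache, fetching is off")
    data = fetch(url)  # any exception from the fetcher propagates as an error
    cached.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a failed write leaves no partial entry.
    partial = cached.with_name(cached.name + ".part")
    partial.write_bytes(data)
    partial.replace(cached)
    return cached


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "musl" / "musl.tar.gz"
        obtain(target, "https://mirror.example/musl.tar.gz", lambda url: b"x")
        # Second call is a hit, so it succeeds even with fetching disabled.
        print(obtain(target, "unused", None, allow_fetch=False) == target)
```

With this ordering, a fully prepopulated cache and `allow_fetch=False` gives the offline build described earlier in the thread without any separate code path.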


kubanrob commented on September 24, 2024

Hi @andreittr!

> I believe the proposed implementation would support your use case out-of-the-box

Yes, I would not expect it to be hard to support that use case with the proposed implementation.
Having this alternative case in mind may be helpful for some implementation details :)

> As long as (1) the path to a cached blob is stable and well-defined (which IMO it should be) and (2) the file you supply has the expected name and checksum, [...] We can make sure (1) holds when designing the cache layout.

That would be a useful property, probably regardless of the use case.

> Aside from the possible requirement/recommendation to specify a checksum (which we already do in some libs), there should be no changes whatsoever to individual library makefiles; all changes should be transparently contained in the main build system.

This is great to hear; the open question about the indexing had me a bit worried about that.

