nroi / cpcache
central pacman cache
License: MIT License
cpcache still seems to have issues when downloads are aborted (e.g. interrupting pacman while it's downloading a package).
I am using your official PKGBUILD from the AUR to compile cpcache on my PC. For some reason, the 100 kb/s speed limit is in effect even in prod mode. I don't know how to debug this, as I am not experienced with Elixir. Build log:
...
==> Making package: cpcache-git r340.546868c-1 (Tue 23 Apr 2019 21:31:36 CEST)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> Retrieving sources...
-> Updating cpcache git repo...
Fetching origin
-> Found sysuser.conf
-> Found cpcache.install
-> Found cpcache.service
==> Validating source files with sha256sums...
cpcache ... Skipped
sysuser.conf ... Passed
cpcache.install ... Passed
cpcache.service ... Passed
==> Removing existing $srcdir/ directory...
==> Extracting sources...
-> Creating working copy of cpcache git repo...
Cloning into 'cpcache'...
Done.
==> Starting pkgver()...
==> Updated version: cpcache-git r364.f827c28-1
==> Sources are ready.
==> Making package: cpcache-git r364.f827c28-1 (Tue 23 Apr 2019 21:31:37 CEST)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> WARNING: Using existing $srcdir/ tree
==> Starting pkgver()...
==> Entering fakeroot environment...
==> Starting package()...
* creating /tmp/tmp.ys3FwElx3v/.mix/archives/hex-0.19.0
* creating /tmp/tmp.ys3FwElx3v/.mix/rebar
* creating /tmp/tmp.ys3FwElx3v/.mix/rebar3
* Getting eyepatch (https://github.com/nroi/eyepatch.git)
remote: Enumerating objects: 309, done.
remote: Counting objects: 100% (309/309), done.
remote: Compressing objects: 100% (148/148), done.
remote: Total 309 (delta 145), reused 274 (delta 115), pack-reused 0
Receiving objects: 100% (309/309), 49.47 KiB | 610.00 KiB/s, done.
Resolving deltas: 100% (145/145), done.
Resolving Hex dependencies...
Dependency resolution completed:
Unchanged:
artificery 0.2.6
certifi 2.5.1
distillery 2.0.10
hackney 1.15.1
idna 6.0.0
jason 1.1.1
metrics 1.0.1
mimerl 1.2.0
parse_trans 3.3.0
ssl_verify_fun 1.1.4
toml 0.5.2
unicode_util_compat 0.4.1
* Getting distillery (Hex package)
* Getting hackney (Hex package)
* Getting toml (Hex package)
* Getting jason (Hex package)
* Getting certifi (Hex package)
* Getting idna (Hex package)
* Getting metrics (Hex package)
* Getting mimerl (Hex package)
* Getting ssl_verify_fun (Hex package)
* Getting unicode_util_compat (Hex package)
* Getting parse_trans (Hex package)
* Getting artificery (Hex package)
===> Compiling parse_trans
===> Compiling mimerl
===> Compiling metrics
===> Compiling unicode_util_compat
===> Compiling idna
==> eyepatch
Compiling 1 file (.ex)
Generated eyepatch app
==> jason
Compiling 8 files (.ex)
Generated jason app
==> ssl_verify_fun
Compiling 7 files (.erl)
Generated ssl_verify_fun app
===> Compiling certifi
===> Compiling hackney
==> artificery
Compiling 10 files (.ex)
Generated artificery app
==> distillery
Compiling 33 files (.ex)
warning: variable "err" is unused (if the variable is not meant to be used, prefix it with an underscore)
lib/mix/lib/releases/runtime/pidfile.ex:50
warning: "else" shouldn't be used as the only clause in "defp", use "case" instead
lib/mix/lib/releases/checks.ex:68
Generated distillery app
==> toml
Compiling 10 files (.ex)
Generated toml app
==> cpcache
Compiling 11 files (.ex)
Generated cpcache app
==> Assembling release..
==> Building release cpcache:0.1.0 using environment prod
==> Including ERTS 10.3.1 from /usr/lib/erlang/erts-10.3.1
==> Packaging release..
Release successfully built!
To start the release you have built, you can use one of the following tasks:
# start a shell, like 'iex -S mix'
> _build/prod/rel/cpcache/bin/cpcache console
# start in the foreground, like 'mix run --no-halt'
> _build/prod/rel/cpcache/bin/cpcache foreground
# start in the background, must be stopped with the 'stop' command
> _build/prod/rel/cpcache/bin/cpcache start
If you started a release elsewhere, and wish to connect to it:
# connects a local shell to the running node
> _build/prod/rel/cpcache/bin/cpcache remote_console
# connects directly to the running node's console
> _build/prod/rel/cpcache/bin/cpcache attach
For a complete listing of commands and their use:
> _build/prod/rel/cpcache/bin/cpcache help
==> Tidying install...
-> Removing libtool files...
-> Purging unwanted files...
-> Removing static library files...
-> Stripping unneeded symbols from binaries and libraries...
-> Compressing man and info pages...
==> Checking for packaging issues...
==> WARNING: Package contains reference to $srcdir
usr/share/cpcache/lib/hackney-1.15.1/ebin/hackney_util.beam
usr/share/cpcache/lib/hackney-1.15.1/ebin/hackney_manager.beam
usr/share/cpcache/lib/hackney-1.15.1/ebin/hackney_app.beam
usr/share/cpcache/lib/hackney-1.15.1/ebin/hackney_cookie.beam
usr/share/cpcache/lib/hackney-1.15.1/ebin/hackney_headers.beam
... (many files getting installed)
usr/share/cpcache/lib/artificery-0.2.6/ebin/Elixir.Artificery.Console.Table.beam
usr/share/cpcache/lib/artificery-0.2.6/ebin/Elixir.Artificery.Entry.beam
usr/share/cpcache/lib/artificery-0.2.6/ebin/Elixir.Artificery.Option.beam
usr/share/cpcache/lib/artificery-0.2.6/ebin/Elixir.Artificery.Console.Prompt.beam
==> Creating package "cpcache-git"...
-> Generating .PKGINFO file...
-> Generating .BUILDINFO file...
-> Adding install file...
-> Generating .MTREE file...
-> Compressing package...
==> Leaving fakeroot environment.
==> Finished making: cpcache-git r364.f827c28-1 (Tue 23 Apr 2019 21:32:06 CEST)
==> Cleaning up...
...
So far, I haven't found a method to reproduce this issue.
Example: Client initiates download request of file community/os/x86_64/python-alembic-1.0.7-1-any.pkg.tar.xz. The download then stalls and the client aborts. Server log (cpcache) says:
Feb 10 21:33:43 archWS cpcache[22589]: 21:33:43.646 [info] Serve file /var/cache/cpcache/pkg/community/os/x86_64/python-alembic-1.0.7-1-any.pkg.tar.xz via HTTP.
Feb 10 21:33:59 archWS cpcache[22589]: 21:33:59.654 [info] Connection closed by client during data transfer. File /var/cache/cpcache/pkg/community/os/x86_64/python-alembic-1.0.7-1-any.pkg.tar.xz is incomplete.
The file size on the cpcache server for the given file is 245515 bytes. The actual file size is 246900 bytes.
Hello,
In my environment, a central server might not always be online or reachable, and it might not have much disk space for caching. It would be great if it were possible to distribute the cache over several machines in a LAN.
Other mirrors could be discovered via mDNS, using a service type specific to cpcache.
The servers could be load-balanced client-side, depending on the network speed towards the client. If the network topology were detected using LLDP, clients could be directed to the closest mirror in order to avoid bottlenecks.
The cached files could be balanced or replicated among several caches.
Authentication of mirrors could be done via TLS server certificates from a personal PKI.
That is very advanced, though, and I doubt that anyone is running a LAN so big that this is required.
What do you think?
I have one other question: How does cpcache handle the repo database files?
Generated cpcache app
** (Mix) Could not invoke task "release": 1 error found!
--env : Unknown option
==> ERROR: A failure occurred in package().
    Aborting...
Error making: cpcache-git
Avoid the latency imposed by re-connecting to the same host when the client downloads more than one file (which is usually the case…)
See also: https://github.com/benoitc/hackney#reuse-a-connection
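The connection reuse described in the hackney README boils down to connecting once and issuing several requests on the same socket. A minimal sketch (host and paths are illustrative, not cpcache's actual code):

```elixir
# Open one connection and reuse it for several requests, instead of
# reconnecting per file. Uses hackney's connect/send_request API.
{:ok, conn} = :hackney.connect(:hackney_tcp, "mirror.example.org", 80, [])

for path <- ["/core/os/x86_64/core.db", "/extra/os/x86_64/extra.db"] do
  # send_request reuses the existing socket for each request.
  {:ok, _status, _headers, conn} = :hackney.send_request(conn, {:get, path, [], ""})
  {:ok, _body} = :hackney.body(conn)
end

:ok = :hackney.close(conn)
```

Note that the full response body must be read before the next request can be sent on the same connection.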
Mint is very low-level, so using it could lead to fewer bugs due to:
So far, we have avoided HTTP client libraries that are based on message passing: we wanted the library to allow us to stream the download directly into a file, rather than sending a message for each chunk of data. That was due to performance constraints: we wanted cpcache to be able to run on low-end hardware like the first Raspberry Pi. This might no longer be such a big issue: some users have reported using cpcache on their workstation (rather than a Raspberry Pi or some other low-end device). Also, newer versions of the Raspberry Pi are considerably more powerful, so even with message passing, cpcache might not turn into a resource hog.
First, we need to evaluate the performance impact of using Mint when downloading large files. Here's an example snippet of using Mint to download a file:

defp receive_forever(conn) do
  conn =
    receive do
      message ->
        {:ok, conn, responses} = Mint.HTTP.stream(conn, message)

        # Append every :data chunk from this batch of responses to the file.
        for {:data, _request_ref, data} <- responses do
          :ok = File.write("/tmp/file", data, [:append])
        end

        conn
    end

  receive_forever(conn)
end

@tag :mint
def download() do
  host = "download.media.tagesschau.de"
  path = "/video/2019/0706/TV-20190706-1647-0201.webm.h264.mp4"
  headers = []
  {:ok, conn} = Mint.HTTP.connect(:https, host, 443)
  {:ok, conn, _request_ref} = Mint.HTTP.request(conn, "GET", path, headers, "")
  receive_forever(conn)
end
With the current approach, there is a trade-off concerning the mirrors_auto.num_mirrors variable:
Perhaps there is a way to do a sort of fast preselection before the actual latency tests are run. For instance, ICMP could be used to ping hosts and sort them by their round trip time.
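The ICMP preselection could look roughly like this (a hypothetical sketch; module and function names are not part of cpcache, and it assumes the ping binary is available):

```elixir
# Ping each mirror once and sort by round-trip time, so the more
# expensive latency tests only need to run against the fastest hosts.
defmodule MirrorPreselect do
  def by_rtt(mirror_urls) do
    mirror_urls
    |> Enum.map(fn url -> {url, rtt(URI.parse(url).host)} end)
    |> Enum.reject(fn {_url, rtt} -> rtt == :unreachable end)
    |> Enum.sort_by(fn {_url, rtt} -> rtt end)
    |> Enum.map(fn {url, _rtt} -> url end)
  end

  defp rtt(host) do
    case System.cmd("ping", ["-c", "1", "-W", "1", host]) do
      {output, 0} ->
        # ping prints e.g. "time=12.3 ms"; extract the number.
        case Regex.run(~r/time=([\d.]+)/, output) do
          [_, ms] -> ms |> Float.parse() |> elem(0)
          _ -> :unreachable
        end

      _ ->
        :unreachable
    end
  end
end
```

In practice the pings should run concurrently (e.g. via Task.async_stream) so a handful of unreachable hosts doesn't serialize into several seconds of waiting.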
Even if we try to take care to choose fast mirrors, it's always possible that the chosen mirror ends up being too slow. This is particularly frustrating when a large file is downloaded and the speed suddenly nosedives.
Ideally, cpcache should be able to switch to a (hopefully) better mirror while the download is still in progress. All of this should happen transparently to the client, i.e., the client will just see that the speed is slow at first and then suddenly increases.
Things to consider:
Suppose client A starts to download a file, and client B begins to download the same file a few seconds later, while client A's download still hasn't finished. No new connection to the remote mirror is opened (which is intended); instead, client B uses the download initiated by client A. However, if client A now interrupts the download (i.e., closes the connection), the download will stall for client B.
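One way to handle this is for the server to resume the upstream fetch itself when the initiating client aborts. A sketch under the assumption that the mirror supports HTTP range requests; module and function names are illustrative, not cpcache's actual API:

```elixir
# If the client that initiated the shared download aborts, resume the
# upstream fetch from the current size of the partial file via a Range
# request, so that waiting clients are not left with a stalled download.
defmodule ResumeDownload do
  def resume(url, partial_file) do
    offset = File.stat!(partial_file).size
    headers = [{"Range", "bytes=#{offset}-"}]
    # Status 206 (Partial Content) means the mirror honored the Range header.
    {:ok, 206, _resp_headers, client} = :hackney.request(:get, url, headers, "", [])
    append_body(client, partial_file)
  end

  defp append_body(client, path) do
    case :hackney.stream_body(client) do
      {:ok, chunk} ->
        :ok = File.write(path, chunk, [:append])
        append_body(client, path)

      :done ->
        :ok
    end
  end
end
```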
When building a package with the official PKGBUILD it errors with the stacktrace below.
** (Mix.Releases.Config.LoadError) could not load release config rel/config.exs
** (UndefinedFunctionError) function Mix.Config.Agent.start_link/0 is undefined (module Mix.Config.Agent is not available)
Mix.Config.Agent.start_link()
(stdlib) erl_eval.erl:680: :erl_eval.do_apply/6
(stdlib) erl_eval.erl:449: :erl_eval.expr/5
(stdlib) erl_eval.erl:126: :erl_eval.exprs/5
(elixir) lib/code.ex:232: Code.eval_string/3
(distillery) lib/mix/lib/releases/config/config.ex:281: Mix.Releases.Config.read_string!/1
(distillery) lib/mix/lib/releases/config/config.ex:302: Mix.Releases.Config.read!/1
I am not experienced with Elixir and therefore don't know what this error means.
cpcache fetches a JSON document from https://www.archlinux.org/mirrors/status/json/. The fallback strategy when this URL is not available is to look for a locally cached version of the mirror status.
This obviously only works when cpcache has been started before. A nicer approach would be to make a cached version of the JSON document available on the web (e.g. gist.github.com) and use this as a fallback. This would also be cleaner from a DevOps perspective: We could run integration tests with Docker that would succeed even if archlinux.org is down.
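The layered fallback could be sketched as follows (the gist URL is a placeholder, and the module and function names are illustrative):

```elixir
# Try the live mirror status first, then a web-hosted cached copy,
# and only as a last resort a locally cached copy from a previous run.
defmodule MirrorStatus do
  @primary "https://www.archlinux.org/mirrors/status/json/"
  # Placeholder URL for a periodically updated copy hosted e.g. on a gist:
  @fallback "https://gist.githubusercontent.com/example/mirror-status.json"

  def fetch_json do
    # `with` short-circuits: if either get/1 succeeds, its {:ok, body}
    # is returned directly without trying the later steps.
    with {:error, _} <- get(@primary),
         {:error, _} <- get(@fallback) do
      File.read("/var/cache/cpcache/mirror-status.json")
    end
  end

  defp get(url) do
    case :hackney.request(:get, url, [], "", [:with_body]) do
      {:ok, 200, _headers, body} -> {:ok, body}
      other -> {:error, other}
    end
  end
end
```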
I'm loving this project so far! Just one quick question: would it be possible for cpcache to serve AUR-compiled builds? I use yay as my AUR helper and was wondering if pointing cpcache to its cache directory would do the trick. Thanks!
I noticed the following issue after bumping to the latest git commit. I have tested it on two completely different networks (home and work), so this is definitely not a network issue. Also, while this shows on my router instance running archlinuxarm in lxc, I have been able to reproduce it on my laptop via AUR install as well vanilla Docker as provided in the repo. Tag 0.1.6 shows the same.
19:23:38.024 [warn] Unable to fetch mirror data from https://www.archlinux.org/mirrors/status/json/: {:error, {:option, :server_only, :honor_cipher_order}}
19:23:38.024 [warn] Retry in 5000 milliseconds
19:23:43.024 [warn] Max. number of attempts exceeded. Checking for a cached version of the mirror data…
19:23:43.025 [warn] No mirror data found in cache.
19:23:43.025 [error] GenServer Cpc.MirrorSelector terminating
** (RuntimeError) Unable to fetch mirror statuses
(cpcache) lib/cpc/mirror_selector.ex:125: Cpc.MirrorSelector.handle_info/2
(stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:711: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: :init
Elixir 1.9 has built-in release functionality. Use this instead of distillery.
Use @tag :integration for integration tests. This will make it much easier to gauge whether a failed test case is due to a newly introduced bug, a website that went down, or just the test case being non-deterministic. The idea is inspired by Mint, the HTTP client library.
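With ExUnit this amounts to excluding the tag by default and tagging the affected tests (the test module and the function it calls are illustrative):

```elixir
# In test/test_helper.exs, exclude integration tests by default so that
# routine test runs don't depend on external sites being up:
#
#     ExUnit.start(exclude: [:integration])
#
# A tagged test then looks like this:
defmodule MirrorStatusTest do
  use ExUnit.Case

  @tag :integration
  test "fetches the live mirror status document" do
    assert {:ok, _body} = MirrorStatus.fetch_json()
  end
end
```

The integration tests can then be run explicitly with `mix test --include integration`.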
error: failed retrieving file 'python-pycparser-2.19-1-any.pkg.tar.xz' from X.X.X.X:7070 : The requested URL returned error: 500
This only appears on 0.1.6 not 0.1.5
Unsure of what caused the breaking change.
I have set my client to use my cpcache server as the mirror. From time to time, I get these 500 errors when downloading packages. Retrying is not a problem, but it can be annoying. Can the timeout be raised? I want to be able to customize it.
The current approach of using fallback mirrors (in case the json document could not be fetched) is brittle. The user would need to update the toml file once a mirror is left unmaintained, but some cpcache users are not even aware of the toml config file. Other options:
https://www.archlinux.org/mirrors/status/json/ cannot be reached and the user has no cached version of the JSON document.
First off, I love this project. Thanks a million for making it public. I build throwaway Arch VMs using Packer, and cpcache makes it super fast.
I use reflector to update my mirrorlist on my cpcache server. I select the 50 closest mirrors near me.
In instances where a package updates, such as i3-gaps yesterday, cpcache couldn't find the package and the client machine was getting 404 errors. I updated the mirrorlist of 50 on the cpcache server and restarted, but I still couldn't get the package.
I copied the mirrorlist to the client (that normally calls cpcache) and was able to get the package. This indicates that I don't really understand how cpcache works with mirrors.
For clients on my LAN, should I be able to have just a single mirrorlist entry that points to my cpcache server? If the cpcache server has its own mirrorlist, having more mirrors on the client seems redundant. Ideally, my clients have just the cpcache mirror, and I manage the mirrorlist on the cpcache server only.
Is this possible using cpcache, or am I misunderstanding how it manages its own mirrors?
Thanks again
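For reference, such a client setup is a single Server line in the client's mirrorlist, pointing at the cpcache host and port (7070, as in the logs above); the IP is a placeholder:

```ini
# /etc/pacman.d/mirrorlist on each client: one entry, the cpcache server.
Server = http://192.168.0.100:7070/$repo/os/$arch
```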
The file client_request.ex is rather large and violates the single responsibility principle. It includes HTTP web server functionality, e.g. with callbacks such as
def handle_info(
{:http, _, {:http_request, :GET, {:abs_path, "/"}, _}},
state = %CR{action: :recv_header}
) do
but also other application logic, e.g. with callbacks such as
def handle_cast({:filesize_increased, _},
state = %CR{waiting_for_no_dependencies_left: true}) do
Try to put the HTTP server functionality into a separate module. Write test cases for this module from the start, i.e., don't commit any additional functions unless there's a test case that covers the new function.
It seems that programs like curl and wget try IPv6 first:
wget -v 'http://localhost:7070/extra/os/x86_64/firefox-66.0.3-1-x86_64.pkg.tar.xz' -O /dev/null
--2019-04-23 22:28:40-- http://localhost:7070/extra/os/x86_64/firefox-66.0.3-1-x86_64.pkg.tar.xz
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:7070... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:7070... connected.
For testing, it would be useful to have a log that stores which files have been downloaded, and whether they have been served via HTTP, via cache, or both (first cache, then HTTP). This would make it possible to test not only that we receive the correct file (sha256 checksums match), but also to test that files are really served from cache when possible.
Is there an intention to eventually support ARM devices as clients? My host OS is from https://archlinuxarm.org/. It would really help when I'm running containers on it.
This is an interesting project.
Unfortunately, cpcache crashes the first time I run it. I am using the AUR package with an exact copy of your conf/cpcache.toml to keep it as default as possible for troubleshooting purposes. When starting the systemd service it crashes with:
May 04 14:13:16 dell cpcache[2685]: "$@" -- "${1+$ARGS}"
May 04 14:13:16 dell cpcache[2685]: 14:13:16.765 [info] Application cpcache exited: exited in: Cpc.start(:normal, [])
May 04 14:13:16 dell cpcache[2685]: ** (EXIT) an exception was raised:
May 04 14:13:16 dell cpcache[2685]: ** (UndefinedFunctionError) function Jerry.decode!/1 is undefined (module Jerry is not available)
May 04 14:13:16 dell cpcache[2685]: Jerry.decode!("cache_directory = \"/var/cache/cpcache\"\n\n# Bind to an IPv6 address, in addition to a>
May 04 14:13:16 dell cpcache[2685]: (cpcache) lib/cpc.ex:17: Cpc.init_config/0
May 04 14:13:16 dell cpcache[2685]: (cpcache) lib/cpc.ex:62: Cpc.start/2
May 04 14:13:16 dell cpcache[2685]: (kernel) application_master.erl:273: :application_master.start_it_old/4
May 04 14:13:18 dell cpcache[2685]: {"Kernel pid terminated",application_controller,"{application_start_failure,cpcache,{bad_return,{{'Elixir.Cpc',sta>
May 04 14:13:18 dell cpcache[2685]: Kernel pid terminated (application_controller) ({application_start_failure,cpcache,{bad_return,{{'Elixir.Cpc',star>
May 04 14:13:18 dell cpcache[2685]: [1B blob data]
May 04 14:13:18 dell cpcache[2685]: Crash dump is being written to: erl_crash.dump...
May 04 14:13:18 dell systemd[1]: cpcache.service: Main process exited, code=exited, status=1/FAILURE
May 04 14:13:18 dell systemd[1]: cpcache.service: Failed with result 'exit-code'.
It seems to create the directory structure okay:
ls -al /var/cache/cpcache
total 20
drwxr-xr-x 5 cpcache cpcache 4096 May 4 14:12 .
drwxr-xr-x 10 root root 4096 May 4 14:12 ..
drwxr-xr-x 6 cpcache cpcache 4096 May 4 14:12 arm
drwxr-xr-x 2 cpcache cpcache 4096 May 4 14:12 mnesia
drwxr-xr-x 7 cpcache cpcache 4096 May 4 14:12 x86