tuna / tunasync-scripts
Custom scripts for mirror jobs
Exception handling in bash is extremely difficult and inconsistent, so keep calm and rewrite the scripts in Python.
To mirror Lean4, the task is split into:
Init scripts: elan-init.sh, elan-init.ps1
The mirror can change the variable ELAN_UPDATE_ROOT or ElanRoot to the mirrored one. The request URL structure is exactly that of a GitHub release.
In the Elan repo, src/elan-dist/src/manifestation.rs and src/elan-dist/src/dist.rs should accept a custom URL from the config, as rustup has done. (See src/config.rs.)
elan: read env vars
It would be better to require directly from the tuna mirror. Some recursive modification should happen automatically.
See https://github.com/leanprover-community/mathlib4#building-html-documentation
The Mathlib4 cache is stored in Azure blob storage. It can be replaced by an Azure-compatible server.
cache: read env var
I have drafted some checkboxes above as an initial plan for mirroring the Lean4 ecosystem. It would be a great help if Tuna is willing to mirror the Lean4 ecosystem!
It would be better if someone more familiar with the Tuna mirror system could help. If no one is available, I can do most of the above work once I have learned how to debug and test the Tuna mirror system. I have basic Lean4 and general programming skills, and I think I can do the programming tasks on both sides, Tuna and the Lean4 ecosystem...
cloud/pytorch is missing the osx-arm64 directory (https://conda.anaconda.org/pytorch/osx-arm64/)
msys2.sh uses the lftp command, which makes the timestamps of local files differ from those of the files on the server. I don't know how to make the timestamps match; maybe my lftp configuration file is different. I need some help and tips.
My ~/.condarc configuration is as follows:
--------------------------------------
channels:
show_channel_urls: true
default_channels:
custom_channels:
conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
intel: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
--------------------------------------
A 404 error occurs:
Collecting package metadata (current_repodata.json): failed
UnavailableInvalidChannel: The channel is not accessible or is invalid.
channel name: intel
channel url: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/intel
error code: 404
You will need to adjust your conda configuration to proceed.
Use conda config --show channels to view your configuration's current state,
and use conda config --show-sources to view config file locations.
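AdoptOpenJDK currently appears to be synced from the v2 API, while the official documentation already recommends migrating to v3 as soon as possible: https://api.adoptopenjdk.net/README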
https://github.com/tuna/tunasync-scripts/blob/master/hackage.sh
This appeared after the update to cabal 2.0.0.0:
$ cabal update -v3
no user package environment file found at /Users/eccstartup
Trying to locate mirrors via DNS for initial bootstrap of secure repository
'http://hackage.haskell.org/' ...
Searching for nslookup in path.
Found nslookup at /usr/bin/nslookup
/usr/bin/nslookup '-query=TXT' _mirrors.hackage.haskell.org
located 2 mirrors for http://hackage.haskell.org/ :
- http://hackage.fpcomplete.com/
- http://objects-us-west-1.dream.io/hackage-mirror/
Selected mirror http://hackage.haskell.org/
Downloading root
Searching for curl in path.
Found curl at /usr/bin/curl
Searching for powershell in path.
Cannot find powershell on the path
Searching for wget in path.
Found wget at /usr/local/bin/wget
Selected http transport implementation: curl
/usr/bin/curl 'http://hackage.haskell.org/root.json' --output /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/transportAdapterGet24956-1 --location --write-out '%{http_code}' --user-agent 'cabal-install/2.0.0.0 (osx; x86_64)' --silent --show-error --dump-header /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/curl-headers24956-2.txt --header 'Cache-Control: no-transform'
Downloading the latest package list from hackage.haskell.org
Selected mirror http://hackage.haskell.org/
Downloading timestamp
/usr/bin/curl 'http://hackage.haskell.org/timestamp.json' --output /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/transportAdapterGet24956-4 --location --write-out '%{http_code}' --user-agent 'cabal-install/2.0.0.0 (osx; x86_64)' --silent --show-error --dump-header /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/curl-headers24956-5.txt --header 'Cache-Control: no-transform'
Downloading snapshot
/usr/bin/curl 'http://hackage.haskell.org/snapshot.json' --output /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/transportAdapterGet24956-7 --location --write-out '%{http_code}' --user-agent 'cabal-install/2.0.0.0 (osx; x86_64)' --silent --show-error --dump-header /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/curl-headers24956-8.txt --header 'Cache-Control: no-transform'
Downloading mirrors
/usr/bin/curl 'http://hackage.haskell.org/mirrors.json' --output /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/transportAdapterGet24956-10 --location --write-out '%{http_code}' --user-agent 'cabal-install/2.0.0.0 (osx; x86_64)' --silent --show-error --dump-header /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/curl-headers24956-11.txt --header 'Cache-Control: no-transform'
Cannot update index (no local copy)
Downloading index
/usr/bin/curl 'http://hackage.haskell.org/01-index.tar.gz' --output /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/transportAdapterGet24956-13 --location --write-out '%{http_code}' --user-agent 'cabal-install/2.0.0.0 (osx; x86_64)' --silent --show-error --dump-header /var/folders/tb/wpytxqpx111fsxk0tg1zsmgm0000gn/T/curl-headers24956-14.txt --header 'Cache-Control: no-transform'
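As the log shows, new files are now requested.
Note that the last one starts with 01-, not 00-.
No other problems noticed so far; https://github.com/tuna/tunasync-scripts/blob/master/stackage.py does not appear to be significantly affected for now.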
Currently, deb_14.x/pool/main/n/nodejs/nodejs_14.8.0-deb-1nodesource1_amd64.deb
exists on the upstream server. However, it does not show up in https://deb.nodesource.com/node_14.x/pool/main/n/nodejs/
tunasync-scripts/nodesource.sh
Lines 3 to 20 in d5f2142
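USTC's ubuntu-old-release mirror is down, the official one is unbearably slow, and Aliyun does not provide rsync service. Please add a script that uses wget to download from the Aliyun mirror.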
HEAD
reference with remote
I have tried https://mirrors.tuna.tsinghua.edu.cn/github-release/hovancik/stretchly but unfortunately got a 404.
So how does this work? How can I use it? Is there a guide?
createrepo_c is the C re-implementation of createrepo.
createrepo itself is written in Python. When the --update option is used, all nodes are read into RAM, which suffers from the inefficiency of Python's object storage. This is the reason for the high memory usage of the genpkgmetadata process, affecting mysql, grafana, influxdata, kubernetes and other yum-sync based repos.
createrepo_c was first included in Debian bullseye, replacing the createrepo package in buster.
Line 135 in a57f978
Line 150 in a57f978
Line 50 in a57f978
Line 104 in a57f978
https://2.python-requests.org//zh_CN/latest/user/advanced.html#timeout
Line 216 in d5f2142
[2020-08-13 22:36:59,004] [INFO] Syncing installers...
[2020-08-13 22:36:59,005] [INFO] Start syncing https://repo.continuum.io/archive
[2020-08-13 22:37:01,967] [ERROR] Failed to sync installers of archive
Traceback (most recent call last):
  File "/home/scripts/anaconda.py", line 270, in main
    sync_installer(remote_url, local_dir)
  File "/home/scripts/anaconda.py", line 216, in sync_installer
    remote_filesize = int(r.headers['content-length'])
  File "/usr/local/lib/python3.7/dist-packages/requests/structures.py", line 54, in __getitem__
    return self._store[key.lower()][1]
KeyError: 'content-length'
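The traceback above comes from indexing the response headers directly, which fails when the server omits Content-Length. One possible way to tolerate that is sketched below. It is not the actual anaconda.py code: the helper names `remote_size` and `needs_download` are hypothetical, and the headers are assumed to be a mapping with lower-case keys (the requests library's header object is case-insensitive, a plain dict is not).

```python
from typing import Mapping, Optional


def remote_size(headers: Mapping[str, str]) -> Optional[int]:
    """Return the size announced by Content-Length, or None if absent or malformed."""
    value = headers.get("content-length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def needs_download(headers: Mapping[str, str], local_size: Optional[int]) -> bool:
    """Fetch again when the file is missing locally or the sizes cannot be proven equal."""
    size = remote_size(headers)
    if local_size is None or size is None:
        # Without a trustworthy remote size, re-downloading is the safe choice.
        return True
    return size != local_size
```

With this shape, a missing header leads to a re-download instead of aborting the whole installer sync. Whether re-downloading is acceptable for large installers is a design choice left open here.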
ref: tuna/issues#915
According to the tunasync code, the upstream is passed in through the environment variable TUNASYNC_UPSTREAM_URL. pypi.sh currently uses the variable TUNASYNC_UPSTREAM.
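To illustrate the mismatch between the two variable names, here is a minimal sketch of a lookup that prefers TUNASYNC_UPSTREAM_URL and falls back to TUNASYNC_UPSTREAM. pypi.sh itself is a shell script, so this is only an illustration of the lookup order. The function name `upstream_url` and its `default` parameter are hypothetical.

```python
from typing import Mapping, Optional


def upstream_url(env: Mapping[str, str], default: Optional[str] = None) -> Optional[str]:
    """Pick the upstream URL from the environment.

    TUNASYNC_UPSTREAM_URL is the name mentioned as being set by tunasync;
    TUNASYNC_UPSTREAM is checked second for backward compatibility.
    """
    for name in ("TUNASYNC_UPSTREAM_URL", "TUNASYNC_UPSTREAM"):
        value = env.get(name)
        if value:  # skip unset and empty values alike
            return value
    return default
```

In a real script the mapping would typically be `os.environ`. Passing it as a parameter keeps the function easy to test.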
Could the nvidia channel be added?
https://anaconda.org/nvidia/repo
Should we add the nvidia channel?
Are there many dependency packages in the nvidia repo?
Is there a chance for cudf (GitHub: https://github.com/rapidsai/cudf) to be added to the Tsinghua mirror?
https://github.com/tuna/tunasync-scripts/blob/master/lxc-images.sh
The following files should be moved to the correct place AFTER all the files in the images folder are downloaded.
index-system index-system.asc index-user index-user.asc
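The ordering described above (images first, index files last) could look roughly like the sketch below. It is not how lxc-images.sh is implemented. The function name `publish_indexes`, the staging and destination directories, and the `images_complete` flag are all assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import List

# The four index files named in the issue.
INDEX_FILES = ("index-system", "index-system.asc", "index-user", "index-user.asc")


def publish_indexes(staging_dir: Path, meta_dir: Path, images_complete: bool) -> List[str]:
    """Move index files from staging to their public location, but only
    once every file under the images folder has been downloaded."""
    if not images_complete:
        # Keep serving the old indexes so clients never see references
        # to images that are not on the mirror yet.
        return []
    meta_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in INDEX_FILES:
        src = staging_dir / name
        if src.is_file():
            shutil.move(str(src), str(meta_dir / name))
            moved.append(name)
    return moved
```

The point of the staging directory is that a failed or interrupted image download leaves the published indexes untouched.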
See also: tuna/issues#558
For example: https://mirrors.tuna.tsinghua.edu.cn/ohmyzsh.git/
Assuming it is also synced with git.sh, how is it served after syncing?
Thanks.
A large amount of read IO can be observed while running nix-channels.py. Some performance improvements could perhaps be implemented to reduce server load.
Suppose Size comes after Checksum.
tunasync-scripts/helpers/apt-download
Lines 133 to 134 in eb8be5e
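I recently rewrote the Julia mirror code and the way it is invoked has changed somewhat, so an update is needed.
Updates:
Removed the need for /clones and /registries, and removed the dependency on git.
A cache (in /julia/static/.cache) avoids unnecessary CPU and IO overhead during incremental syncs.
Failed resources are recorded in /julia/static/failed_resources.txt; incremental syncs within 24 hours skip the resources recorded there, which greatly speeds up incremental syncs.
The /julia/clones and /julia/registries folders are no longer needed, so in theory /julia/static could be mounted at /julia, but I'm not sure whether this can be done in a compatible way.
@z4yx I'm not sure how this should be integrated into the tunasync scripts, so I may need your help; #81 gives a reference.
There are currently two upstream servers: https://kr.storage.juliahub.com and https://us-east.storage.juliahub.com. Both can be added, or just one. The kr (Seoul, South Korea) server uses optimized build code, so its latency for syncing from the GitHub registry is lower, while us-east has a delay of roughly 30-60 minutes. (Ref: JuliaRegistries/General#16777 (comment))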
What is the idea behind gen_ubuntu_exclude? Is exclude_file matched with endswith, or as a regex against the whole URL path?
Current GC logic deletes dead narinfo files along with the nar files they refer to. But live narinfo files can still refer to these deleted nar files, because multiple narinfo files may have the same URL.
tunasync-scripts/nix-channels.py
Lines 429 to 438 in a47eb03
Narinfo file example:
# github:nixos/nixpkgs/e10da1c7f542515b609f8dfbcf788f3d85b14936#element-web
$ curl https://cache.nixos.org/hdz90ld1wwj6nwp580avv02v62cjh7h3.narinfo
StorePath: /nix/store/hdz90ld1wwj6nwp580avv02v62cjh7h3-element-web-1.10.10
URL: nar/0ji0p1g4fjwpqmf33maf7irznb2wclzi483hicyyjifybh43qxrs.nar.xz
Compression: xz
FileHash: sha256:0ji0p1g4fjwpqmf33maf7irznb2wclzi483hicyyjifybh43qxrs
FileSize: 13688456
NarHash: sha256:0r3pdgcnrlvf56cxlgxdsai5f1k9pf7cq80zssrbfbabrasbkk2v
NarSize: 43380024
References:
Deriver: af669vfa956f7znxarma0rf1883nxy8k-element-web-1.10.10.drv
Sig: cache.nixos.org-1:T0b6vIGHgFC2E9SspFsf/MMubYixGGYna/JOssawEaQh4jdoyJNAkRNN9pJDFJDTPKER7DCXAVXE5zgzYedPBw==
# github:nixos/nixpkgs/c30945a93fbd3122a55ee6a63c9bfef7556bc82e#element-web
$ curl https://cache.nixos.org/zgwkj12lfii1ii041497bxm8rzcx23sd.narinfo
StorePath: /nix/store/zgwkj12lfii1ii041497bxm8rzcx23sd-element-web-1.10.10
URL: nar/0ji0p1g4fjwpqmf33maf7irznb2wclzi483hicyyjifybh43qxrs.nar.xz
Compression: xz
FileHash: sha256:0ji0p1g4fjwpqmf33maf7irznb2wclzi483hicyyjifybh43qxrs
FileSize: 13688456
NarHash: sha256:0r3pdgcnrlvf56cxlgxdsai5f1k9pf7cq80zssrbfbabrasbkk2v
NarSize: 43380024
References:
Deriver: 51mif1gp7i3igq8d9a9aff073qn9drd4-element-web-1.10.10.drv
Sig: cache.nixos.org-1:kAT7/4P/GNm0kmFoTss6klYET4ZdYjNhs7OW3q3jDIQMXZqiTdXyZ8uLW0k2eiyuusgdBOSHzvZpcRjxWUFXBw==
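The two narinfo files above share one nar URL, so a nar is only safe to delete when no live narinfo still points at it. A minimal sketch of that rule follows. It is not the code in nix-channels.py. The function name `plan_gc` and the in-memory mapping from narinfo name to nar URL are assumptions for illustration.

```python
from typing import Dict, Set, Tuple


def plan_gc(narinfo_to_nar: Dict[str, str], live_narinfos: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (narinfo files to delete, nar files to delete).

    narinfo_to_nar maps each narinfo file name to the value of its URL field.
    """
    dead_narinfos = set(narinfo_to_nar) - live_narinfos
    # Nars still referenced by at least one live narinfo must be kept.
    live_nars = {narinfo_to_nar[n] for n in live_narinfos if n in narinfo_to_nar}
    dead_nars = {narinfo_to_nar[n] for n in dead_narinfos} - live_nars
    return dead_narinfos, dead_nars
```

Applied to the example, if only zgwkj12lfii1ii041497bxm8rzcx23sd.narinfo is live, the other narinfo is removed but the shared nar/0ji0p1g4...nar.xz archive stays.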
Names of nar archives on https://cache.nixos.org are their base32 hashes. For example:
$ nix hash file <(curl https://cache.nixos.org/nar/1jh8kd7ql2v73fdspp1v08hvfdfxjvl2i86g409ckgpy8k6l41g8.nar.xz) | \
xargs nix hash to-base32
1jh8kd7ql2v73fdspp1v08hvfdfxjvl2i86g409ckgpy8k6l41g8
Oct 27 20:47:12 ab497fe801fb ftpsync-debian[7]: Mirrorsync start
Oct 27 20:47:12 ab497fe801fb ftpsync-debian[7]: Running mirrorsync, update is required, /data/mirrors/debian//Archive-Update-Required-nanomirrors.tuna.tsinghua.edu.cn exists
tail: tail: cannot open '/home/log/tunasync/ftpsync/rsync-ftpsync-debian.log' for reading: No such file or directorycannot open '/home/log/tunasync/ftpsync/rsync-ftpsync-debian.error' for reading
: No such file or directorytail:
no files remaining
tail: no files remaining
/home/bin/ftpsync-wrapper.sh: line 1: kill: (30) - No such process
Configuration:
#/etc/tunasync/mirrors.conf.d/boost.conf
[[mirrors]]
name = "boost.git"
provider = "command"
command = "/home/tunasync-scripts/git-recursive.sh"
upstream = "https://github.com/boostorg/boost.git"
docker_image = "tunathu/tunasync-scripts:latest"
size_pattern = "size-pack: ([0-9\\.]+[KMGTP])"
[mirrors.env]
MIRROR_BASE_URL="http://mirror.example.com/"
WORKING_DIR_BASE="/srv/git-mirror/"
GENERATED_SCRIPT="/srv/git-mirror/boost-git.sh"
RECURSIVE="1"
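The worker log reports the mirror job as Success, but on inspecting the content, only the repository specified in the URL was actually synced. The other repositories in its submodules were not synced correctly.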
Line 5 in d5f2142
ref: tuna/issues#910
The last-modified time of py3/redhat/8/x86_64/latest/repodata/repomd.xml on our server is May 4, whereas the headers of the file in question say Aug 14.
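rsync can do incremental comparison when syncing the CentOS mirror, but docker-ce.py has no incremental comparison: every run compares every file again from the beginning (existing files are reported as Skipping), which takes a very long time, and if the connection is interrupted for some reason it has to start over from scratch.
Is this the expected behavior? Is there a better solution? Thanks.
As the title says: apt-sync can basically already do this, but yum-sync can only download packages of a specific component.
Expected result: yum-sync can pull down and store the entire corresponding directory. For example, pulling https://mirrors.tuna.tsinghua.edu.cn/centos/7/os/x86_64/Packages/ in a way similar to apt-sync's binary-amd64 format.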
==> Downloading https://mirrors.tuna.tsinghua.edu.cn/linuxbrew-bottles/bottles/openssl%401.1-1.1.1h.x86_64_linux.bottle.tar.gz
curl: (22) The requested URL returned error: 404
Error: Failed to download resource "openssl@1.1"
Download failed: https://mirrors.tuna.tsinghua.edu.cn/linuxbrew-bottles/bottles/openssl%401.1-1.1.1h.x86_64_linux.bottle.tar.gz
But linuxbrew.bintray.com is ok
==> Downloading https://linuxbrew.bintray.com/bottles/openssl%401.1-1.1.1h.x86_64_linux.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/61/61bf82b4b62e07589dec1fdc9eb9eb053da01565581fb18dd5df6232182238ec?__gda__=exp=1606235004~hmac=e668bdb2312764a6996a5146431a77ac0b5acea782fe86812e34fefc47c74687&re
######################################################################## 100.0%
The current Hackage package index is 01-index.tar.gz
tunasync-scripts/helpers/apt-download
Line 153 in eb8be5e
tunasync-scripts/virtualbox.sh
Line 91 in eb8be5e
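Since I don't know how the arguments are passed in, I currently have the host invoke the command, which then calls docker from within the shell. It is still fetching, but I'm not sure whether size-sum will be lost.
I don't understand this passing mechanism.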
#!/bin/bash
# requires: docker
set -e
set -o pipefail
# Directory containing this script
_here=$(dirname "$(realpath "$0")")
BASE_PATH="${TUNASYNC_WORKING_DIR}"
BASE_URL=${TUNASYNC_UPSTREAM_URL:-"http://update.cs2c.com.cn:8080"}
export REPO_SIZE_FILE=/tmp/reposize.$RANDOM
# Run the sync script inside the container; its path must match the /home/scripts/ mount
docker run --rm \
    -v "$BASE_PATH":/mirrors/cs2c \
    -v /home/scripts/:/home/scripts/ \
    tunathu/tunasync-scripts \
    /bin/bash /home/scripts/cs2c_ns.sh "$BASE_URL"
echo "YUM finished"
"${_here}/helpers/size-sum.sh" "$REPO_SIZE_FILE" --rm
Bullseye (Debian 11 - next stable) - Last update : Wed, 18 Aug 2021 01:33:06 UTC / Revision: 20210817071721+198e6771e24f
deb http://apt.llvm.org/bullseye/ llvm-toolchain-bullseye main
deb-src http://apt.llvm.org/bullseye/ llvm-toolchain-bullseye main
deb http://apt.llvm.org/bullseye/ llvm-toolchain-bullseye-12 main
deb-src http://apt.llvm.org/bullseye/ llvm-toolchain-bullseye-12 main
deb http://apt.llvm.org/bullseye/ llvm-toolchain-bullseye-13 main
deb-src http://apt.llvm.org/bullseye/ llvm-toolchain-bullseye-13 main
Hello, I cloned tunasync-scripts locally with gh, and I only want to sync some of the repos using the bash files inside the scripts tarball. For example, I want to use proxmox.sh to rsync the proxmox repos to my local OS. How should I set the variable TUNASYNC_WORKING_DIR in the script to a custom path? Is there any conf file to define TUNASYNC_WORKING_DIR in a central place? Currently I define it in the proxmox.sh script, but I don't want to change the structure of the scripts pool inside the tunasync-scripts folder.
PVE 7.0 is based on bullseye, but the mirror currently only syncs up to buster. Please add syncing for bullseye.
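Judging from the logs, when the Python script that enumerates files hits an exception and cannot finish normally, the enumerated directory list is incomplete, and the directories and files not yet enumerated are subsequently deleted by the sync script.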
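One way to avoid deleting files after a partial listing is to compute the stale set only when the enumeration finished without error. The sketch below shows that guard under stated assumptions. The function name `stale_files` and the callable `enumerate_remote` are hypothetical and do not correspond to any existing script in this repository.

```python
from typing import Callable, Iterable, Optional, Set


def stale_files(local_files: Set[str],
                enumerate_remote: Callable[[], Iterable[str]]) -> Optional[Set[str]]:
    """Return the local files that are absent upstream, or None when the
    remote listing could not be completed and nothing should be deleted."""
    try:
        # Consume the whole listing first; a failure halfway through
        # must not leave us with a truncated set.
        remote = set(enumerate_remote())
    except Exception:
        return None
    return local_files - remote
```

A caller would skip the cleanup step entirely when the result is None and retry on the next sync run.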
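Some mirrors have been found to update quickly and contain many small files, so old-version files cannot be cleaned up manually. A feature to scan for and clean up such files is needed.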
Ref:
According to https://wiki.debian.org/RepositoryFormat#A.22Contents.22_indices, Contents files are located below dists/$DIST/$COMP/ after Debian wheezy.
A question: pub.sh involves the command /pub-cache/bin/pub_mirror. Where does it come from?
pub.sh: line 8: /pub-cache/bin/pub_mirror: No such file or directory
When I run aosp.sh, the command 'git repack -a -b -d' gives an error like this:
git repack -a -b -d error: unknown switch `b'
I think my git (version 1.8.3.1) may be newer or older than yours, so the '-b' option is unavailable.
Why not use 'git gc'? Do you have any other considerations?
A sync tool should be developed to support mirror requests such as:
tuna/issues#549
tuna/issues#335 (comment)
The release list can be fetched from Github API: https://api.github.com/repos/VSCodium/vscodium/releases
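As a rough sketch of what such a tool might do with the API response, the function below walks a decoded release list and yields the downloadable assets. The field names (`tag_name`, `draft`, `prerelease`, `assets`, `name`, `browser_download_url`) are those of the GitHub releases API. The function name `iter_assets` and the policy of skipping drafts and pre-releases are assumptions, not a description of any existing script.

```python
from typing import Iterator, List, Tuple


def iter_assets(releases: List[dict],
                include_prerelease: bool = False) -> Iterator[Tuple[str, str, str]]:
    """Yield (tag, file name, download URL) for every asset worth mirroring.

    `releases` is the decoded JSON list returned by
    https://api.github.com/repos/<owner>/<repo>/releases.
    """
    for release in releases:
        if release.get("draft"):
            continue
        if release.get("prerelease") and not include_prerelease:
            continue
        for asset in release.get("assets", []):
            yield release["tag_name"], asset["name"], asset["browser_download_url"]
```

Fetching the JSON and downloading each URL is left out on purpose. A real tool would also need pagination and rate-limit handling.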