hackage-mirror-tool's Introduction

Hackage mirroring tool

This is a simple tool for mirroring Hackage to S3-compatible object stores (e.g. Dreamhost or AWS).

See also hackage-mirror-tool --help.

Resource requirements

Currently, using this tool to operate a http://hackage.haskell.org mirror has the following requirements:

  • ~1 GiB of local filesystem storage (used by the local 01-index.tar cache)
  • ~10 GiB of storage in the S3 bucket (at the time of writing ~7.1 GiB were needed; this size increases monotonically over time)
  • A single-threaded hackage-mirror-tool run needs less than ~256 MiB of RAM; in other words, a small VM with 512 MiB of RAM suffices.
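
The local storage requirement above can be checked before a run. The following is a minimal sketch, not part of hackage-mirror-tool: the function names free_kib and has_room and the WORKDIR variable are hypothetical, and it assumes a POSIX df and awk are available.

```shell
#!/bin/bash
# Hypothetical pre-flight check; not part of hackage-mirror-tool.

# Print the free space (in KiB) of the filesystem holding directory $1.
# -P forces the one-line-per-filesystem POSIX format, -k reports KiB.
free_kib() {
  df -Pk "$1" | awk 'NR==2 { print $4 }'
}

# Succeed if directory $1 has at least $2 KiB available.
has_room() {
  [ "$(free_kib "$1")" -ge "$2" ]
}

# ~1 GiB for the local 01-index.tar cache = 1048576 KiB.
# WORKDIR is an assumed variable; it defaults to the current directory.
workdir=${WORKDIR:-.}
if has_room "$workdir" 1048576; then
  echo "enough local storage in ${workdir}"
else
  echo "warning: less than 1 GiB free in ${workdir}" >&2
fi
```

The check only covers local disk space; the S3 bucket size and RAM limits would have to be verified separately.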

Example usages

cronjob-based

This is a simple example for how to set up a cronjob-based mirror job, which is triggered every 3 minutes.

Create the following crontab(5) entry:

*/3 * * * *  ${HOME}/bin/run_mirror_job.sh

The ${HOME}/bin/run_mirror_job.sh script contains:

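#!/bin/bash

# Create the working and log directories, then run from the working directory.
mkdir -p "${HOME}/workdir/logs"
cd "${HOME}/workdir/" || exit 1

# The S3 credentials are passed via the environment.
# timeout sends SIGTERM after 170 s and SIGKILL 5 s later.
# The RTS flags print one-line GC statistics (-t), set a 2 MiB allocation
# area (-A2M), and cap the heap at 256 MiB (-M256M).
# stdout and stderr are appended to a per-day log file.
S3_ACCESS_KEY="ASJKDS..." \
S3_SECRET_KEY="asdjhakjsdhadhadjhaljkdh..." \
timeout -k5 170 "${HOME}/bin/hackage-mirror-tool" +RTS -t -A2M -M256M -RTS \
  --hackage-url      http://hackage.haskell.org \
  --hackage-pkg-url  http://hackage.haskell.org/package/ \
  --s3-base-url      https://s3.amazonaws.com \
  --s3-bucket-id     my-hackage-mirror \
  &>> "${HOME}/workdir/logs/$(date -I).log"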

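The timeout -k5 170 arguments ensure that the current job is killed before the next cronjob starts: timeout sends SIGTERM after 170 seconds and SIGKILL 5 seconds later, so a run ends within 175 seconds, inside the 3-minute (180-second) cron interval.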
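
If the cron interval or the timeout values are changed, the same budget can be re-checked with a little arithmetic. This is a minimal sketch; the helper name fits_in_interval is hypothetical and not part of the tool. It assumes the worst case is the time limit plus the kill grace period passed to timeout.

```shell
#!/bin/bash
# Hypothetical helper; not part of hackage-mirror-tool.
# Succeeds if a job run under `timeout -k GRACE LIMIT` is guaranteed to
# end before the next cron tick, INTERVAL seconds after it started.
fits_in_interval() {
  local limit=$1 grace=$2 interval=$3
  # Worst case: SIGTERM after LIMIT seconds, SIGKILL GRACE seconds later.
  local worst=$(( limit + grace ))
  [ "$worst" -lt "$interval" ]
}

# Values from the example: timeout -k5 170, cron every 3 minutes (180 s).
if fits_in_interval 170 5 180; then
  echo "ok: worst case 175 s < 180 s"
else
  echo "too long: job may overlap the next run"
fi
```

For a 5-minute schedule, for example, the call would be fits_in_interval LIMIT GRACE 300 with whatever limits are chosen.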

Sample AWS access policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "bucketlevel",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::hackage-mirror-tool"
            ]
        },
        {
            "Sid": "objectlevel",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::hackage-mirror-tool/*"
            ]
        }
    ]
}
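
Note that the sample policy names the bucket hackage-mirror-tool, while the cronjob example passes my-hackage-mirror as --s3-bucket-id; the ARNs in the policy have to name the bucket that is actually used. One way to keep the two in sync is to generate the policy from the bucket name. The following is a minimal sketch; the function name render_policy is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper; not part of hackage-mirror-tool.
# Prints the sample access policy with both ARNs set to bucket $1.
render_policy() {
  local bucket=$1
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "bucketlevel",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::${bucket}"
            ]
        },
        {
            "Sid": "objectlevel",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::${bucket}/*"
            ]
        }
    ]
}
EOF
}

# Bucket name taken from the cronjob example.
render_policy my-hackage-mirror
```

The output can be redirected to a file and attached to the IAM user whose keys are passed as S3_ACCESS_KEY and S3_SECRET_KEY.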

hackage-mirror-tool's People

Contributors

hvr, snoyberg

hackage-mirror-tool's Issues

Irrefutable pattern match

2016-09-19 15:17:40.583Z [   15] *INFO* Downloading timestamp
2016-09-19 15:17:40.707Z [   15] *INFO* Downloading snapshot
2016-09-19 15:17:40.837Z [   15] *INFO* Downloading mirrors
2016-09-19 15:17:40.956Z [   15] *WARNING* Cannot update index (no local copy)
2016-09-19 15:17:40.956Z [   15] *INFO* Downloading index
2016-09-19 15:18:38.458Z [   15] *INFO* index changed
2016-09-19 15:18:38.530Z [   15] *INFO* fetching meta-data file objects from S3...
2016-09-19 15:18:39.202Z [   15] *CRITICAL* exception: src/SimpleS3.hs:137:9-65: Irrefutable pattern failed for pattern Just conts

CRITICAL exception: fixupIdx "gogol-admin-reports-0.2.0.tar.gz"

Full output from the run, not sure what it means:

2017-02-13 13:36:16.718Z [    7] *INFO* Selected mirror http://104.130.241.19
2017-02-13 13:36:16.720Z [    7] *INFO* Downloading timestamp
2017-02-13 13:36:16.837Z [    7] *INFO* Downloading snapshot
2017-02-13 13:36:16.951Z [    7] *INFO* Downloading mirrors
2017-02-13 13:36:17.062Z [    7] *WARNING* Cannot update index (no local copy)
2017-02-13 13:36:17.062Z [    7] *INFO* Downloading index
2017-02-13 13:36:34.820Z [    7] *INFO* index changed
2017-02-13 13:36:34.820Z [    7] *INFO* fetching meta-data file objects from S3...
2017-02-13 13:36:45.894Z [    7] *CRITICAL* exception: fixupIdx "gogol-admin-reports-0.2.0.tar.gz"
CallStack (from HasCallStack):
  error, called at src/IndexShaSum.hs:158:16 in main:IndexShaSum
<<ghc: 9288958256 bytes, 4045 GCs, 27340237/82106784 avg/max bytes residency (13 samples), 193M in use, 0.000 INIT (0.000 elapsed), 14.432 MUT (27.941 elapsed), 1.168 GC (1.372 elapsed) :ghc>>

Any insights?
