clojars-poms

A small tool that began as a way to explore the dependencies between projects deployed to Clojars, and has since grown into a more general tool for analysing the POMs of those projects.

Installation

For now the code isn't deployed anywhere, so the best way to use it is to clone this repo and take a look at the source.

If you've installed the Clojure CLI tools, you can run the sample script and be dropped into a REPL via:

$ clojure -i repl-init.clj -r

Note: the repl-init.clj script uses the spinner library, which isn't compatible with the clj command line script.

The first time this script is run it will pull down all POMs from clojars.org and cache them locally, which can take an hour or more depending on your network connection. As of mid 2023, this is ~265,000 POM files (and the same number of metadata files for caching purposes) totalling ~2.4GB. Subsequent runs are much faster (especially if prevent-sync is set to true!), because the script uses ETag-based conditional requests to Clojars to pull only what's new or modified.

Please limit how often you re-sync poms from Clojars! They provide a wonderful service to the Clojure community for free, but someone is paying for their bandwidth and those folks deserve our respect!

Look at repl-init.clj for more details on what the script sets up and how you can experiment with this data.

Developer Information

GitHub project

Bug Tracker

License

Copyright © 2019 Peter Monks

Distributed under the Apache License, Version 2.0.

SPDX-License-Identifier: Apache-2.0

clojars-poms's Issues

Re-implement POM syncing logic

In 2019, Clojars dropped support for rsync, breaking this code. It looks as though the only alternative is to use the Clojars API that lists all POMs, then pull down every single POM file via many (!) individual HTTP requests (a classic N+1 problem).

Efficient(ish) "delta syncing" on subsequent requests can be achieved by storing ETags alongside each POM file, and then using If-None-Match headers on subsequent requests. While not as efficient overall as the rsync approach (it doesn't address the overarching N+1 problem), it should still substantially save on bandwidth.
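
The delta-syncing idea described above can be sketched roughly as follows. This is an illustrative Python sketch using only the standard library, not the project's actual Clojure code. The ".etag" sidecar file naming and the function names (etag_path, conditional_headers, store, sync_pom) are assumptions made for the example.

```python
import urllib.error
import urllib.request
from pathlib import Path


def etag_path(pom_path: Path) -> Path:
    """Hypothetical sidecar file that holds the ETag of a cached POM."""
    return pom_path.with_name(pom_path.name + ".etag")


def conditional_headers(pom_path: Path) -> dict:
    """Build an If-None-Match header when both the POM and its ETag are cached."""
    sidecar = etag_path(pom_path)
    if pom_path.exists() and sidecar.exists():
        return {"If-None-Match": sidecar.read_text(encoding="utf-8").strip()}
    return {}


def store(pom_path: Path, body: bytes, etag) -> None:
    """Write the POM, and keep the sidecar in step with the server's ETag."""
    pom_path.parent.mkdir(parents=True, exist_ok=True)
    pom_path.write_bytes(body)
    if etag:
        etag_path(pom_path).write_text(etag, encoding="utf-8")
    else:
        # No ETag supplied: drop any stale one so the next sync re-downloads.
        etag_path(pom_path).unlink(missing_ok=True)


def sync_pom(url: str, pom_path: Path) -> bool:
    """Fetch one POM. Returns True if downloaded, False if the cache is current."""
    request = urllib.request.Request(url, headers=conditional_headers(pom_path))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            store(pom_path, response.read(), response.headers.get("ETag"))
            return True
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: the cached copy is still valid
            return False
        raise
```

Each POM still costs one HTTP request, so the N+1 problem remains, but a 304 Not Modified response carries no body, which is where the bandwidth saving comes from.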
