repology / repology-updater
Repology backend service to update repository and package data
Home Page: https://repology.org
License: GNU General Public License v3.0
Need a way to make specific repositories not produce lonely packages. This will allow incompatible repositories which have too many packages which do not match with other repos to still take part in comparison. Useful for OpenSUSE as long as it's based on binary package lists and non-unix repositories like F-Droid and Chocolatey which have too many non-portable projects.
Some repos include patches, it would be nice to report them upstream.
Code is there, need an easy way to download all .spec files. Todo: contact Sisyphus guys
Keep transformed name in the process, pass it from rule to rule; pass terminates (rename to stop)
Since repositories are independent, it should be easy to fetch and parse them in parallel. This will be really useful for slow repositories such as pkgsrc (index generation) and Fedora (slow fetching).
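A minimal sketch of the idea, assuming each repository can be handled by one independent function. The names `fetch_and_parse` and `update_all` are hypothetical and are not taken from the Repology code base.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_and_parse(repo_name):
    # Hypothetical placeholder for the real per-repository work
    # (download the index, then parse it).
    return f"{repo_name}: done"


def update_all(repo_names, worker=fetch_and_parse, max_workers=4):
    # Repositories are independent, so each one can run in its own thread;
    # a slow repository no longer holds up the others.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as repo_names.
        return dict(zip(repo_names, pool.map(worker, repo_names)))


if __name__ == "__main__":
    print(update_all(["freebsd", "pkgsrc", "fedora"]))
```

Threads suit the fetching part, which is I/O-bound. If parsing turns out to be CPU-bound, a process pool may be the better fit.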
E.g. darker red color
wget -> python requests; gzip/bzip?
Currently we use a list of packages instead of parsing pkgsrc, because the latter process (make index) is too slow. Not sure what we can do here now though.
Failing to fetch a single repository shouldn't interrupt the whole fetch; it should only stop that repository from updating. Also, failure to fetch a repository should remove its state, so incomplete state is not parsed.
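One possible shape for this behaviour, as a sketch only. It assumes every repository keeps its state in its own directory and that a fetcher is a callable taking that directory; `fetch_all` and the directory layout are hypothetical.

```python
import os
import shutil


def fetch_all(state_root, fetchers):
    """Run every fetcher; return {repo_name: exception} for the failed ones."""
    failed = {}
    for name, fetch in fetchers.items():
        state_dir = os.path.join(state_root, name)
        os.makedirs(state_dir, exist_ok=True)
        try:
            fetch(state_dir)
        except Exception as exc:
            # Drop whatever was written so incomplete state is never parsed,
            # then carry on with the remaining repositories.
            shutil.rmtree(state_dir, ignore_errors=True)
            failed[name] = exc
    return failed
```

The caller can log the returned failures and parse only the repositories whose state directory still exists.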
Merge maintainers case-insensitively, such as:
Debian JavaScript Maintainers <[email protected]>
Debian Javascript Maintainers <[email protected]>
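A small sketch of case-insensitive merging. The function name and the `js-team@example.org` address are placeholders for illustration, not real data.

```python
from collections import defaultdict


def merge_maintainers(maintainers):
    """Group maintainer strings that differ only by letter case."""
    groups = defaultdict(list)
    for maintainer in maintainers:
        # casefold() is a more aggressive lower() meant for caseless matching.
        groups[maintainer.casefold()].append(maintainer)
    return dict(groups)


if __name__ == "__main__":
    merged = merge_maintainers([
        "Debian JavaScript Maintainers <js-team@example.org>",
        "Debian Javascript Maintainers <js-team@example.org>",
    ])
    print(len(merged))  # both spellings end up under one key
```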
ver, verpat and setver
The list is too long otherwise
E.g. Debian uses fake versions for perl modules: these should be ignored so they do not clobber other distros.
Make versions link to packages in specific repos (or VCS pages)
Currently, each metapackage only allows a single package per repo (e.g. only one package for FreeBSD). This leads to shadowing and information loss (when e.g. php55, php56, php70 are merged into a single metapackage). Allow multiple packages per repo, always take the highest version, but leave all information there.
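As a rough sketch of the data shape this suggests: keep every package per repository and derive the highest version from them. The version numbers are made up for illustration, and `version_key` is a deliberately naive numeric comparison, not Repology's version comparison.

```python
import re
from collections import defaultdict


def version_key(version):
    # Naive key: numeric components only, e.g. "5.6.26" -> (5, 6, 26).
    return tuple(int(part) for part in re.findall(r"\d+", version))


def build_metapackage(packages):
    """packages: iterable of (repo, name, version) tuples of one metapackage."""
    by_repo = defaultdict(list)
    for repo, name, version in packages:
        by_repo[repo].append((name, version))
    return {
        repo: {
            # Every package is kept, so nothing is shadowed.
            "packages": entries,
            # The repository is represented by its highest version.
            "best": max((v for _, v in entries), key=version_key),
        }
        for repo, entries in by_repo.items()
    }
```

With php55, php56 and php70 all present for FreeBSD, the FreeBSD entry would list three packages and report the php70 version as its best.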
This depends on some refactoring though
E.g. will draw repository names in table header with gray and display a tooltip, to show users package data for specific repository is incomplete.
Before proper dynamic backend is developed, let's just make a static site generator. Required features:
--tag foo,bar --tag baz
== (foo OR bar) AND baz
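The tag semantics can be sketched as follows: values within one `--tag` are alternatives, and separate `--tag` options must all match. Only the `--tag` syntax comes from the text; the helper names are hypothetical.

```python
import argparse


def parse_tag_options(options):
    # Each --tag value is a comma-separated OR-group.
    return [set(option.split(",")) for option in options]


def matches(package_tags, groups):
    # Every group (AND) must share at least one tag (OR) with the package.
    tags = set(package_tags)
    return all(tags & group for group in groups)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--tag", action="append", default=[])
    args = parser.parse_args(["--tag", "foo,bar", "--tag", "baz"])
    groups = parse_tag_options(args.tag)
    print(matches(["bar", "baz"], groups))  # (foo OR bar) AND baz holds
```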
Packages are named differently across repos. Need rules to merge differently named packages into single entity. Need single package rules (extreme-tuxracer + extremetuxracer) as well as generic rules (FreeBSD: p5-Foo-Bar, Debian libfoo-bar-perl).
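A sketch of how the two kinds of rules could fit together, using the examples above. The rule tables, the `perl:` prefix and `effective_name` are assumptions made for illustration and do not reflect the actual rule format.

```python
import re

# Single-package rules: one specific name mapped to another.
SINGLE_RULES = {"extreme-tuxracer": "extremetuxracer"}

# Generic per-repository rules: a naming scheme mapped to a common form.
GENERIC_RULES = {
    "freebsd": [(re.compile(r"^p5-(.+)$"), r"perl:\1")],
    "debian": [(re.compile(r"^lib(.+)-perl$"), r"perl:\1")],
}


def effective_name(repo, name):
    name = name.lower()
    for pattern, replacement in GENERIC_RULES.get(repo, []):
        if pattern.match(name):
            name = pattern.sub(replacement, name)
            break
    return SINGLE_RULES.get(name, name)


if __name__ == "__main__":
    print(effective_name("freebsd", "p5-Foo-Bar"))
    print(effective_name("debian", "libfoo-bar-perl"))
```

Both calls produce the same name, so the two packages would be merged into one entity.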
E.g. 20[01][0-9]\.?[01][0-9]\.?[0123][0-9]
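The pattern can be tried out directly. This sketch only wraps the regular expression quoted above; `looks_date_based` is a hypothetical helper name.

```python
import re

# Year 2000-2019, month and day, with optional dots in between.
DATE_LIKE = re.compile(r"20[01][0-9]\.?[01][0-9]\.?[0123][0-9]")


def looks_date_based(version):
    # search() rather than match(): the date may be embedded in the version.
    return DATE_LIKE.search(version) is not None


if __name__ == "__main__":
    for version in ("0.0.20160916", "git20160916", "2016.09.16", "1.4e"):
        print(version, looks_date_based(version))
```

The pattern does not validate the date itself, so a month such as 19 would also match.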
What to do with this:
abcmidi
package problem) to fix comparison
0.0.20160916 vs. git20160916 vs. 2016.09.16
And right-align package names?
To make pagination more usable, it needs to work with package names (e.g. aa..ak, ak..bc instead of 1, 2)
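A minimal sketch of name-based page labels, assuming a two-letter prefix is enough to label a page boundary. `paginate_by_name` is a hypothetical helper.

```python
def paginate_by_name(names, per_page):
    """Split sorted names into pages labelled by name range, e.g. 'aa..ak'."""
    names = sorted(names)
    pages = []
    for start in range(0, len(names), per_page):
        chunk = names[start:start + per_page]
        # Label the page with the prefixes of its first and last name.
        label = f"{chunk[0][:2]}..{chunk[-1][:2]}"
        pages.append((label, chunk))
    return pages


if __name__ == "__main__":
    names = ["aalib", "abcmidi", "akonadi", "bash", "bc", "zsh"]
    for label, chunk in paginate_by_name(names, 3):
        print(label, chunk)
```

With many packages sharing a prefix, adjacent labels can repeat; a longer or adaptive prefix would be needed then.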
Useful for testing
Fetchable via web: https://admin.fedoraproject.org/pkgdb/packages/
Available for testing in newrepos branch. Still TODO:
Package: %foovar%
%description
autossh: 1.4c > 1.4e ???
lsof: 4.89 > 4.90.f ???
Need to be split: btf (dev-java and sci-libs)
Actually, a lot more:
find gentoo.git -mindepth 2 -maxdepth 2 -type d |
grep -Ev 'dev-(perl|python|haskell)' |
awk -F/ '{print $NF}' |
sort | uniq -d
ace acl ada amap analog apel atlas attica auctex baloo balsa barcode bbdb bfm binclock bluedevil bson btf build c-support calc calendar cdcover cdrtools charm checkpassword coffee-script color crystal csv daemontools dash dictionary dirdiff docker dolphin ebuild-mode ecb eject elib emacs ess exo fam fcgi ffmpeg fuse gambit gdl git glade glu gnupg gnuplot gom gpgme grip gsasl haskell-mode highline icecream igrep info jack jal jama jde jpeg json kactivities kde-gtk-config kdeplasma-addons kdesu kfilemetadata kglobalaccel khotkeys kinfocenter kmenuedit knewstuff krunner kscreen kstart ksysguard kwin kwrited languagetool launchy lemon libelf libffi libgudev libiconv libintl libkscreen libnet libusb locale lookup lzma magic mailcrypt mailx man mars mash mavros mc mediawiki mew milou mime-types mldonkey mmix mmm-mode modutils mongo mpack mpc msgpack muse mysql nagios nemesis ninja nitrogen notification-daemon nut ocaml openmsx otter pam par pcl pdv picard pkgconfig planner plasma-mediacenter plasma-nm plasma-workspace pmake pms polkit-kde-agent polyglot powerdevil psgml psi python-mode rails re2 redis reduce riece ruby rubygems screen session shadow signify silo ski skkserv slim slurm smack sml-mode snappy spice spin splat splice sqlite3 ssh surf systemsettings szip teco texinfo tf time tokyocabinet tornado tree uclibc udev vc vm w3m xclip xslide xsp yacc zenburn zenirc
How do I download a single-file AUR index (the one pacman uses?) or the whole AUR as a single repository? No idea for now. As a last resort, the AUR website may be scraped.
Please share any ideas on what additional repositories we can support. A description on how to fetch all package data from specific repository is preferred. Approved repositories with determined fetching algorithm are split to separate bugs and eventually implemented.
From Fedora release-monitoring
Unsorted
Since these will contain too many irrelevant unique packages, doable as shadow repos:
Doable as shadow repos as well
...more ideas?
This is of course planned with proper backend.
Mark vulnerable package versions
The plan:
cpe_name (is useless without vendor)
cpe_vendor/cpe_product
If a package was removed, it's probably problematic and there's little reason to bring it back. For FreeBSD, the MOVED file may be used.
Unit tests are done at least for rules processing and version comparison; now a complete test is needed, which includes parsing and processing fake repository data.
Instead of polluting stderr
Available for testing in newrepos branch. Uses binary package lists, so partially unsuitable for comparison, will be implemented as a shadow repo. Also contains too little info. Investigate a possibility of fetching complete data.
Non-matching rule may indicate a bug or a need for update
These are not tied together. Fetching is more generic and may be common to multiple parsing techniques. Currently, there are just 2 types of fetchers:
Track package state changes (such as version updates, new packages). Provide RSS streams of such events, display icons for recently updated/added packages.
For named package, generate a badge with repositories it's present in with version info
Fetcher is rather trivial: get https://chocolatey.org/api/v2/Packages()?$filter=IsLatestVersion, parse XML, get next page from <link rel="next" href="">. All package info is available in XML, including name, version, tags, comment, author, www.
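The paging loop described above might look like the sketch below. It assumes the response is an Atom feed (namespace `http://www.w3.org/2005/Atom`) and that each entry's `title` carries the package name; both are assumptions to check against a real response. The downloader is passed in as `get_text`, so any HTTP client can be used.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the feed elements.
ATOM = "{http://www.w3.org/2005/Atom}"


def parse_page(xml_text):
    """Return (entry titles, next page URL or None) for one feed page."""
    root = ET.fromstring(xml_text)
    names = [entry.findtext(f"{ATOM}title") for entry in root.iter(f"{ATOM}entry")]
    next_url = None
    # Only feed-level links are checked, not links inside entries.
    for link in root.findall(f"{ATOM}link"):
        if link.get("rel") == "next":
            next_url = link.get("href")
    return names, next_url


def iter_packages(first_url, get_text):
    """Yield package names, following rel="next" links until none is left."""
    url = first_url
    while url:
        names, url = parse_page(get_text(url))
        yield from names
```

With the requests library, `get_text` could be `lambda url: requests.get(url).text`.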