Instead of using an undocumented rubygems HTTP endpoint that only returns the names of the gems, use the daily PostgreSQL dump of the rubygems.org database. The dump makes it possible to list all gems as well as all versions of a gem and its release artifacts. For each release artifact, the following information is extracted: version, download URL, sha256 checksum, and release date, plus some extra metadata. The lister now sets the list of artifacts and the list of metadata as extra loader arguments when sending a listed origin to the scheduler database. A last_update date is also computed, which should ensure that loading tasks for rubygems are scheduled only when new releases have become available since the last loading. Note that the lister spawns a temporary postgres instance, so the initdb executable from a postgres server installation must be available in the execution environment. Related to T1777 |
||
---|---|---|
docs | ||
sql | ||
swh | ||
.git-blame-ignore-revs | ||
.gitignore | ||
.pre-commit-config.yaml | ||
ACKNOWLEDGEMENTS | ||
CODE_OF_CONDUCT.md | ||
conftest.py | ||
CONTRIBUTORS | ||
LICENSE | ||
Makefile | ||
MANIFEST.in | ||
mypy.ini | ||
pyproject.toml | ||
pytest.ini | ||
README.md | ||
requirements-swh.txt | ||
requirements-test.txt | ||
requirements.txt | ||
setup.cfg | ||
setup.py | ||
tox.ini |
swh-lister
This component of the Software Heritage stack produces listings of software origins, and their URLs, hosted on various public developer platforms or package managers. Because these operations are quite similar, it provides a set of Python modules abstracting common origin-listing behaviors.
It also provides several lister implementations, contained in the following Python modules:
swh.lister.bitbucket
swh.lister.cgit
swh.lister.cran
swh.lister.debian
swh.lister.gitea
swh.lister.github
swh.lister.gitlab
swh.lister.gnu
swh.lister.golang
swh.lister.launchpad
swh.lister.maven
swh.lister.npm
swh.lister.packagist
swh.lister.phabricator
swh.lister.pypi
swh.lister.tuleap
swh.lister.gogs
Dependencies
All required dependencies can be found in the requirements*.txt files located at the root of the repository.
Local deployment
lister configuration
Each lister implemented so far by Software Heritage (bitbucket, cgit, cran, debian, gitea, github, gitlab, gnu, golang, launchpad, npm, packagist, phabricator, pypi, tuleap, maven) must be configured by following the instructions below (note that <lister_name> must be replaced by one of the lister names listed above).
Preparation steps
- mkdir ~/.config/swh/
- create configuration file ~/.config/swh/listers.yml
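The two preparation steps can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `ensure_listers_config` and its `home` parameter are illustrative assumptions and are not part of swh-lister.

```python
from pathlib import Path


def ensure_listers_config(home: Path) -> Path:
    """Create <home>/.config/swh/ and listers.yml if they are missing.

    Hypothetical helper for illustration; not part of swh-lister.
    """
    config_dir = home / ".config" / "swh"
    # Same effect as `mkdir -p ~/.config/swh/` when home is the user's home.
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file = config_dir / "listers.yml"
    # touch() creates an empty file, or leaves existing content unchanged.
    config_file.touch(exist_ok=True)
    return config_file
```

Calling `ensure_listers_config(Path.home())` would prepare the paths used in this README; the file content still has to be filled in by hand.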
Configuration file sample
Minimal configuration shared by all listers, to be added to the file ~/.config/swh/listers.yml:
scheduler:
  cls: 'remote'
  args:
    url: 'http://localhost:5008/'
credentials: {}
Note: this expects the scheduler service to be running locally on port 5008.
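Before running a lister, it can help to check that something is listening at the configured scheduler URL. The sketch below only tests TCP reachability with the Python standard library; it does not verify that the listening service is actually the scheduler. The function name `scheduler_is_reachable` is an assumption made for this example.

```python
import socket
from urllib.parse import urlsplit


def scheduler_is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical helper for illustration; not part of swh-lister.
    """
    parts = urlsplit(url)
    host = parts.hostname or "localhost"
    # Fall back to the scheme's default port when the URL does not give one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

With the sample configuration, the call would be `scheduler_is_reachable("http://localhost:5008/")`.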
Executing a lister
Once configured, a lister can be executed by using the swh CLI tool with the following options and commands:
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister <lister_name> [lister_parameters]
Examples:
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister bitbucket
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister cran
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister gitea url=https://codeberg.org/api/v1/
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister gitlab url=https://salsa.debian.org/api/v4/
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister npm
$ swh --log-level DEBUG lister -C ~/.config/swh/listers.yml run --lister pypi
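When listers are launched from a script rather than typed by hand, the command lines above can be assembled programmatically. The following sketch builds the same argument list as the examples; `build_lister_command` is a hypothetical helper, and only the options already shown in this README are used.

```python
import os
from typing import Dict, List, Optional


def build_lister_command(
    lister_name: str,
    config_path: str = "~/.config/swh/listers.yml",
    parameters: Optional[Dict[str, str]] = None,
    log_level: str = "DEBUG",
) -> List[str]:
    """Build the argument list for `swh ... lister ... run --lister <name>`.

    Hypothetical helper for illustration; not part of swh-lister.
    """
    command = [
        "swh", "--log-level", log_level,
        "lister", "-C", os.path.expanduser(config_path),  # no shell, so expand "~" here
        "run", "--lister", lister_name,
    ]
    # Lister parameters are passed as key=value pairs, e.g. url=https://...
    for key, value in (parameters or {}).items():
        command.append(f"{key}={value}")
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting issues with URLs in the parameters.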
Licensing
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
See the top-level LICENSE file for the full text of the GNU General Public License.