SWH-lister

The Software Heritage Lister is both a library module that centralizes common lister behavior and a collection of lister implementations.

The current lister implementations are:

  • swh-lister-bitbucket
  • swh-lister-debian
  • swh-lister-github
  • swh-lister-gitlab
  • swh-lister-pypi

Licensing

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

See the top-level LICENSE file for the full text of the GNU General Public License.

Dependencies

  • python3
  • python3-requests
  • python3-sqlalchemy

See requirements*.txt for more details.

Local deployment

lister-github

Preparation steps

  1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)

  2. mkdir ~/.config/swh/ ~/.cache/swh/lister/github.com/

  3. create configuration file ~/.config/swh/lister-github.com.yml

  4. Bootstrap the db instance schema

    $ createdb lister-github
    $ python3 -m swh.lister.cli --db-url postgres:///lister-github github
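The two bootstrap commands follow the same pattern for every lister in this README; only the lister name changes. A minimal sketch that builds them for a given lister is shown here. `bootstrap_commands` is a hypothetical helper, not part of swh-lister, and it only constructs the argument lists without running anything:

```python
def bootstrap_commands(lister):
    """Return the argv lists for the two bootstrap commands of a lister.

    Hypothetical helper: it mirrors the commands shown in this README
    and does not execute them.
    """
    db_name = "lister-%s" % lister
    db_url = "postgres:///%s" % db_name
    return [
        # Create the PostgreSQL database for this lister.
        ["createdb", db_name],
        # Initialize the schema through the swh.lister command line.
        ["python3", "-m", "swh.lister.cli", "--db-url", db_url, lister],
    ]


for argv in bootstrap_commands("github"):
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once PostgreSQL is available locally.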

Configuration file sample

Minimalistic configuration:

$ cat ~/.config/swh/lister-github.com.yml
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
lister_db_url: postgres:///lister-github
credentials: []
cache_responses: True
cache_dir: /home/user/.cache/swh/lister/github.com

Note: This expects the storage (port 5002) and scheduler (port 5008) services to be running locally.
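Preparation steps 2 and 3 together with the sample above can be scripted. The following is a minimal sketch, assuming the same file layout as in this README; `write_lister_config` and its parameters are hypothetical names introduced for illustration, and the file is written as plain text so no YAML library is needed:

```python
from pathlib import Path


def write_lister_config(home, lister, instance=None):
    """Create the config and cache directories and write a minimal config.

    Hypothetical helper. `lister` names the database (e.g. "github") and
    `instance` names the config file and cache directory (e.g. "github.com");
    `instance` defaults to `lister`.
    """
    instance = instance or lister
    home = Path(home)
    config_dir = home / ".config" / "swh"
    cache_dir = home / ".cache" / "swh" / "lister" / instance
    config_dir.mkdir(parents=True, exist_ok=True)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Same keys and values as the minimalistic sample configuration.
    config_path = config_dir / ("lister-%s.yml" % instance)
    config_path.write_text(
        "lister_db_url: postgres:///lister-%s\n"
        "credentials: []\n"
        "cache_responses: True\n"
        "cache_dir: %s\n" % (lister, cache_dir)
    )
    return config_path
```

Calling `write_lister_config(Path.home(), "github", "github.com")` would produce `~/.config/swh/lister-github.com.yml` with the content shown in the sample.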

Run

$ python3
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> from swh.lister.github.tasks import range_github_lister; range_github_lister(364, 365)
INFO:root:listing repos starting at 364
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET /repositories?since=364 HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost
DEBUG:urllib3.connectionpool:http://localhost:5002 "POST /origin/add HTTP/1.1" 200 1
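The debug log shows that listing from 364 results in a request to `/repositories?since=364` on api.github.com. A small sketch of how such a URL can be built is shown here; `repositories_url` is a hypothetical helper written for illustration and is not the lister's actual code:

```python
from urllib.parse import urlencode

GITHUB_API = "https://api.github.com"


def repositories_url(since):
    """Build the URL of the public repository listing after id `since`."""
    return "%s/repositories?%s" % (GITHUB_API, urlencode({"since": since}))


print(repositories_url(364))
```

This prints `https://api.github.com/repositories?since=364`, matching the GET request in the log.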

lister-gitlab

Preparation steps

  1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)

  2. mkdir ~/.config/swh/ ~/.cache/swh/lister/gitlab/

  3. create configuration file ~/.config/swh/lister-gitlab.yml

  4. Bootstrap the db instance schema

    $ createdb lister-gitlab
    $ python3 -m swh.lister.cli --db-url postgres:///lister-gitlab gitlab

Configuration file sample

$ cat ~/.config/swh/lister-gitlab.yml
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
lister_db_url: postgres:///lister-gitlab
credentials: []
cache_responses: True
cache_dir: /home/user/.cache/swh/lister/gitlab

Note: This expects the storage (port 5002) and scheduler (port 5008) services to be running locally.

Run

$ python3
Python 3.6.6 (default, Jun 27 2018, 14:44:17)
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from swh.lister.gitlab.tasks import range_gitlab_lister; range_gitlab_lister(1, 2,
  {'instance': 'debian', 'api_baseurl': 'https://salsa.debian.org/api/v4', 'sort': 'asc', 'per_page': 20})
>>> from swh.lister.gitlab.tasks import full_gitlab_relister; full_gitlab_relister(
  {'instance':'0xacab', 'api_baseurl':'https://0xacab.org/api/v4', 'sort': 'asc', 'per_page': 20})
>>> from swh.lister.gitlab.tasks import incremental_gitlab_lister; incremental_gitlab_lister(
  {'instance': 'freedesktop.org', 'api_baseurl': 'https://gitlab.freedesktop.org/api/v4',
   'sort': 'asc', 'per_page': 20})
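The three calls above pass the same kind of dictionary, differing only in the instance name and host. A minimal sketch that builds it is shown here; `gitlab_instance_config` is a hypothetical helper, and the key names are taken from the calls in this README:

```python
def gitlab_instance_config(instance, host, sort="asc", per_page=20):
    """Build the instance dictionary passed to the GitLab lister tasks.

    Hypothetical helper: `host` is the GitLab host name, from which the
    API v4 base URL is derived.
    """
    return {
        "instance": instance,
        "api_baseurl": "https://%s/api/v4" % host,
        "sort": sort,
        "per_page": per_page,
    }


print(gitlab_instance_config("debian", "salsa.debian.org"))
```

With this helper, the first call could be written as `range_gitlab_lister(1, 2, gitlab_instance_config('debian', 'salsa.debian.org'))`.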

lister-debian

Preparation steps

  1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)

  2. mkdir ~/.config/swh/ ~/.cache/swh/lister/debian/

  3. create configuration file ~/.config/swh/lister-debian.yml

  4. Bootstrap the db instance schema

    $ createdb lister-debian
    $ python3 -m swh.lister.cli --db-url postgres:///lister-debian debian

    Note: This bootstraps a minimum data set needed for the debian lister to run (for development)

Configuration file sample

$ cat ~/.config/swh/lister-debian.yml
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
lister_db_url: postgres:///lister-debian
credentials: []
cache_responses: True
cache_dir: /home/user/.cache/swh/lister/debian

Note: This expects the storage (port 5002) and scheduler (port 5008) services to be running locally.

Run

$ python3
Python 3.6.6 (default, Jun 27 2018, 14:44:17)
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import logging; logging.basicConfig(level=logging.DEBUG); from swh.lister.debian.tasks import debian_lister; debian_lister('Debian')
DEBUG:root:Creating snapshot for distribution Distribution(Debian (deb) on http://deb.debian.org/debian/) on date 2018-07-27 09:22:50.461165+00:00
DEBUG:root:Processing area Area(stretch/main of Debian)
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): deb.debian.org
DEBUG:urllib3.connectionpool:http://deb.debian.org:80 "GET /debian//dists/stretch/main/source/Sources.xz HTTP/1.1" 302 325
...

lister-pypi

Preparation steps

  1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)

  2. mkdir ~/.config/swh/ ~/.cache/swh/lister/pypi/

  3. create configuration file ~/.config/swh/lister-pypi.yml

  4. Bootstrap the db instance schema

    $ createdb lister-pypi
    $ python3 -m swh.lister.cli --db-url postgres:///lister-pypi pypi

    Note: This bootstraps a minimum data set needed for the pypi lister to run (for development)

Configuration file sample

$ cat ~/.config/swh/lister-pypi.yml
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
lister_db_url: postgres:///lister-pypi
credentials: []
cache_responses: True
cache_dir: /home/user/.cache/swh/lister/pypi

Note: This expects the storage (port 5002) and scheduler (port 5008) services to be running locally.

Run

$ python3
Python 3.6.6 (default, Jun 27 2018, 14:44:17)
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from swh.lister.pypi.tasks import pypi_lister; pypi_lister()