README.md: Fix outdated instructions and improve formatting
parent
977d2459c3
commit
7d192a2f1b
1 changed files with 164 additions and 176 deletions
340
README.md
|
@@ -1,19 +1,171 @@
|
|||
SWH-lister
|
||||
============
|
||||
swh-lister
|
||||
==========
|
||||
|
||||
The Software Heritage Lister is both a library module to permit to
|
||||
centralize lister behaviors, and to provide lister implementations.
|
||||
This component from the Software Heritage stack aims to produce listings
|
||||
of software origins and their URLs hosted on various public developer platforms
|
||||
or package managers. As these operations are quite similar, it provides a set of
|
||||
Python modules abstracting common software origins listing behaviors.
|
||||
|
||||
Actual lister implementations are:
|
||||
It also provides several lister implementations, contained in the
|
||||
following Python modules:
|
||||
|
||||
- swh-lister-bitbucket
|
||||
- swh-lister-debian
|
||||
- swh-lister-github
|
||||
- swh-lister-gitlab
|
||||
- swh-lister-pypi
|
||||
- `swh.lister.bitbucket`
|
||||
- `swh.lister.debian`
|
||||
- `swh.lister.github`
|
||||
- `swh.lister.gitlab`
|
||||
- `swh.lister.pypi`
|
||||
- `swh.lister.npm`
|
||||
|
||||
Dependencies
|
||||
------------
|
||||
|
||||
All required dependencies can be found in the `requirements*.txt` files located
|
||||
at the root of the repository.
|
||||
|
||||
Local deployment
|
||||
----------------
|
||||
|
||||
## lister configuration
|
||||
|
||||
Each lister implemented so far by Software Heritage (`github`, `gitlab`, `debian`, `pypi`, `npm`)
|
||||
must be configured by following the instructions below (please note that you have to replace
|
||||
`<lister_name>` with one of the lister names introduced above).
|
||||
|
||||
### Preparation steps
|
||||
|
||||
1. `mkdir -p ~/.config/swh/ ~/.cache/swh/lister/<lister_name>/`
|
||||
2. Create the configuration file `~/.config/swh/lister_<lister_name>.yml`
|
||||
3. Bootstrap the db instance schema
|
||||
|
||||
```lang=bash
|
||||
$ createdb lister-<lister_name>
|
||||
$ python3 -m swh.lister.cli --db-url postgres:///lister-<lister_name> <lister_name>
|
||||
```
|
||||
|
||||
Note: This bootstraps the minimal data set needed for the lister to run.
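
Steps 1 and 2 above can also be scripted. The following is a minimal sketch and not part of swh-lister: the helper name `prepare_lister_dirs` and its `home` parameter are assumptions made for illustration; only the directory layout and the configuration file name come from the steps above.

```python
from pathlib import Path


def prepare_lister_dirs(lister_name, home=None):
    """Create the config and cache directories for one lister.

    Hypothetical helper (not part of swh-lister). It creates
    ~/.config/swh/ and ~/.cache/swh/lister/<lister_name>/ and returns
    the path where the configuration file is expected.
    """
    home = Path(home) if home is not None else Path.home()
    config_dir = home / ".config" / "swh"
    cache_dir = home / ".cache" / "swh" / "lister" / lister_name
    # parents=True creates intermediate directories; exist_ok=True makes reruns safe.
    config_dir.mkdir(parents=True, exist_ok=True)
    cache_dir.mkdir(parents=True, exist_ok=True)
    return config_dir / f"lister_{lister_name}.yml"
```

The returned path only says where the file should go; writing the configuration itself and bootstrapping the database (step 3) are left to the commands shown above.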
|
||||
|
||||
### Configuration file sample
|
||||
|
||||
Minimal configuration shared by all listers, to be added to the file `~/.config/swh/lister_<lister_name>.yml`:
|
||||
|
||||
```lang=yml
|
||||
storage:
  cls: 'remote'
  args:
    url: 'http://localhost:5002/'

scheduler:
  cls: 'remote'
  args:
    url: 'http://localhost:5008/'

lister:
  cls: 'local'
  args:
    # see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
    db: 'postgresql:///lister-<lister_name>'

credentials: []
cache_responses: True
cache_dir: /home/user/.cache/swh/lister/<lister_name>/
|
||||
```
|
||||
|
||||
Note: This expects the storage (port 5002) and scheduler (port 5008) services to be running locally.
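
Because the sample differs between listers only in `<lister_name>`, the file contents can be generated from a template. This is a minimal sketch under that assumption: `render_lister_config` is a hypothetical helper, the nesting of keys is an assumption about the sample's intended layout, and the `/home/user` default simply repeats the path used in the sample.

```python
from string import Template

# Assumed layout of the sample configuration; only the lister name varies.
CONFIG_TEMPLATE = Template("""\
storage:
  cls: 'remote'
  args:
    url: 'http://localhost:5002/'

scheduler:
  cls: 'remote'
  args:
    url: 'http://localhost:5008/'

lister:
  cls: 'local'
  args:
    db: 'postgresql:///lister-$name'

credentials: []
cache_responses: True
cache_dir: $cache_dir
""")


def render_lister_config(lister_name, home="/home/user"):
    """Return the configuration text for one lister (hypothetical helper)."""
    cache_dir = f"{home}/.cache/swh/lister/{lister_name}/"
    return CONFIG_TEMPLATE.substitute(name=lister_name, cache_dir=cache_dir)


if __name__ == "__main__":
    print(render_lister_config("pypi"))
```

The output is plain text, so it can be reviewed before being saved as `~/.config/swh/lister_<lister_name>.yml`.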
|
||||
|
||||
## lister-github
|
||||
|
||||
Once configured, you can execute a GitHub lister using the following instructions in a `python3` script:
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.github.tasks import range_github_lister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
range_github_lister(364, 365)
|
||||
...
|
||||
```
|
||||
|
||||
## lister-gitlab
|
||||
|
||||
Once configured, you can execute a GitLab lister using the instructions detailed in the `python3` scripts below:
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.gitlab.tasks import range_gitlab_lister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
range_gitlab_lister(1, 2, {
    'instance': 'debian',
    'api_baseurl': 'https://salsa.debian.org/api/v4',
    'sort': 'asc',
    'per_page': 20
})
|
||||
```
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.gitlab.tasks import full_gitlab_relister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
full_gitlab_relister({
    'instance': '0xacab',
    'api_baseurl': 'https://0xacab.org/api/v4',
    'sort': 'asc',
    'per_page': 20
})
|
||||
```
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.gitlab.tasks import incremental_gitlab_lister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
incremental_gitlab_lister({
    'instance': 'freedesktop.org',
    'api_baseurl': 'https://gitlab.freedesktop.org/api/v4',
    'sort': 'asc',
    'per_page': 20
})
|
||||
```
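
The three GitLab examples above pass the same kind of parameter dictionary and differ only in the instance name and host. A small helper could build it. This is a sketch, not part of swh-lister: `gitlab_instance_params` is a hypothetical name, and it assumes the API base URL is always `https://<host>/api/v4`, as in the examples.

```python
def gitlab_instance_params(instance, host, per_page=20, sort="asc"):
    """Build the parameter dict used by the GitLab lister examples.

    Hypothetical helper: assumes the API lives at https://<host>/api/v4.
    """
    return {
        "instance": instance,
        "api_baseurl": f"https://{host}/api/v4",
        "sort": sort,
        "per_page": per_page,
    }


if __name__ == "__main__":
    # Same values as the salsa.debian.org example.
    print(gitlab_instance_params("debian", "salsa.debian.org"))
```

The resulting dictionary could then be passed as the last argument of the tasks shown above, for example `range_gitlab_lister(1, 2, params)`.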
|
||||
|
||||
## lister-debian
|
||||
|
||||
Once configured, you can execute a Debian lister using the following instructions in a `python3` script:
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.debian.tasks import debian_lister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
debian_lister('Debian')
|
||||
```
|
||||
|
||||
## lister-pypi
|
||||
|
||||
Once configured, you can execute a PyPI lister using the following instructions in a `python3` script:
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.pypi.tasks import pypi_lister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
pypi_lister()
|
||||
```
|
||||
|
||||
## lister-npm
|
||||
|
||||
Once configured, you can execute an npm lister using the following instructions in a `python3` REPL:
|
||||
|
||||
```lang=python
|
||||
import logging
|
||||
from swh.lister.npm.tasks import npm_lister
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
npm_lister()
|
||||
```
|
||||
|
||||
Licensing
|
||||
----------
|
||||
---------
|
||||
|
||||
This program is free software: you can redistribute it and/or modify it under
|
||||
the terms of the GNU General Public License as published by the Free Software
|
||||
|
@@ -25,168 +177,4 @@ WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
|
|||
PARTICULAR PURPOSE. See the GNU General Public License for more details.
|
||||
|
||||
See top-level LICENSE file for the full text of the GNU General Public License
|
||||
along with this program.
|
||||
|
||||
|
||||
Dependencies
|
||||
------------
|
||||
|
||||
- python3
|
||||
- python3-requests
|
||||
- python3-sqlalchemy
|
||||
|
||||
More details in requirements*.txt
|
||||
|
||||
|
||||
Local deployment
|
||||
-----------
|
||||
|
||||
## lister-github
|
||||
|
||||
### Preparation steps
|
||||
|
||||
1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)
|
||||
2. mkdir ~/.config/swh/ ~/.cache/swh/lister/github.com/
|
||||
3. create configuration file ~/.config/swh/lister-github.com.yml
|
||||
4. Bootstrap the db instance schema
|
||||
|
||||
$ createdb lister-github
|
||||
$ python3 -m swh.lister.cli --db-url postgres:///lister-github github
|
||||
|
||||
### Configuration file sample
|
||||
|
||||
Minimalistic configuration:
|
||||
|
||||
$ cat ~/.config/swh/lister-github.com.yml
|
||||
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
|
||||
lister_db_url: postgres:///lister-github
|
||||
credentials: []
|
||||
cache_responses: True
|
||||
cache_dir: /home/user/.cache/swh/lister/github.com
|
||||
|
||||
Note: This expects storage (5002) and scheduler (5008) services to run locally
|
||||
|
||||
### Run
|
||||
|
||||
$ python3
|
||||
>>> import logging
|
||||
>>> logging.basicConfig(level=logging.DEBUG)
|
||||
>>> from swh.lister.github.tasks import range_github_lister; range_github_lister(364, 365)
|
||||
INFO:root:listing repos starting at 364
|
||||
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com
|
||||
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET /repositories?since=364 HTTP/1.1" 200 None
|
||||
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost
|
||||
DEBUG:urllib3.connectionpool:http://localhost:5002 "POST /origin/add HTTP/1.1" 200 1
|
||||
|
||||
|
||||
## lister-gitlab
|
||||
|
||||
### preparation steps
|
||||
|
||||
1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)
|
||||
2. mkdir ~/.config/swh/ ~/.cache/swh/lister/gitlab/
|
||||
3. create configuration file ~/.config/swh/lister-gitlab.yml
|
||||
4. Bootstrap the db instance schema
|
||||
|
||||
$ createdb lister-gitlab
|
||||
$ python3 -m swh.lister.cli --db-url postgres:///lister-gitlab gitlab
|
||||
|
||||
### Configuration file sample
|
||||
|
||||
$ cat ~/.config/swh/lister-gitlab.yml
|
||||
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
|
||||
lister_db_url: postgres:///lister-gitlab
|
||||
credentials: []
|
||||
cache_responses: True
|
||||
cache_dir: /home/user/.cache/swh/lister/gitlab
|
||||
|
||||
Note: This expects storage (5002) and scheduler (5008) services to run locally
|
||||
|
||||
### Run
|
||||
|
||||
$ python3
|
||||
Python 3.6.6 (default, Jun 27 2018, 14:44:17)
|
||||
[GCC 8.1.0] on linux
|
||||
Type "help", "copyright", "credits" or "license" for more information.
|
||||
>>> from swh.lister.gitlab.tasks import range_gitlab_lister; range_gitlab_lister(1, 2,
|
||||
{'instance': 'debian', 'api_baseurl': 'https://salsa.debian.org/api/v4', 'sort': 'asc', 'per_page': 20})
|
||||
>>> from swh.lister.gitlab.tasks import full_gitlab_relister; full_gitlab_relister(
|
||||
{'instance':'0xacab', 'api_baseurl':'https://0xacab.org/api/v4', 'sort': 'asc', 'per_page': 20})
|
||||
>>> from swh.lister.gitlab.tasks import incremental_gitlab_lister; incremental_gitlab_lister(
|
||||
{'instance': 'freedesktop.org', 'api_baseurl': 'https://gitlab.freedesktop.org/api/v4',
|
||||
'sort': 'asc', 'per_page': 20})
|
||||
|
||||
## lister-debian
|
||||
|
||||
### preparation steps
|
||||
|
||||
1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)
|
||||
2. mkdir ~/.config/swh/ ~/.cache/swh/lister/debian/
|
||||
3. create configuration file ~/.config/swh/lister-debian.yml
|
||||
4. Bootstrap the db instance schema
|
||||
|
||||
$ createdb lister-debian
|
||||
$ python3 -m swh.lister.cli --db-url postgres:///lister-debian debian
|
||||
|
||||
Note: This bootstraps a minimum data set needed for the debian
|
||||
lister to run (for development)
|
||||
|
||||
### Configuration file sample
|
||||
|
||||
$ cat ~/.config/swh/lister-debian.yml
|
||||
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
|
||||
lister_db_url: postgres:///lister-debian
|
||||
credentials: []
|
||||
cache_responses: True
|
||||
cache_dir: /home/user/.cache/swh/lister/debian
|
||||
|
||||
Note: This expects storage (5002) and scheduler (5008) services to run locally
|
||||
|
||||
### Run
|
||||
|
||||
$ python3
|
||||
Python 3.6.6 (default, Jun 27 2018, 14:44:17)
|
||||
[GCC 8.1.0] on linux
|
||||
Type "help", "copyright", "credits" or "license" for more information.
|
||||
>>> import logging; logging.basicConfig(level=logging.DEBUG); from swh.lister.debian.tasks import debian_lister; debian_lister('Debian')
|
||||
DEBUG:root:Creating snapshot for distribution Distribution(Debian (deb) on http://deb.debian.org/debian/) on date 2018-07-27 09:22:50.461165+00:00
|
||||
DEBUG:root:Processing area Area(stretch/main of Debian)
|
||||
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): deb.debian.org
|
||||
DEBUG:urllib3.connectionpool:http://deb.debian.org:80 "GET /debian//dists/stretch/main/source/Sources.xz HTTP/1.1" 302 325
|
||||
...
|
||||
|
||||
|
||||
## lister-pypi
|
||||
|
||||
### preparation steps
|
||||
|
||||
1. git clone under $SWH_ENVIRONMENT_HOME/swh-lister (of your choosing)
|
||||
2. mkdir ~/.config/swh/ ~/.cache/swh/lister/pypi/
|
||||
3. create configuration file ~/.config/swh/lister-pypi.yml
|
||||
4. Bootstrap the db instance schema
|
||||
|
||||
$ createdb lister-pypi
|
||||
$ python3 -m swh.lister.cli --db-url postgres:///lister-pypi pypi
|
||||
|
||||
Note: This bootstraps a minimum data set needed for the pypi
|
||||
lister to run (for development)
|
||||
|
||||
### Configuration file sample
|
||||
|
||||
$ cat ~/.config/swh/lister-pypi.yml
|
||||
# see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
|
||||
lister_db_url: postgres:///lister-pypi
|
||||
credentials: []
|
||||
cache_responses: True
|
||||
cache_dir: /home/user/.cache/swh/lister/pypi
|
||||
|
||||
Note: This expects storage (5002) and scheduler (5008) services to run locally
|
||||
|
||||
### Run
|
||||
|
||||
$ python3
|
||||
Python 3.6.6 (default, Jun 27 2018, 14:44:17)
|
||||
[GCC 8.1.0] on linux
|
||||
Type "help", "copyright", "credits" or "license" for more information.
|
||||
>>> from swh.lister.pypi.tasks import pypi_lister; pypi_lister()
|
||||
>>>
|
||||
along with this program.
|