implement listers as plugins

Listers are declared as plugins via the `swh.workers` entry_point.

As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:

- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state
  (typically, create/initialize the database for this lister).
  If not set, the default implementation creates database tables (after
  optionally having deleted existing ones) according to the models declared
  in the `models` registry field.
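
The contract above can be sketched as follows. This is only an illustration,
not the swh.lister implementation: `resolve_init`, `default_init` and
`FakeLister` are hypothetical names, and the default hook here is a
placeholder for the real one, which creates the tables declared in `models`.
The hook signature is the one used by `debian_init` in this commit.

```python
REQUIRED_FIELDS = ('task_modules', 'lister', 'models')


def resolve_init(registry, default_init):
    """Return the init hook to use for a lister registry entry.

    `registry` is the dict returned by a plugin's register() function.
    """
    missing = [f for f in REQUIRED_FIELDS if f not in registry]
    if missing:
        raise ValueError('missing registry fields: %s' % ', '.join(missing))
    # 'init' is optional: fall back to the default hook when it is absent
    return registry.get('init', default_init)


class FakeLister:
    # Hypothetical stand-in for a real lister class and its model
    MODEL = 'fake_model'


def register():
    # Registry entry without an 'init' hook
    return {'models': [FakeLister.MODEL],
            'lister': FakeLister,
            'task_modules': ['%s.tasks' % __name__]}


def default_init(db_engine, override_conf=None):
    # Placeholder: the real default creates tables for registry['models']
    return 'default'


init = resolve_init(register(), default_init)
```

Since `register()` above declares no `init` field, `init` ends up being
`default_init`; a registry that sets `init` gets its own hook back instead.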

There is no need to explicitly add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).

Also refactor the cli tools a bit:
- add support for the standard --config-file option at the 'lister' group
  level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
  initializing (especially with --drop-tables) the database for a single
  lister is unreliable, since all tables are created using a single MetaData
  (in the same namespace).
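
The resulting option layout can be sketched as below. This uses `argparse`
from the standard library purely for illustration; it is not the project's
actual cli code, and the database URL is a made-up example value.

```python
import argparse


def build_parser():
    # Options declared at the 'lister' group level, shared by subcommands
    parser = argparse.ArgumentParser(prog='swh lister')
    parser.add_argument('--config-file', default=None)
    parser.add_argument('--db-url', default=None)

    subcommands = parser.add_subparsers(dest='command')
    db_init = subcommands.add_parser('db-init')
    # No --lister option: tables of all listers are initialized together
    db_init.add_argument('--drop-tables', action='store_true')
    return parser


# Group-level options come before the subcommand name
args = build_parser().parse_args(
    ['--db-url', 'postgresql:///lister-example', 'db-init', '--drop-tables'])
```

With this layout, `--db-url` and `--config-file` are given once, before the
subcommand, instead of being repeated on each subcommand.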
David Douard 2019-09-03 15:01:58 +02:00
parent c67a926f26
commit e3c0ea9d90
18 changed files with 279 additions and 216 deletions


@@ -0,0 +1,40 @@
# Copyright (C) 2019 the Software Heritage developers
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information


def debian_init(db_engine, override_conf=None):
    from swh.storage.schemata.distribution import (
        Distribution, Area)
    from .lister import DebianLister

    lister = DebianLister(override_config=override_conf)

    if not lister.db_session\
                 .query(Distribution)\
                 .filter(Distribution.name == 'Debian')\
                 .one_or_none():

        d = Distribution(
            name='Debian',
            type='deb',
            mirror_uri='http://deb.debian.org/debian/')
        lister.db_session.add(d)

        areas = []
        for distribution_name in ['stretch', 'buster']:
            for area_name in ['main', 'contrib', 'non-free']:
                areas.append(Area(
                    name='%s/%s' % (distribution_name, area_name),
                    distribution=d,
                ))
        lister.db_session.add_all(areas)
        lister.db_session.commit()


def register():
    from .lister import DebianLister
    return {'models': [DebianLister.MODEL],
            'lister': DebianLister,
            'task_modules': ['%s.tasks' % __name__],
            'init': debian_init}