It enables to track last lister execution date and will be used to schedule
first visits with high priority for listed origins.
Related to swh/devel/swh-scheduler#4687.
In order to simplify the testing of listers, allow to call the run command
of swh-lister CLI without scheduler configuration. In that case a temporary
scheduler instance with a postgresql backend is created and used.
It enables to easily test a lister with the following command:
$ swh -l DEBUG lister run <lister_name> url=<forge_url>
That means detected github urls {https,git,http}://github.com/${user_repo}(.git) are
canonicalized to https://github.com/${user_repo} format.
This avoids duplication of origins.
Related to T4232
This is no longer shared between the new debian loader and the lister.
The swh.storage.schemata module is still part of the swh.storage module though.
As this is still a dependency for the current swh.loader.debian production
loader. This will be cleaned up later.
Related D2135
Listers are declared as plugins via the `swh.workers` entry_point.
As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:
- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optionnal): hook (callable) used to initialize the lister's state
(typically, create/initialize the database for this lister).
If not set, the default implementation creates database tables (after
optionally having deleted exisintg ones) according to models declared in
the `models` register field.
There is no need for explicitely add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).
Also refactor a bit the cli tools:
- add support for the standard --config-file option at the 'lister' group
level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
initializing (especially with --drop-tables) the database for a single
lister is unreliable, since all tables are created using a sibgle MetaData
(in the same namespace).