35 commits

Author SHA1 Message Date
Antoine R. Dumont (@ardumont)
5b652b3070
lister.debian: Make debian init step idempotent and up-to-date 2019-12-19 13:58:11 +01:00
Antoine R. Dumont (@ardumont)
4a9608f31c
lister/tasks: Standardize return statements
This commit adapts the return statements of both the listers and their
associated tasks. This standardizes on what other modules (e.g. both dvcs and
package loaders) do.
2019-12-02 15:49:38 +01:00
Antoine R. Dumont (@ardumont)
d251201251
debian.models: Migrate tests from storage to debian lister model
Related bb5d405
2019-11-14 10:28:15 +01:00
Nicolas Dandrimont
b2e5ce32a9 Fix bogus NotImplementedError on Area.index_uris 2019-11-13 13:51:46 +01:00
Antoine R. Dumont (@ardumont)
ea7a08d05d
lister.debian: Actually use the db_engine passed to the hook function 2019-11-08 10:51:33 +01:00
Antoine R. Dumont (@ardumont)
e8a67a7650
swh.lister: Remove completely references to swh.storage.schemata
Related to 56d7cff
2019-11-06 15:46:04 +01:00
Antoine R. Dumont (@ardumont)
e0dbca759c
lister.debian: Move run method parameters to constructor 2019-11-05 17:44:45 +01:00
Antoine R. Dumont (@ardumont)
b745c5a735
lister.debian: Default to run a listing on debian distribution
That fixes the `swh lister run --lister debian` cli entrypoint.
2019-11-05 10:35:51 +01:00
Antoine R. Dumont (@ardumont)
a60e0bbc41
lister.debian: Fix task creation
By adding a `retries_left`
2019-11-05 10:35:51 +01:00
Antoine R. Dumont (@ardumont)
f872792407
debian.lister: Send origin url as load-debian task parameter
Instead of the old origin dict. That's what the debian loaders (old and new)
expect.
2019-11-05 10:35:51 +01:00
Antoine R. Dumont (@ardumont)
7c247c8a4a
debian/lister: Use url parameter name instead of origin
within the scheduled task.

Related D2135
2019-11-04 10:00:55 +01:00
Antoine R. Dumont (@ardumont)
56d7cff6e1
debian/model: Install lister model within the lister repository
This is no longer shared between the new debian loader and the lister.

The swh.storage.schemata module is still part of the swh.storage module though.
As this is still a dependency for the current swh.loader.debian production
loader. This will be cleaned up later.

Related D2135
2019-11-04 10:00:54 +01:00
Stefano Zacchiroli
6159faa2f5 mypy: add typing annotations for novel lister abstractions 2019-10-28 15:35:21 +01:00
Nicolas Dandrimont
78105940ff Stop binding tasks to a specific instance of the celery app
The celery.shared_task decorator allows late-binding of tasks to any celery app,
which is well suited for our "task plugin" architecture.
2019-10-18 18:02:25 +02:00
Antoine R. Dumont (@ardumont)
a64ae9641d
debian.lister: Add integration test which checks scheduled tasks
Related T2032
2019-10-15 12:21:24 +02:00
David Douard
b810876ef8 tasks: normalize the url argument name of most listers
Since all the listing tasks accept a url as first argument (whatever the
argument name is), it makes sense to use a simple common argument name for
this. I've chosen 'url' instead of api_baseurl/forge_url/url.

Also remove the now-useless `new_lister()` functions.
2019-09-04 15:38:01 +02:00
David Douard
8d9deeb8f8 plugins: add support for scheduler's task-type declaration
Add a new register-task-types cli that will create missing task-type entries in the
scheduler according to:

- only create missing task-types (do not update them), but check that the
  backend_name field is consistent,
- each SWHTask-based task declared in a module listed in the 'task_modules'
  plugin registry field will be checked and added if needed; tasks whose names
  start with an underscore will not be added,
- each added task-type will have:
  - the 'type' field is derived from the task's function name (with underscores
    replaced with dashes),
  - the description field is the first line of that function's docstring,
  - default values as provided by the swh.lister.cli.DEFAULT_TASK_TYPE (with
    a simple pattern matching to have decent default values for full/incremental
    tasks),
  - these default values can be overloaded via the 'task_type' plugin registry
    entry.

For this, we had to rename all task names (e.g. `cran_lister` -> `list_cran`).

Comes with some tests.
2019-09-04 15:36:08 +02:00
David Douard
e3c0ea9d90 implement listers as plugins
Listers are declared as plugins via the `swh.workers` entry_point.

As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:

- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state
  (typically, create/initialize the database for this lister).
  If not set, the default implementation creates database tables (after
  optionally having deleted existing ones) according to models declared in
  the `models` register field.

There is no need to explicitly add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).

Also refactor the cli tools a bit:
- add support for the standard --config-file option at the 'lister' group
  level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
  initializing (especially with --drop-tables) the database for a single
  lister is unreliable, since all tables are created using a single MetaData
  (in the same namespace).
2019-09-03 15:02:24 +02:00
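
The registry contract described in e3c0ea9d90 might look roughly like the sketch below. It is illustrative only: `ExampleLister`, `ExampleModel`, the module path `example_lister.tasks`, and the helpers `default_init` and `initialize` are hypothetical names, and the default initialization merely records what it would do instead of touching a real database through SQLAlchemy.

```python
from typing import Any, Dict, List, Tuple


class ExampleLister:
    """Placeholder for a lister class (hypothetical)."""


class ExampleModel:
    """Placeholder for a SQLAlchemy model used by the lister (hypothetical)."""


def register() -> Dict[str, Any]:
    """Registry function, as a `swh.workers` entry point would expose it."""
    return {
        "lister": ExampleLister,
        "models": [ExampleModel],
        "task_modules": ["example_lister.tasks"],
        # "init": some_callable,  # optional hook, omitted here
    }


def default_init(
    db_engine: Any, models: List[type], drop_tables: bool = False
) -> List[Tuple[str, List[str]]]:
    """Stand-in for the default behaviour: (optionally) drop, then create tables.

    A real implementation would act on `db_engine`; this one only returns
    the list of actions it would perform.
    """
    names = [model.__name__ for model in models]
    actions = []
    if drop_tables:
        actions.append(("drop", names))
    actions.append(("create", names))
    return actions


def initialize(plugin: Dict[str, Any], db_engine: Any, drop_tables: bool = False):
    """Use the plugin's `init` hook if it declares one, else the default."""
    init = plugin.get("init")
    if init is not None:
        return init(db_engine)
    return default_init(db_engine, plugin["models"], drop_tables)
```

A plugin that needs custom setup would add an `init` callable to the returned dict; every other lister falls back to table creation from its `models` list.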
Antoine R. Dumont (@ardumont)
b3463ecddc
Drop SWH prefix in classes everywhere
It's redundant with the swh module names themselves.
2019-06-20 19:08:46 +02:00
Antoine R. Dumont (@ardumont)
64a9bc691d
lister.core: Stop creating origins when scheduling tasks
Prior to this commit, listers also created origins in the archive. Now, we
only schedule new origins for ingestion.
2019-06-13 15:42:07 +02:00
Antoine R. Dumont (@ardumont)
b81621274b
lister: Unify credentials structure between listers
This becomes a dictionary keyed by <lister-name>, whose values are dicts keyed
by <instance-name>, whose values are lists of username/password dicts.

Related T1772
2019-05-29 14:00:11 +02:00
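
The unified credentials structure from b81621274b could be pictured as below. The lister names, instance names, and accounts are made-up sample data, and `credentials_for` is a hypothetical helper, not a function from the lister code base.

```python
from typing import Dict, List

# <lister-name> -> <instance-name> -> list of username/password dicts
Credentials = Dict[str, Dict[str, List[Dict[str, str]]]]

# Sample data only; none of these accounts or instances are real.
credentials: Credentials = {
    "gitlab": {
        "gitlab.example.org": [
            {"username": "alice", "password": "secret-1"},
            {"username": "bob", "password": "secret-2"},
        ],
    },
    "github": {
        "github": [{"username": "carol", "password": "secret-3"}],
    },
}


def credentials_for(
    creds: Credentials, lister_name: str, instance_name: str
) -> List[Dict[str, str]]:
    """Return the accounts for one lister instance, or [] if none are set."""
    return creds.get(lister_name, {}).get(instance_name, [])
```

Looking up an unknown lister or instance returns an empty list, so callers can treat "no credentials" uniformly.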
David Douard
f670de298f Remove debug logging from tasks' code
since this is now handled by the SWHTask itself.
2019-01-17 13:58:29 +01:00
David Douard
e6a4ae7619 flake8: remove unneeded imports 2019-01-15 18:17:20 +01:00
David Douard
f46f3e2015 Remove explicit setting of the task base class
since it's now the default base class in swh-scheduler (>= 0.0.39)
2019-01-10 09:55:17 +01:00
David Douard
33ec762bd4 Add tests for debian tasks 2019-01-08 10:35:33 +01:00
David Douard
0583b0e685 Add a 'ping' task for every lister. 2019-01-08 10:35:33 +01:00
David Douard
2d1f0643ff Heavy refactor of the task system
Get rid of the class based task definition in favor of decorator-based
task declarations.

Doing so, we can get rid of core/tasks.py

Task names are explicitly set to keep compatibility with task
definitions in the scheduler's database.

This also adds debug statements at the beginning and end of each lister
task.
2019-01-08 10:33:32 +01:00
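
The move from class-based to decorator-based task declarations in 2d1f0643ff can be illustrated with a small stand-in. The `task` decorator and `TASKS` registry below are a pure-Python substitute written for this example, not the Celery API; the function name `list_debian_distribution` and the explicit task name are assumptions (the latter modelled on the `DebianListerTask` class that appears earlier in this history).

```python
from typing import Callable, Dict

# Hypothetical registry standing in for the task framework's own.
TASKS: Dict[str, Callable] = {}


def task(name: str) -> Callable[[Callable], Callable]:
    """Register a plain function as a task under an explicit name."""

    def decorator(func: Callable) -> Callable:
        TASKS[name] = func
        return func

    return decorator


# The name is set explicitly so that entries already stored in a scheduler
# database under the old class-based name keep resolving to this function.
@task(name="swh.lister.debian.tasks.DebianListerTask")
def list_debian_distribution(distribution: str = "Debian") -> str:
    """Pretend to list a distribution; returns a status string."""
    return f"listed {distribution}"
```

Because the registered name is decoupled from the function name, the function can be renamed later without invalidating previously scheduled tasks.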
Antoine R. Dumont (@ardumont)
d88f1b60c9
core/lister: Make the tasks take an explicit lister_args argument
Avoid eating *all* arbitrary arguments and passing them along to the
new_lister method.
2018-07-17 15:48:48 +02:00
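
A rough before/after picture of d88f1b60c9, under stated assumptions: the signature of `new_lister`, its default values, and the `since` parameter are invented for the illustration, and the two task functions are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def new_lister(
    api_baseurl: str = "https://example.org", instance: str = "default"
) -> Dict[str, str]:
    """Hypothetical lister factory; returns its own configuration."""
    return {"api_baseurl": api_baseurl, "instance": instance}


def incremental_task_before(**kwargs: Any) -> Dict[str, str]:
    """Old style: every keyword argument is forwarded to new_lister."""
    return new_lister(**kwargs)


def incremental_task_after(
    lister_args: Optional[Dict[str, str]] = None, since: Optional[int] = None
) -> Tuple[Dict[str, str], Optional[int]]:
    """New style: only the explicit lister_args dict reaches new_lister."""
    lister = new_lister(**(lister_args or {}))
    return lister, since
```

In the old style, a task-level argument such as `since` leaks into the constructor call and fails there; with an explicit `lister_args`, task arguments and lister arguments stay separate.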
Antoine R. Dumont (@ardumont)
4c4aa0ead2
swh.lister: Make LISTER_NAME a class attribute
swh.lister.gitlab: make the 'instance' a constructor parameter
2018-07-11 17:43:41 +02:00
Nicolas Dandrimont
e477a46c60 Add missing __init__.py files
Helps with tests autodetection
2017-10-30 16:38:27 +01:00
Nicolas Dandrimont
6a7b0f802e swh.lister.debian.tasks: add task_queue attribute 2017-10-30 15:08:37 +01:00
Nicolas Dandrimont
458a9e6733 Add a new DebianListerTask 2017-10-30 14:19:43 +01:00
Nicolas Dandrimont
c6e455ce9b debian.lister: use get_packages() method on the snapshot to list packages
Allows listing the packages outside of the lister context (e.g. for loader tests)
2017-10-10 16:37:15 +02:00
Nicolas Dandrimont
d2a71ac980 debian.lister: make file size an integer rather than a string 2017-10-10 16:36:58 +02:00
Sushant Sushant
83ebb95705 Add a lister for Debian-like package archives
This work is based on Sushant's internship and D229.
2017-10-04 12:43:09 +02:00