Commit graph

23 commits

Author SHA1 Message Date
Antoine R. Dumont (@ardumont)
0560b813b2
cran.lister: Adapt docstring sample accordingly 2020-01-09 10:54:49 +01:00
Antoine R. Dumont (@ardumont)
e1069f0c59
cran.lister: Align loading tasks with loader's expectations 2020-01-09 10:18:51 +01:00
Antoine R. Dumont (@ardumont)
3f3f714c62
cran.lister: Move helper function to the bottom of the file 2020-01-06 16:55:23 +01:00
Antoine R. Dumont (@ardumont)
4a9608f31c
lister/tasks: Standardize return statements
This commit adapts the return statements of both listers and their
associated tasks. This standardizes on what other modules (e.g. both dvcs and
package loaders) do.
2019-12-02 15:49:38 +01:00
Antoine R. Dumont (@ardumont)
cb853f4898
lister.tests: Avoid duplicating setup step
And remove unnecessary fixture redefinition which causes indirection.
2019-11-21 14:24:01 +01:00
David Douard
3ddfd00e90 Fix typos (and trailing ws) reported by codespell 2019-11-21 14:11:18 +01:00
Antoine R. Dumont (@ardumont)
1757b0112b
cran/gnu: Rename task_type to load-archive-files 2019-11-21 13:44:20 +01:00
Antoine R. Dumont (@ardumont)
1cf7c8e86b
lister.tests: Add missing task_type for package listers
The scheduler module no longer initializes those task types itself.
2019-11-21 13:44:20 +01:00
Stefano Zacchiroli
44bc4462e7 CRAN lister: fix compute_package_url interpolation 2019-10-28 15:50:23 +01:00
Stefano Zacchiroli
6159faa2f5 mypy: add typing annotations for novel lister abstractions 2019-10-28 15:35:21 +01:00
Stefano Zacchiroli
7dfd811e16 CRAN lister: make shelling out decoding compatible with Python 3.5 2019-10-28 15:35:21 +01:00
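
The diff for the commit above is not shown here, so the following is only a sketch of one common way to keep shelled-out decoding compatible with Python 3.5: the `encoding=` keyword of the `subprocess` functions only exists from Python 3.6 on, so the output is read as bytes and decoded explicitly. The helper name `run_and_decode` and the command being run are assumptions for illustration, not the lister's actual code.

```python
import subprocess
import sys


def run_and_decode(cmd):
    """Run `cmd` and return its stdout as text.

    Hypothetical helper: it avoids the `encoding=` keyword, which the
    subprocess functions only accept from Python 3.6 onwards.
    """
    raw = subprocess.check_output(cmd)  # bytes on every Python 3 version
    return raw.decode("utf-8")


# Stand-in command (an assumption); the real lister shells out to R.
output = run_and_decode([sys.executable, "-c", "print('ok')"])
```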
Stefano Zacchiroli
974f80f966 typing: minimal changes to make a no-op mypy run pass 2019-10-28 15:35:21 +01:00
Nicolas Dandrimont
78105940ff Stop binding tasks to a specific instance of the celery app
The celery.shared_task decorator allows late-binding of tasks to any celery app,
which is well suited for our "task plugin" architecture.
2019-10-18 18:02:25 +02:00
Antoine R. Dumont (@ardumont)
8d50e0d941
cran.lister: Fix cran lister and add proper integration test
Which checks the cran lister tasks written in the scheduler.

Related d30d574dbe
Related 5ea9d5ed39

Related T2032
2019-10-11 13:19:22 +02:00
Antoine R. Dumont (@ardumont)
04ca318680
simple_lister: Extract common behavior in base class 2019-10-09 17:35:12 +02:00
Antoine R. Dumont (@ardumont)
d30d574dbe
cran.lister: Refactor and fix cran lister
Prior to this commit, the code was actually duplicated with an old version
which would not work.

Related D1492#41287
2019-10-02 11:06:59 +02:00
David Douard
8d9deeb8f8 plugins: add support for scheduler's task-type declaration
Add a new register-task-types cli that will create missing task-type entries in the
scheduler according to:

- only create missing task-types (do not update them), but check that the
  backend_name field is consistent,
- each SWHTask-based task declared in a module listed in the 'task_modules'
  plugin registry field will be checked and added if needed; tasks whose names
  start with an underscore will not be added,
- added task-type will have:
  - the 'type' field is derived from the task's function name (with underscores
    replaced with dashes),
  - the description field is the first line of that function's docstring,
  - default values as provided by the swh.lister.cli.DEFAULT_TASK_TYPE (with
    a simple pattern matching to have decent default values for full/incremental
    tasks),
  - these default values can be overloaded via the 'task_type' plugin registry
    entry.

For this, we had to rename all task names (e.g. `cran_lister` -> `list_cran`).

Comes with some tests.
2019-09-04 15:36:08 +02:00
David Douard
e3c0ea9d90 implement listers as plugins
Listers are declared as plugins via the `swh.workers` entry_point.

As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:

- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state
  (typically, create/initialize the database for this lister).
  If not set, the default implementation creates database tables (after
  optionally having deleted existing ones) according to models declared in
  the `models` register field.

There is no need to explicitly add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).

Also refactor a bit the cli tools:
- add support for the standard --config-file option at the 'lister' group
  level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
  initializing (especially with --drop-tables) the database for a single
  lister is unreliable, since all tables are created using a single MetaData
  (in the same namespace).
2019-09-03 15:02:24 +02:00
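
A registry function of the shape described in the commit above might look like the sketch below. The class names, the module path in `task_modules`, the `init_lister` helper and the zero-argument `init` hook signature are all assumptions made for illustration; only the dict keys (`task_modules`, `lister`, `models`, `init`) come from the commit message.

```python
class ExampleModel:
    """Hypothetical stand-in for an SQLAlchemy model class."""


class ExampleLister:
    """Hypothetical stand-in for a lister class."""


def register():
    """Registry function, to be exposed via the `swh.workers` entry point.

    In setup.py this would be declared along the lines of (names assumed):
        entry_points={"swh.workers": ["lister.example=example:register"]}
    """
    return {
        "lister": ExampleLister,
        "models": [ExampleModel],
        "task_modules": ["example.lister.tasks"],
        # "init" is optional and deliberately left out here.
    }


def init_lister(entry, create_tables):
    """Call the plugin's `init` hook if set, else fall back to creating
    the tables for the declared models (assumed default behaviour)."""
    init = entry.get("init")
    if init is not None:
        init()
    else:
        create_tables(entry["models"])
```

Passing a simple callable such as `list.extend` as `create_tables` is enough to see which models the default path would create tables for.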
Antoine R. Dumont (@ardumont)
87d2a16df0
listers: Allow to override policy and priority for scheduled tasks
Prior to this commit, the policy and priority were hard-coded.
The default values are now the old hard-coded values.

This will allow developing a CLI to trigger forge listings with a oneshot
policy and some priority, thus ingesting them faster and without the manual
intervention we currently need.
2019-08-28 11:57:10 +02:00
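
The idea in the commit above, turning hard-coded values into overridable defaults, can be sketched as follows. The function signature, the dict layout, and the default values `"recurring"` and `None` are assumptions for illustration; the commit message does not state what the old hard-coded values were.

```python
def task_dict(origin_type, origin_url, policy="recurring", priority=None):
    """Build a scheduler task description (hypothetical layout).

    `policy` and `priority` default to assumed former hard-coded values,
    so existing callers keep their behaviour.
    """
    task = {
        "type": "load-" + origin_type,
        "policy": policy,
        "arguments": {"args": [origin_url], "kwargs": {}},
    }
    if priority is not None:
        task["priority"] = priority
    return task


# Default behaviour versus an explicit one-shot, prioritized listing.
default_task = task_dict("git", "https://example.org/repo.git")
oneshot_task = task_dict(
    "git", "https://example.org/repo.git", policy="oneshot", priority="high"
)
```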
Archit Agrawal
5ea9d5ed39 swh.lister.cran: Add description in task_dict
Add description in task_dict method because
the only metadata that can be found for a
package at CRAN is its description. That can
only be obtained from the built-in API in R,
which the lister is already using. Hence, instead of
getting the metadata in the loader, it is passed
by the lister.
2019-06-27 14:57:51 +05:30
Valentin Lorentz
52b1de87c5 Finish dropping the 'description' column.
I missed some in aef7d5952e.
2019-06-26 14:46:27 +02:00
Valentin Lorentz
aef7d5952e Remove columns 'description' and 'origin_id'.
They are useless.
2019-06-19 10:29:15 +02:00
Archit Agrawal
a9a37a85bf swh.lister.cran
Add a lister to list all the CRAN packages.
It uses the built-in API of the R language to list the packages
and get their metadata.

Closes T1709
2019-06-11 21:26:31 +05:30