46 commits

Author SHA1 Message Date
David Douard
93a4d8b784 Enable black
- blackify all the python files,
- enable black in pre-commit,
- add a black tox environment.
2020-04-08 16:31:22 +02:00
Gautier Pugnonblanc Yann
e5fea84c55 review corrections 2020-02-20 09:13:49 +01:00
Gautier Pugnonblanc Yann
60adc424be add type annotations in some lister files 2020-02-17 15:58:34 +01:00
Antoine R. Dumont (@ardumont)
5ab9d67d67
core: Align listers' task output (hg/git tasks) with expected format
Related to T2134
Related to D2409
Related to D2410
2019-12-09 15:12:17 +01:00
Antoine R. Dumont (@ardumont)
4a9608f31c
lister/tasks: Standardize return statements
The following commit adapts the return statements from both lister and their
associated tasks. This standardizes on what other modules (e.g. both dvcs and
package loaders) do.
2019-12-02 15:49:38 +01:00
Antoine R. Dumont (@ardumont)
81a31f3c06
tests: Bump dependency on latest swh-core
This also renames the files of the test dataset to use url-decoded filenames,
as this is what the latest pytest plugin requires.
2019-11-06 15:01:05 +01:00
Stefano Zacchiroli
974f80f966 typing: minimal changes to make a no-op mypy run pass 2019-10-28 15:35:21 +01:00
Nicolas Dandrimont
78105940ff Stop binding tasks to a specific instance of the celery app
The celery.shared_task decorator allows late-binding of tasks to any celery app,
which is well suited for our "task plugin" architecture.
2019-10-18 18:02:25 +02:00
Antoine R. Dumont (@ardumont)
a8cde12d72
tests: Update pytest_plugin according to latest version change 2019-10-14 18:20:15 +02:00
Antoine R. Dumont (@ardumont)
f92ac83646
bitbucket.lister: Add integration test which checks scheduled tasks
Related T2032
2019-10-12 03:39:47 +02:00
Antoine Lambert
7572228f7c listers: Ensure run can be called without bounds arguments
Closes T2001
2019-09-17 15:09:04 +02:00
David Douard
b810876ef8 tasks: normalize the url argument name of most listers
Since all the listing tasks accept a url as first argument (whatever the
argument name is), it makes sense to use a simple common argument name for
this. I've chosen 'url' instead of api_baseurl/forge_url/url.

Also remove the now-useless `new_lister()` functions.
2019-09-04 15:38:01 +02:00
David Douard
8d9deeb8f8 plugins: add support for scheduler's task-type declaration
Add a new register-task-types cli that will create missing task-type entries in the
scheduler according to:

- only create missing task-types (do not update them), but check that the
  backend_name field is consistent,
- each SWHTask-based task declared in a module listed in the 'task_modules'
  plugin registry field will be checked and added if needed; tasks whose name
  starts with an underscore will not be added,
- added task-type will have:
  - the 'type' field is derived from the task's function name (with underscores
    replaced with dashes),
  - the description field is the first line of that function's docstring,
  - default values as provided by the swh.lister.cli.DEFAULT_TASK_TYPE (with
    a simple pattern matching to have decent default values for full/incremental
    tasks),
  - these default values can be overloaded via the 'task_type' plugin registry
    entry.

For this, we had to rename all task names (e.g. `cran_lister` -> `list_cran`).

Comes with some tests.
2019-09-04 15:36:08 +02:00
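
The derivation rules listed in the commit above (type from the function name, description from the first docstring line, underscore-prefixed names skipped) can be sketched as follows. This is only an illustration under stated assumptions: `derive_task_type` and the two sample functions are hypothetical names, plain functions stand in for SWHTask-based tasks, and the default values taken from `swh.lister.cli.DEFAULT_TASK_TYPE` are deliberately left out.

```python
def derive_task_type(func):
    """Build a minimal task-type entry for ``func`` (hypothetical helper).

    Returns None for functions whose name starts with an underscore,
    mirroring the rule that such tasks are not added.
    """
    name = func.__name__
    if name.startswith("_"):
        return None
    doc = (func.__doc__ or "").strip()
    # The description is the first line of the docstring, if any.
    description = doc.splitlines()[0] if doc else ""
    # The type is the function name with underscores replaced by dashes.
    return {"type": name.replace("_", "-"), "description": description}


def list_cran():
    """List the CRAN package index.

    Longer explanation that does not end up in the description.
    """


def _helper():
    """Private helper, never registered."""


entry = derive_task_type(list_cran)
skipped = derive_task_type(_helper)
```

Here `entry` holds the type `list-cran` with the first docstring line as description, while `skipped` is `None`.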
David Douard
e3c0ea9d90 implement listers as plugins
Listers are declared as plugins via the `swh.workers` entry_point.

As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:

- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state
  (typically, create/initialize the database for this lister).
  If not set, the default implementation creates database tables (after
  optionally having deleted existing ones) according to models declared in
  the `models` register field.

There is no need to explicitly add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).

Also refactor the cli tools a bit:
- add support for the standard --config-file option at the 'lister' group
  level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
  initializing (especially with --drop-tables) the database for a single
  lister is unreliable, since all tables are created using a single MetaData
  (in the same namespace).
2019-09-03 15:02:24 +02:00
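
A minimal sketch of the registry dict described in the commit above, assuming a hypothetical lister package: `ExampleLister`, `ExampleModel`, the module path `example_lister.tasks` and the `check_registry_entry` helper are all illustrative names, not part of the actual code base, and the model class is a plain stand-in rather than a real SQLAlchemy model.

```python
class ExampleModel:
    """Stand-in for a SQLAlchemy model used by the lister (hypothetical)."""

    __tablename__ = "example_repo"


class ExampleLister:
    """Stand-in for a lister class (hypothetical)."""

    LISTER_NAME = "example"


def register():
    """Registry function, as it would be exposed via the `swh.workers` entry point."""
    return {
        "task_modules": ["example_lister.tasks"],  # hypothetical module path
        "lister": ExampleLister,
        "models": [ExampleModel],
        # "init" is omitted on purpose: it is optional, and the commit states
        # that a default implementation creates the tables from `models`.
    }


REQUIRED_FIELDS = ("task_modules", "lister", "models")


def check_registry_entry(entry):
    """Validate the shape of a registry dict (illustrative helper only)."""
    missing = [key for key in REQUIRED_FIELDS if key not in entry]
    if missing:
        raise ValueError(f"missing registry fields: {missing}")
    init = entry.get("init")
    if init is not None and not callable(init):
        raise TypeError("'init' must be a callable when provided")
    return entry


entry = check_registry_entry(register())
```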
David Douard
8950b0b32d bitbucket: make BitBucketLister's api_baseurl init argument optional 2019-09-02 12:29:38 +02:00
David Douard
e0ce68377d bitbucket: simplify BitBucketLister's constructor a bit
get rid of the "smart" flush_packet_db computation.
2019-08-30 17:56:19 +02:00
Stefano Zacchiroli
bb2dc77788 bitbucket lister: fix typo in docstring 2019-07-04 14:40:02 +02:00
Antoine R. Dumont (@ardumont)
3d473c307c
lister: Type correctly the 'indexable' column
instead of converting that column to a string

As a side effect on the bitbucket side: we were improperly passing the `after`
query parameter as a date that was not url-encoded. This resulted in improper
api responses from bitbucket (from time to time, we received the same next
index as the current one).

Related T1826
2019-06-26 10:58:54 +02:00
Antoine R. Dumont (@ardumont)
b99617f976
relister: Fix consistently the behavior for the first time relisting
If nothing has been done prior to a full relisting, there is actually nothing
to list. So the relister in question does nothing.

In that context, the IndexingLister class's `db_partition_indices` method now
returns an empty list instead of raising a ValueError when there is nothing to
list.

Related T1826
Related e129e48
2019-06-25 14:48:17 +02:00
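
The behavior change in the commit above (an empty list instead of a ValueError when nothing has been listed yet) can be illustrated with a simplified stand-in. The function name `partition_indices`, its signature and the partitioning scheme are assumptions made for this sketch; the real `db_partition_indices` method works against the lister's database.

```python
def partition_indices(indices, partition_size):
    """Split known indices into (start, end) bounds (simplified stand-in).

    With no known indices there is nothing to relist, so an empty list is
    returned rather than raising ValueError.
    """
    if not indices:
        return []
    ordered = sorted(indices)
    # Every `partition_size`-th index opens a new partition.
    starts = ordered[::partition_size]
    bounds = list(zip(starts, starts[1:]))
    # The last partition runs up to the highest known index.
    bounds.append((starts[-1], ordered[-1]))
    return bounds


first_run = partition_indices([], 2)
later_run = partition_indices([1, 2, 3, 4, 5], 2)
```

A caller can then iterate over the result unconditionally: on a first run the loop body simply never executes.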
Antoine R. Dumont (@ardumont)
6662ae8db5
indexing_lister: Allow to define flush packet size
Prior to this commit, indexing lister instances were flushing every packet of
20. This can now be defined per subclass.
2019-06-25 14:48:16 +02:00
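
One way to picture a per-subclass flush size, as a sketch only: the attribute name `flush_packet_db` is borrowed from another commit message in this log, but treating it as a class attribute, as well as the `IndexingListerSketch` and `SmallBatchLister` classes and their methods, are assumptions made for illustration.

```python
class IndexingListerSketch:
    """Toy lister that buffers items and flushes them in packets."""

    flush_packet_db = 20  # default packet size, overridable per subclass

    def __init__(self):
        self.pending = []
        self.flushed = []

    def record(self, item):
        """Buffer one item and flush once the packet size is reached."""
        self.pending.append(item)
        if len(self.pending) >= self.flush_packet_db:
            self.flush()

    def flush(self):
        """Move buffered items to the 'database' (here, a list of packets)."""
        if self.pending:
            self.flushed.append(list(self.pending))
            self.pending.clear()


class SmallBatchLister(IndexingListerSketch):
    flush_packet_db = 5  # this subclass flushes more often


lister = SmallBatchLister()
for repo_id in range(12):
    lister.record(repo_id)
```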
Antoine R. Dumont (@ardumont)
45428c25df
bitbucket: Unify logging instructions 2019-06-25 14:09:59 +02:00
Antoine R. Dumont (@ardumont)
9aa8a6f7ae
bitbucket: Allow to specify the number of repos per api request
This is independent but still, it somehow fixes the issue occurring on T1826.

Related T1826
2019-06-21 17:50:23 +02:00
Antoine R. Dumont (@ardumont)
e129e48c31
bitbucket: Fix full lister with fallback [start, end] if not provided
Related T1826
2019-06-21 15:46:51 +02:00
Antoine R. Dumont (@ardumont)
b3463ecddc
Drop SWH prefix in classes everywhere
It's redundant with the swh modules in itself.
2019-06-20 19:08:46 +02:00
Valentin Lorentz
aef7d5952e Remove columns 'description' and 'origin_id'.
They are useless.
2019-06-19 10:29:15 +02:00
Antoine R. Dumont (@ardumont)
fc92c79b7e
models: Unify tablenames using singular as main archive's convention
Related P434
2019-06-18 07:18:34 +02:00
Antoine R. Dumont (@ardumont)
b81621274b
lister: Unify credentials structure between listers
This becomes a dictionary keyed by <lister-name>, whose values are dicts keyed
by <instance-name>, each mapping to a list of username/password dicts.

Related T1772
2019-05-29 14:00:11 +02:00
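
The nested credentials structure described in the commit above could look like the following. All lister names, instance names and credential values are made up for illustration, and `credentials_for` is a hypothetical lookup helper, not a function from the code base.

```python
# <lister-name> -> <instance-name> -> list of username/password dicts
credentials = {
    "bitbucket": {
        "bitbucket": [
            {"username": "alice", "password": "secret-1"},
            {"username": "bob", "password": "secret-2"},
        ],
    },
    "gitlab": {
        "example-instance": [
            {"username": "carol", "password": "secret-3"},
        ],
    },
}


def credentials_for(creds, lister_name, instance_name):
    """Return the credential list for one lister instance, or an empty list."""
    return creds.get(lister_name, {}).get(instance_name, [])


bitbucket_creds = credentials_for(credentials, "bitbucket", "bitbucket")
unknown_creds = credentials_for(credentials, "github", "github")
```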
David Douard
e5c3559033 tasks: fix handling of unsupported promise.save() calls
the exception can also be an AttributeError.

Also do not reraise this exception (in github/tasks.py). This promise
saving feature is used for tests.
2019-04-11 11:03:48 +02:00
David Douard
c2c26d7e46 Fix the bitbucket lister; handle properly the date-like bounds 2019-02-01 15:38:11 +01:00
David Douard
f670de298f Remove debug logging from tasks' code
since this is now handled by the SWHTask itself.
2019-01-17 13:58:29 +01:00
David Douard
e31b61bee1 Do not crash range tasks if celery result backend does not support saving the group's state 2019-01-15 15:32:07 +01:00
David Douard
028ceca90d Fix bitbucket lister: request_uri expects 'identifier' to be a date
which it won't be when the lister is called for the first time (since
the run() method will be called with min_bound=None in this case).
2019-01-15 15:31:59 +01:00
David Douard
4fc1968f1f Rename the bitbucket and github listers to remove the 'tld' part
so that we can easily manage its configuration (especially in the docker
environment) by referring to this lister as only 'bitbucket' everywhere
(ie. python package name and config file names).
2019-01-14 12:07:57 +01:00
David Douard
f46f3e2015 Remove explicit setting of the task base class
since it's now the default base class in swh-scheduler (>= 0.0.39)
2019-01-10 09:55:17 +01:00
David Douard
f375df5892 Add tests for bitbucket tasks 2019-01-08 10:35:33 +01:00
David Douard
0583b0e685 Add a 'ping' task for every lister. 2019-01-08 10:35:33 +01:00
David Douard
2d1f0643ff Heavy refactor of the task system
Get rid of the class based task definition in favor of decorator-based
task declarations.

Doing so, we can get rid of core/tasks.py

Task names are explicitly set to keep compatibility with task
definitions in schedulers' database.

This also adds debug statements at the beginning and end of each lister
task.
2019-01-08 10:33:32 +01:00
Antoine R. Dumont (@ardumont)
d88f1b60c9
core/lister: Make the tasks take an explicit lister_args argument
Avoid eating *all* arbitrary arguments and passing them along to the
new_lister method.
2018-07-17 15:48:48 +02:00
Antoine R. Dumont (@ardumont)
4db15aaf16
swh.lister.gitlab: Remove indexable column from gitlab lister 2018-07-12 13:41:47 +02:00
Antoine R. Dumont (@ardumont)
d640fdcc96
swh.lister.gitlab.tests: Separate properly tests per lister 2018-07-12 12:23:46 +02:00
Antoine R. Dumont (@ardumont)
4c4aa0ead2
swh.lister: Make LISTER_NAME a class attribute
swh.lister.gitlab: make the 'instance' a constructor parameter
2018-07-11 17:43:41 +02:00
Antoine R. Dumont (@ardumont)
7954e03627
swh.lister: Document swh.lister.tasks's intent
And remove unneeded indexing name from the RangeListerTask
2018-07-11 15:56:32 +02:00
Antoine R. Dumont (@ardumont)
ba146376d6
swh.lister: Add tests around the gitlab lister
Related T989
2018-07-11 15:56:32 +02:00
Antoine R. Dumont (@ardumont)
f4fe1b058b
swh.lister.*: Formatting 2018-07-03 12:17:46 +02:00
Nicolas Dandrimont
e477a46c60 Add missing __init__.py files
Helps with tests autodetection
2017-10-30 16:38:27 +01:00
Avi Kelman (fiendish)
68d77fd43f Refactor lister code
Streamline production of new listers by aggressively moving core
functionality into progressively inherited (A->B->C) base classes
with the transport layer abstracted.
This should make common individual forge listers straightforward to
produce with minimal customization. Github and Bitbucket listers
can be used as examples of the indexing type.
2017-03-06 12:35:49 +01:00