Antoine Lambert
701d833cdf
Update scheduler task names to new ones
...
Related T1508
2019-05-21 13:27:29 +02:00
archit
fedfd73c8e
swh.lister.phabricator
...
Add a lister of all hosted repositories on a Phabricator instance
Closes T808
2019-05-15 19:54:33 +05:30
Antoine Lambert
4efb2ce62b
core.lister_base: Ensure deterministic _task_key return value
2019-05-15 15:37:19 +02:00
Antoine Lambert
977d2459c3
Remove references to the no longer used lister_db_url conf entry
2019-05-13 15:18:56 +02:00
David Douard
e5c3559033
tasks: fix handling of unsupported promise.save() calls
...
the exception can also be an AttributeError.
Also do not reraise this exception (in github/tasks.py). This promise
saving feature is used for tests.
2019-04-11 11:03:48 +02:00
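A minimal sketch of the guard described in the commit above, assuming a result object whose save() may be unsupported by the result backend. `FakePromise` and `save_promise_if_supported` are hypothetical names used only for illustration; this is not the code from github/tasks.py.

```python
import logging

logger = logging.getLogger(__name__)


class FakePromise:
    """Hypothetical stand-in for a result whose backend cannot save state."""

    def save(self):
        raise NotImplementedError("result backend does not support save()")


def save_promise_if_supported(promise):
    """Try to save the promise; report failure instead of re-raising."""
    try:
        promise.save()
    except (NotImplementedError, AttributeError):
        # Saving is only used by tests, so an unsupported backend is not fatal.
        logger.info("Unable to call save() on %r, skipping", promise)
        return False
    return True
```

Both failure modes named in the commit are swallowed: a backend that raises NotImplementedError, and an object with no save attribute at all.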
Antoine Lambert
0b8d1d464d
npm.lister: Update loading task name
...
Related T1508
2019-04-10 18:24:35 +02:00
Antoine Lambert
dac7777cd6
listers: Align config filename with production
2019-04-10 18:22:45 +02:00
Antoine Lambert
8ffd8dadef
cli: Fix initialization for all listers
...
Prior to this commit, initializing all listers was failing after the
debian lister processing because of global insert_minimum_data init
Related T1629
2019-04-10 11:33:55 +02:00
Archit Agrawal
be804be0fc
Removed unnecessary files
2019-03-20 00:45:30 +05:30
Archit Agrawal
d7ae2f1305
Updated README for listers
2019-03-19 23:11:44 +05:30
Nicolas Dandrimont
c574897e2a
Guesstimate partition boundaries from extrema rather than using expensive offsets
...
Summary:
Using order by and offset makes the partitioning an n^2 operation on the number
of entries in the table, rather than an instant operation when using
min/max.
This assumes the indexable column is more or less uniform, which is not exactly
true but not the worst approximation either.
Test Plan: tox
Reviewers: #reviewers, douardda
Reviewed By: #reviewers, douardda
Subscribers: douardda, swh-public-ci
Differential Revision: https://forge.softwareheritage.org/D1267
2019-03-19 13:51:38 +01:00
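The idea in the commit above can be sketched as follows, assuming integer indices and that the minimum and maximum of the indexable column were already fetched with a cheap min/max query. `partition_bounds` and its parameters are hypothetical names, not the lister's actual API.

```python
def partition_bounds(min_index, max_index, num_partitions):
    """Split [min_index, max_index] into evenly sized ranges.

    Assumes the indices are spread roughly uniformly between the extrema,
    so no per-partition offset query is needed.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    span = max_index - min_index
    # Keep the step strictly positive so range() accepts it.
    step = max(1, span // num_partitions)
    starts = list(range(min_index, max_index + 1, step))[:num_partitions]
    bounds = []
    for i, start in enumerate(starts):
        # The last range is open-ended (None) so no entry is left out.
        end = starts[i + 1] if i + 1 < len(starts) else None
        bounds.append((start, end))
    return bounds
```

With extrema 0 and 100 and 4 partitions, this yields ranges starting at 0, 25, 50 and 75, the last one being open-ended. If the column is skewed, the partitions will simply hold uneven numbers of entries.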
Antoine R. Dumont (@ardumont)
262c297a5e
lister.cli: Fix spelling typo
2019-02-14 10:47:28 +01:00
David Douard
1756e2efbf
Fix tests: config change lister_db_url -> lister had not been applied in tests
...
making 2 of them fail (in test_bb_lister.py and test_gh_lister.py).
2019-02-06 16:45:28 +01:00
David Douard
0720c8e12e
Kill (useless) --create-tables and --with-data cli command options
2019-02-06 15:38:11 +01:00
David Douard
662df8aea1
Change the lister_db_url config option into a 'lister' one
...
with conventional structure
{ 'cls': cls, 'args': {}}
2019-02-06 10:22:11 +01:00
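As an illustration of the { 'cls': cls, 'args': {}} convention mentioned in the commit above, here is a small sketch of how such an entry could be turned into an object. The `BACKENDS` registry, the `FakeListerDb` class, the 'local' key and the database URL are all assumptions made for the example, not values taken from the project.

```python
class FakeListerDb:
    """Hypothetical backend used only for this illustration."""

    def __init__(self, db):
        self.db = db


# Hypothetical mapping from a 'cls' name to the class to instantiate.
BACKENDS = {"local": FakeListerDb}


def get_lister_backend(config):
    """Instantiate the backend described by the 'lister' config entry."""
    entry = config["lister"]
    return BACKENDS[entry["cls"]](**entry["args"])


# Example configuration following the cls/args structure.
config = {"lister": {"cls": "local", "args": {"db": "postgresql:///lister"}}}
backend = get_lister_backend(config)
```

The benefit of this shape is that the same lookup code works for any backend: only the 'cls' name and its 'args' change in the configuration.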
David Douard
4f2580a97c
Make the --lister option of the cli tool a variadic argument
...
and add an 'all' possible value for it, so that one can initialize the
databases for all listers at once.
2019-02-06 10:19:26 +01:00
David Douard
c2c26d7e46
Fix the bitbucket lister; properly handle the date-like bounds
2019-02-01 15:38:11 +01:00
David Douard
94a35f12aa
Fix the SWHIndexingLister.db_partition_indices
...
the db query actually returns a table-like (2d) structure.
2019-02-01 15:38:11 +01:00
David Douard
1b2d6895a9
Use the named logger instead of the root logger in lister_base.py
2019-02-01 15:38:11 +01:00
David Douard
1d7d9b6128
Log errors when fetching an url in SWHListerHttpTransport
2019-02-01 15:38:11 +01:00
Antoine Lambert
68eb727d6a
npm.tasks: Fix NpmVisitModel parameters format
2019-01-30 16:32:26 +01:00
David Douard
f670de298f
Remove debug logging from tasks' code
...
since this is now handled by the SWHTask itself.
2019-01-17 13:58:29 +01:00
David Douard
e6a4ae7619
flake8: remove unneeded imports
2019-01-15 18:17:20 +01:00
David Douard
e31b61bee1
Do not crash range tasks if celery result backend does not support saving the group's state
2019-01-15 15:32:07 +01:00
David Douard
a1ec4437e6
Add a few debug statements
2019-01-15 15:32:07 +01:00
David Douard
065d3a64fc
Fix SWHIndexingLister.db_partition_indices(); ensure partition size is not zero
...
when this lister is called for the first time, db_num_entries() may return
a null value, so the min() will also be zero, making the range() call crash.
2019-01-15 15:32:06 +01:00
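The failure described in the commit above comes from range() rejecting a step of zero. A minimal sketch of the guard, assuming the entry count may be None or 0 on a first run; `partition_indices` and its parameters are hypothetical names, not the actual method signature.

```python
def partition_indices(num_entries, partition_size):
    """Return the start offset of each partition.

    num_entries may be None or 0 when the lister has never run.
    """
    total = num_entries or 0
    # Clamp to at least 1: range() raises ValueError when its step is 0.
    step = max(1, min(partition_size, total))
    return list(range(0, total, step))
```

On an empty table this returns an empty list instead of crashing, and on a populated one it behaves like the unguarded version.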
David Douard
028ceca90d
Fix bitbucket lister: request_uri expects 'identifier' to be a date
...
which it won't be when the lister is called for the first time (since
the run() method will be called with min_bound=None in this case).
2019-01-15 15:31:59 +01:00
David Douard
4fc1968f1f
Rename the bitbucket and github listers to remove the 'tld' part
...
so that we can easily manage its configuration (especially in the docker
environment) by referring to this lister as only 'bitbucket' everywhere
(ie. python package name and config file names).
2019-01-14 12:07:57 +01:00
David Douard
f46f3e2015
Remove explicit setting of the task base class
...
since it's now the default base class in swh-scheduler (>= 0.0.39)
2019-01-10 09:55:17 +01:00
David Douard
6030cb6315
Use pytest's conftest from swh-scheduler
2019-01-10 09:50:05 +01:00
David Douard
fb9265bb03
Generate the gitlab's instance name from the api_baseurl by default
...
using the host of the given url.
This allows creating a lister task by simply specifying the API base url
and prevents 'inconsistent by default' behavior, e.g. with:
swh-scheduler task add swh-lister-gitlab-full \
api_baseurl=https://0xacab.org/api/v4
the created task does not use 'gitlab' as instance name (but '0xacab.org'
here).
It's still possible to explicitly specify the instance name if needed.
2019-01-10 09:49:26 +01:00
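A minimal sketch of the default described in the commit above, using the standard library's URL parser. `instance_from_baseurl` is a hypothetical helper name, and taking the hostname (without port) is an assumption about what "the host of the given url" means.

```python
from urllib.parse import urlparse


def instance_from_baseurl(api_baseurl, instance=None):
    """Return the explicit instance name if given, else the host of the API URL."""
    if instance:
        return instance
    return urlparse(api_baseurl).hostname


# Default derived from the URL used in the commit message.
default_name = instance_from_baseurl("https://0xacab.org/api/v4")
# An explicit name still takes precedence.
explicit_name = instance_from_baseurl("https://0xacab.org/api/v4", instance="custom")
```

Here `default_name` is '0xacab.org' and `explicit_name` is 'custom', which matches the behavior the commit message describes.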
David Douard
7db281aa38
Fix gitlab task: pass the per_page lister argument to the lister constructor
2019-01-08 10:35:33 +01:00
David Douard
65f3b9edc8
Add tests for pypi tasks
2019-01-08 10:35:33 +01:00
David Douard
f63b8326c5
Add tests for npm tasks
2019-01-08 10:35:33 +01:00
David Douard
264e9ea574
Add tests for gitlab tasks
2019-01-08 10:35:33 +01:00
David Douard
33ec762bd4
Add tests for debian tasks
2019-01-08 10:35:33 +01:00
David Douard
f375df5892
Add tests for bitbucket tasks
2019-01-08 10:35:33 +01:00
David Douard
b7139619fd
Add tests for github tasks
...
in order to be able to run unit tests using celery pytest fixtures, we
use a dedicated swh_app fixture that ensures the "main" celery app
is the test app (otherwise subtasks won't work).
2019-01-08 10:35:33 +01:00
David Douard
dd104759ae
Small refactoring in indexing_lister
...
- use a generator instead of a while loop
- declare a local logger and use it instead of the root logger.
2019-01-08 10:35:33 +01:00
David Douard
0583b0e685
Add a 'ping' task for every lister.
2019-01-08 10:35:33 +01:00
David Douard
2d1f0643ff
Heavy refactor of the task system
...
Get rid of the class based task definition in favor of decorator-based
task declarations.
Doing so, we can get rid of core/tasks.py
Task names are explicitly set to keep compatibility with task
definitions in the scheduler's database.
This also adds debug statements at the beginning and end of each lister
task.
2019-01-08 10:33:32 +01:00
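To illustrate the move from class-based to decorator-based task declarations with explicit names, here is a standard-library-only sketch. The `TASKS` registry and `task` decorator merely mimic what a task queue's own decorator would do, and the task name shown is a made-up example, not one from the scheduler's database.

```python
TASKS = {}  # hypothetical registry standing in for the task queue's own


def task(name):
    """Register a plain function as a task under an explicit name."""

    def decorator(func):
        TASKS[name] = func
        return func

    return decorator


@task(name="example.lister.tasks.IncrementalLister")
def incremental_lister(**lister_args):
    # A real task would instantiate a lister and run it; this one only
    # echoes its arguments so the sketch stays self-contained.
    return {"status": "done", "args": lister_args}
```

Because the name is passed explicitly, the function can be renamed or moved without changing the name under which previously scheduled tasks look it up.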
David Douard
94c1eaf402
Fix: TaskType has been removed from scheduler 0.0.38
2018-12-20 15:06:19 +01:00
David Douard
72658ff720
cli: fix debian lister so it also uses config overrides
2018-12-20 15:01:18 +01:00
David Douard
5ff8093c5d
Simplify listers Model constructor
...
the default implementation of SQLAlchemy's declarative API should
work just fine.
2018-12-12 18:27:11 +01:00
David Douard
532fe90df5
Kill the XMLRPC transport
...
it looks unused.
2018-12-12 18:25:25 +01:00
Antoine Lambert
ffe4ac9a3c
swh.lister.npm: Add an incremental npm lister
...
This new lister makes it possible to get only the npm packages added
or updated since the last listing operation.
Related T1378
Closes T1398
2018-12-03 17:58:27 +01:00
Antoine Lambert
605a67f51d
swh.lister.npm: Add a lister of all available packages in the npm registry
...
Related T1378
Closes T1380
2018-11-26 11:04:13 +01:00
David Douard
849b909a52
Fix rst syntax in docstrings
2018-11-09 11:20:31 +01:00
David Douard
cd20f4d223
Prevent deprecation warnings (logging.warn -> logging.warning)
2018-10-29 10:11:00 +01:00
David Douard
53840ac44b
Prevent a flake8 error
2018-10-25 17:42:59 +02:00