Antoine R. Dumont (@ardumont)
e3c856b5ee
utils.split_range: Split into non-overlapping ranges
...
Existing listers use the `is_within_bound` [1] method from the base lister.
This method uses inclusive boundaries in all cases.
Because some "range" task listers [2] [3] use the `split_range` function, which
creates "overlapping" ranges, this can cause concurrent insert issues down the line [4].
This commit adapts the `split_range` function so that the generated ranges no
longer overlap.
[1]
https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/core/lister_base.py$194-199
[2]
https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitlab/tasks.py$37-41
[3]
https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitea/tasks.py$36-41
Related to T2577
2020-09-10 11:01:44 +02:00
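To illustrate the change described in the commit above, here is a minimal sketch of a splitter producing inclusive, non-overlapping ranges. It is not the actual `split_range` implementation from swh-lister: the name `split_inclusive_ranges` and its parameters `total` and `size` are hypothetical, chosen only for this example.

```python
from typing import Iterator, Tuple


def split_inclusive_ranges(total: int, size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) ranges covering 0..total without overlap.

    Hypothetical helper, not the swh-lister function. Because both bounds
    are treated as inclusive by the consumer, each range must start one
    past the end of the previous range.
    """
    if total < 0 or size < 1:
        raise ValueError("total must be >= 0 and size must be >= 1")
    start = 0
    while start <= total:
        end = min(start + size - 1, total)
        yield start, end
        # The next range begins right after the inclusive end bound.
        start = end + 1


ranges = list(split_inclusive_ranges(10, 4))
```

With inclusive bounds, ranges such as (0, 4) and (4, 8) would both contain index 4, so two concurrent tasks could try to insert the same record. Starting each range at `end + 1` avoids that.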
Antoine R. Dumont (@ardumont)
725c1fe4ad
test_utils: Migrate to pytest
2020-09-09 18:48:07 +02:00
Antoine R. Dumont (@ardumont)
66a61f3dd2
gitea.tasks: Fix parameter name from 'sort' to 'order'
...
This fixes [1]
[1] https://sentry.softwareheritage.org/share/issue/b0119b56f24347bcb58ac28c68685c62/
2020-09-09 12:10:23 +02:00
Antoine R. Dumont (@ardumont)
7d44bc2e75
launchpad.tasks: Update copyright headers
2020-09-08 14:42:37 +02:00
Vincent SELLIER
74ca3d0b87
Launchpad: rename task name to match conventions
...
Related to T2358
2020-09-08 14:21:47 +02:00
Antoine R. Dumont (@ardumont)
5a5b7ef70b
tests: Separate lister instantiations
...
Prior to this commit, all listers were instantiated at the same time even if
only one was needed. This commit separates those instantiations.
The only drawback to this is the db model initialization which now happens at
each lister instantiation. This can be dealt with if needed at another time
though.
2020-09-02 12:49:00 +02:00
Antoine R. Dumont (@ardumont)
92422dcf75
pytest_plugin: Instantiate only lister with no particular setup
...
This should fix the remaining blocking problems in the jenkins build failure
report [1]
[1] https://jenkins.softwareheritage.org/view/Debian%20packages/job/debian/job/packages/job/DLS/job/gbp-buildpackage/78/consoleFull
2020-09-02 12:25:15 +02:00
Antoine R. Dumont (@ardumont)
9437a643ad
pytest: Define plugin and declare it in the root conftest
...
Then drop all unneeded and indirect imports
2020-09-02 12:25:15 +02:00
Antoine R. Dumont (@ardumont)
e99d3464e4
test_cli: Exclude launchpad lister from the check
...
This should fix the build [1]
[1] https://jenkins.softwareheritage.org/view/Debian%20packages/job/debian/job/packages/job/DLS/job/gbp-buildpackage/77/console
2020-09-01 15:55:24 +02:00
Antoine R. Dumont (@ardumont)
250160ad75
launchpad: Add missing copyright headers
2020-09-01 15:53:26 +02:00
Nicolas Dandrimont
211f4610df
Move get_scheduler monkeypatching into an explicit pytest fixture
...
This allows us to actually run the lister instantiation code instead of relying
on the underlying structure of the lister object. In turn, this allows future
listers to use the scheduler right in their __init__.
2020-07-16 12:14:04 +02:00
Nicolas Dandrimont
c9963d4302
Use the new names for the swh.scheduler test fixtures
2020-07-09 17:06:50 +02:00
Léni Gauffier
1408517c08
Added GiteaLister
...
Summary: Lister implementation for Gitea (T2313). For now, because of https://github.com/go-gitea/gitea/issues/9165, it would require setting its limit parameter to 50.
Reviewers: #reviewers, ardumont
Reviewed By: #reviewers, ardumont
Subscribers: ardumont
Differential Revision: https://forge.softwareheritage.org/D3107
2020-06-10 17:04:28 +02:00
Léni Gauffier
58ef08b083
Added LaunchpadLister
...
Summary:
Related to T1734
From abandoned D2799
Reviewers: ardumont
Reviewed By: ardumont
Differential Revision: https://forge.softwareheritage.org/D2974
2020-04-12 01:00:12 +02:00
David Douard
93a4d8b784
Enable black
...
- blackify all the python files,
- enable black in pre-commit,
- add a black tox environment.
2020-04-08 16:31:22 +02:00
Gautier Pugnonblanc Yann
e5fea84c55
review corrections
2020-02-20 09:13:49 +01:00
Gautier Pugnonblanc Yann
60adc424be
add type annotations in some lister files
2020-02-17 15:58:34 +01:00
Antoine R. Dumont (@ardumont)
73a33d9224
core.lister_base: Improve slightly docs and types
2020-01-20 10:42:58 +01:00
Antoine R. Dumont (@ardumont)
ed73cea771
github.lister: Filter out partial repositories which break listing
...
This commit fixes the repository mapping to model. It broke when the listed
repository was either None or missing the id field [1]
[1] https://sentry.softwareheritage.org/share/issue/532d682182fc43d6a7a99400e3928811/
2020-01-20 10:25:57 +01:00
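The filtering described in the commit above can be sketched as follows. This is an illustration only, not the code from the GitHub lister: the function name `keep_complete_repositories` and the sample records are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional


def keep_complete_repositories(
    repos: Iterable[Optional[Dict[str, Any]]]
) -> List[Dict[str, Any]]:
    """Drop entries that are None or that lack an "id" field.

    Hypothetical helper: such partial entries cannot be mapped to a model
    row, so they are skipped instead of breaking the whole listing.
    """
    return [repo for repo in repos if repo is not None and "id" in repo]


# Made-up sample data for illustration.
sample = [{"id": 1, "name": "a"}, None, {"name": "b"}]
filtered = keep_complete_repositories(sample)
```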
Antoine Lambert
99fcd2b3f5
docs: Fix sphinx warnings
...
Related to T2188
2020-01-17 16:15:11 +01:00
Antoine R. Dumont (@ardumont)
4b383abc56
github.lister: Use Retry-After header when rate limit reached
...
Following GitHub's documentation [1]
[1] https://developer.github.com/v3/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits
Related to T2170
2020-01-17 10:37:53 +01:00
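As a rough sketch of the idea in the commit above, the delay before retrying can be read from the `Retry-After` response header. The helper name `retry_delay` and the 60-second fallback are assumptions made for this example and do not come from the lister code. Only the "number of seconds" form of the header is handled here.

```python
from typing import Mapping


def retry_delay(headers: Mapping[str, str], default: float = 60.0) -> float:
    """Return how many seconds to wait before retrying a rate-limited request.

    Hypothetical helper. Falls back to `default` (an assumed value) when the
    header is absent or is not a plain number of seconds.
    """
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        # Never return a negative delay.
        return max(0.0, float(value))
    except ValueError:
        # The header may also be an HTTP date, which this sketch ignores.
        return default


delay = retry_delay({"Retry-After": "30"})
```

A caller would typically sleep for the returned delay and then reissue the request. Note that a plain `dict` is case-sensitive, whereas HTTP client libraries usually expose case-insensitive header mappings.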
Antoine R. Dumont (@ardumont)
4761773631
cran.lister: Use cran's canonical url for origin url
...
Prior to this commit, we sent a versioned artifact url as the origin url.
Now we send CRAN's canonical url as the origin url, along with the associated
list of artifacts found there (only 1 today).
2020-01-16 13:48:56 +01:00
Antoine R. Dumont (@ardumont)
767c4c6dc7
cran.lister: Version uid so we can list new package versions
2020-01-15 18:22:01 +01:00
Antoine R. Dumont (@ardumont)
0560b813b2
cran.lister: Adapt docstring sample accordingly
2020-01-09 10:54:49 +01:00
Antoine R. Dumont (@ardumont)
e1069f0c59
cran.lister: Align loading tasks with loader's expectations
2020-01-09 10:18:51 +01:00
Antoine R. Dumont (@ardumont)
3f3f714c62
cran.lister: Move helper function to the bottom of the file
2020-01-06 16:55:23 +01:00
Antoine R. Dumont (@ardumont)
5b652b3070
lister.debian: Make debian init step idempotent and up-to-date
2019-12-19 13:58:11 +01:00
Antoine R. Dumont (@ardumont)
4b9f0e0553
lister_base: Split into chunks the tasks prior to creation
...
This results in smaller transactions which won't time out
Related to T2160
2019-12-19 10:49:10 +01:00
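The chunking approach mentioned in the commit above can be sketched like this. The generator name `chunked` and the chunk size used below are illustrative assumptions, not taken from `lister_base`.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items.

    Hypothetical helper: each chunk can then be sent to the scheduler in its
    own, smaller transaction.
    """
    if size < 1:
        raise ValueError("size must be >= 1")
    chunk: List[T] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


batches = list(chunked(range(5), 2))
```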
Antoine R. Dumont (@ardumont)
5ab9d67d67
core: Align listers' task output (hg/git tasks) with expected format
...
Related to T2134
Related to D2409
Related to D2410
2019-12-09 15:12:17 +01:00
Antoine R. Dumont (@ardumont)
5d096d511c
npm: Align lister's loader output tasks with expected format
...
Related to T2134
2019-12-06 17:07:21 +01:00
Antoine R. Dumont (@ardumont)
4a9608f31c
lister/tasks: Standardize return statements
...
This commit adapts the return statements of both the listers and their
associated tasks. This standardizes on what other modules (e.g. both dvcs and
package loaders) do.
2019-12-02 15:49:38 +01:00
Nicolas Dandrimont
ff7fdf24db
Use a uniform User-Agent on all listers
...
This also adds tests to make sure that we properly send our version number to
upstreams.
2019-11-22 15:49:23 +01:00
Nicolas Dandrimont
62dc4dc257
Use pkg_resources to get the package version instead of vcversioner
2019-11-22 15:49:23 +01:00
Antoine R. Dumont (@ardumont)
af04129d79
lister.pypi: Align lister with pypi package loader
2019-11-21 18:43:45 +01:00
Antoine R. Dumont (@ardumont)
6534df4122
lister.npm: Align lister with npm package loader
2019-11-21 18:43:32 +01:00
Antoine R. Dumont (@ardumont)
cb853f4898
lister.tests: Avoid duplicating the setup step
...
And remove unnecessary fixture redefinition which causes indirection.
2019-11-21 14:24:01 +01:00
David Douard
3ddfd00e90
Fix typos (and trailing ws) reported by codespell
2019-11-21 14:11:18 +01:00
Antoine R. Dumont (@ardumont)
1757b0112b
cran/gnu: Rename task_type to load-archive-files
2019-11-21 13:44:20 +01:00
Antoine R. Dumont (@ardumont)
1cf7c8e86b
lister.tests: Add missing task_type for package listers
...
The scheduler module no longer initializes those task types itself.
2019-11-21 13:44:20 +01:00
Antoine R. Dumont (@ardumont)
484377cc13
lister.cli: Remove task type register cli
...
It's now defined in swh.scheduler
2019-11-18 10:41:46 +01:00
Antoine R. Dumont (@ardumont)
8d02458686
simple_lister: Flush to db more frequently
2019-11-15 11:48:09 +01:00
Antoine R. Dumont (@ardumont)
bf030c0f00
gnu.lister: Use url as primary key
...
Otherwise, we fail the uniqueness constraint.
Related to T2070
2019-11-15 11:24:21 +01:00
Antoine R. Dumont (@ardumont)
daa9a270fb
gnu.lister.tests: Add missing assertion
2019-11-15 10:49:13 +01:00
Antoine R. Dumont (@ardumont)
191043fff9
gnu.lister: Add missing retries_left parameter
2019-11-15 10:47:23 +01:00
Antoine R. Dumont (@ardumont)
d251201251
debian.models: Migrate tests from storage to debian lister model
...
Related to bb5d405
2019-11-14 10:28:15 +01:00
Nicolas Dandrimont
b2e5ce32a9
Fix bogus NotImplementedError on Area.index_uris
2019-11-13 13:51:46 +01:00
Nicolas Dandrimont
773cd337f1
indexing lister: Avoid generating empty or duplicate ranges when partitioning
2019-11-12 17:54:51 +01:00
Nicolas Dandrimont
2c5528ef59
indexing lister: Force bounds of integer ranges to be integers
2019-11-12 17:52:13 +01:00
Nicolas Dandrimont
1e7a905a13
Support zero-based indexes in indexing_lister
2019-11-12 17:50:32 +01:00
Nicolas Dandrimont
e23960edc6
Add tests for IndexingLister.db_partition_indices function
2019-11-12 17:48:43 +01:00