495 commits

Author SHA1 Message Date
Antoine R. Dumont (@ardumont)
250160ad75
launchpad: Add missing copyright headers 2020-09-01 15:53:26 +02:00
Nicolas Dandrimont
211f4610df Move get_scheduler monkeypatching into an explicit pytest fixture
This allows us to actually run the lister instantiation code instead of relying
on the underlying structure of the lister object. In turn, this allows future
listers to use the scheduler right in their __init__.
2020-07-16 12:14:04 +02:00
Nicolas Dandrimont
c9963d4302 Use the new names for the swh.scheduler test fixtures 2020-07-09 17:06:50 +02:00
Léni Gauffier
1408517c08 Added GiteaLister
Summary: Lister implementation for Gitea (T2313). For now, because of https://github.com/go-gitea/gitea/issues/9165, it requires setting its `limit` parameter to 50.

Reviewers: #reviewers, ardumont

Reviewed By: #reviewers, ardumont

Subscribers: ardumont

Differential Revision: https://forge.softwareheritage.org/D3107
2020-06-10 17:04:28 +02:00
Léni Gauffier
58ef08b083 Added LaunchpadLister
Summary:
Related to T1734

From abandoned D2799

Reviewers: ardumont

Reviewed By: ardumont

Differential Revision: https://forge.softwareheritage.org/D2974
2020-04-12 01:00:12 +02:00
David Douard
93a4d8b784 Enable black
- blackify all the python files,
- enable black in pre-commit,
- add a black tox environment.
2020-04-08 16:31:22 +02:00
Gautier Pugnonblanc Yann
e5fea84c55 review corrections 2020-02-20 09:13:49 +01:00
Gautier Pugnonblanc Yann
60adc424be add type annotations in some lister files 2020-02-17 15:58:34 +01:00
Antoine R. Dumont (@ardumont)
73a33d9224
core.lister_base: Slightly improve docs and types 2020-01-20 10:42:58 +01:00
Antoine R. Dumont (@ardumont)
ed73cea771
github.lister: Filter out partial repositories which break listing
This commit fixes the mapping of repositories to the model, which broke when a
listed repository was either None or missing the id field [1]

[1] https://sentry.softwareheritage.org/share/issue/532d682182fc43d6a7a99400e3928811/
2020-01-20 10:25:57 +01:00
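The filtering described in ed73cea771 can be illustrated roughly as follows. This is a minimal sketch, not the lister's actual code: the function name `filter_partial_repositories` and the sample data are hypothetical, and it only assumes that each listed repository is either None or a dict that should carry an `id` field.

```python
from typing import Any, Dict, Iterable, List, Optional

Repo = Dict[str, Any]


def filter_partial_repositories(repos: Iterable[Optional[Repo]]) -> List[Repo]:
    """Keep only repositories that are present and carry an "id" field.

    Hypothetical helper: entries that are None or lack "id" are dropped
    so that they cannot break the mapping step that follows.
    """
    return [repo for repo in repos if repo is not None and "id" in repo]


# Made-up sample data mixing complete and partial entries.
listed = [{"id": 1, "name": "a"}, None, {"name": "no-id"}, {"id": 2, "name": "b"}]
kept = filter_partial_repositories(listed)
print([repo["id"] for repo in kept])
```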
Antoine Lambert
99fcd2b3f5 docs: Fix sphinx warnings
Related to T2188
2020-01-17 16:15:11 +01:00
Antoine R. Dumont (@ardumont)
4b383abc56
github.lister: Use Retry-After header when rate limit reached
Following GitHub's documentation [1]

[1] https://developer.github.com/v3/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits

Related to T2170
2020-01-17 10:37:53 +01:00
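As an illustration of the idea in 4b383abc56, the sketch below reads a `Retry-After` header and falls back to a default delay when the header is missing or unusable. It is not the lister's actual code: the function name `retry_delay` and the 60-second default are assumptions, and only the delay-in-seconds form of the header is handled (the HTTP-date form falls back to the default).

```python
from typing import Mapping

DEFAULT_DELAY = 60  # hypothetical fallback, in seconds


def retry_delay(headers: Mapping[str, str], default: int = DEFAULT_DELAY) -> int:
    """Return how many seconds to wait before retrying a rate-limited request."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        # Only the integer "seconds" form is parsed; negative values become 0.
        return max(0, int(value))
    except ValueError:
        # HTTP-date or malformed values: use the fallback instead of failing.
        return default


print(retry_delay({"Retry-After": "30"}))
print(retry_delay({}))
```

With the `requests` library, `response.headers` is a case-insensitive mapping and could be passed in directly; a plain dict, as in the example, must use the exact `Retry-After` spelling.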
Antoine R. Dumont (@ardumont)
4761773631
cran.lister: Use cran's canonical url for origin url
Prior to this commit, we sent the origin url as a versioned artifact.
Now we send CRAN's canonical url as the origin url, along with the associated
list of artifacts found there (only 1 today).
2020-01-16 13:48:56 +01:00
Antoine R. Dumont (@ardumont)
767c4c6dc7
cran.lister: Version uid so we can list new package versions 2020-01-15 18:22:01 +01:00
Antoine R. Dumont (@ardumont)
0560b813b2
cran.lister: Adapt docstring sample accordingly 2020-01-09 10:54:49 +01:00
Antoine R. Dumont (@ardumont)
e1069f0c59
cran.lister: Align loading tasks with loader's expectations 2020-01-09 10:18:51 +01:00
Antoine R. Dumont (@ardumont)
3f3f714c62
cran.lister: Move helper function to the bottom of the file 2020-01-06 16:55:23 +01:00
Antoine R. Dumont (@ardumont)
5b652b3070
lister.debian: Make debian init step idempotent and up-to-date 2019-12-19 13:58:11 +01:00
Antoine R. Dumont (@ardumont)
4b9f0e0553
lister_base: Split the tasks into chunks prior to creation
This results in smaller transactions which won't time out

Related to T2160
2019-12-19 10:49:10 +01:00
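The chunking approach mentioned in 4b9f0e0553 can be sketched as below. This is an illustration only, not the code from that commit: the helper name `split_into_chunks` and the chunk size are assumptions, and the tasks are stood in for by plain integers.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def split_into_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Each chunk could then be sent in its own, smaller transaction.
tasks = range(5)
for chunk in split_into_chunks(tasks, 2):
    print(chunk)
```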
Antoine R. Dumont (@ardumont)
5ab9d67d67
core: Align listers' task output (hg/git tasks) with expected format
Related to T2134
Related to D2409
Related to D2410
2019-12-09 15:12:17 +01:00
Antoine R. Dumont (@ardumont)
5d096d511c
npm: Align lister's loader output tasks with expected format
Related to T2134
2019-12-06 17:07:21 +01:00
Antoine R. Dumont (@ardumont)
4a9608f31c
lister/tasks: Standardize return statements
This commit adapts the return statements of both the listers and their
associated tasks, standardizing on what other modules (e.g. both dvcs and
package loaders) do.
2019-12-02 15:49:38 +01:00
Nicolas Dandrimont
ff7fdf24db Use a uniform User-Agent on all listers
This also adds tests to make sure that we properly send our version number to
upstreams.
2019-11-22 15:49:23 +01:00
Nicolas Dandrimont
62dc4dc257 Use pkg_resources to get the package version instead of vcversioner 2019-11-22 15:49:23 +01:00
Antoine R. Dumont (@ardumont)
af04129d79
lister.pypi: Align lister with pypi package loader 2019-11-21 18:43:45 +01:00
Antoine R. Dumont (@ardumont)
6534df4122
lister.npm: Align lister with npm package loader 2019-11-21 18:43:32 +01:00
Antoine R. Dumont (@ardumont)
cb853f4898
lister.tests: Avoid duplicating the setup step
And remove unnecessary fixture redefinition which causes indirection.
2019-11-21 14:24:01 +01:00
David Douard
3ddfd00e90 Fix typos (and trailing ws) reported by codespell 2019-11-21 14:11:18 +01:00
Antoine R. Dumont (@ardumont)
1757b0112b
cran/gnu: Rename task_type to load-archive-files 2019-11-21 13:44:20 +01:00
Antoine R. Dumont (@ardumont)
1cf7c8e86b
lister.tests: Add missing task_type for package listers
The scheduler module no longer initializes those task types itself.
2019-11-21 13:44:20 +01:00
Antoine R. Dumont (@ardumont)
484377cc13
lister.cli: Remove task type register cli
It's now defined in swh.scheduler
2019-11-18 10:41:46 +01:00
Antoine R. Dumont (@ardumont)
8d02458686
simple_lister: Flush to db more frequently 2019-11-15 11:48:09 +01:00
Antoine R. Dumont (@ardumont)
bf030c0f00
gnu.lister: Use url as primary key
Otherwise, we violate the uniqueness constraint.

Related to T2070
2019-11-15 11:24:21 +01:00
Antoine R. Dumont (@ardumont)
daa9a270fb
gnu.lister.tests: Add missing assertion 2019-11-15 10:49:13 +01:00
Antoine R. Dumont (@ardumont)
191043fff9
gnu.lister: Add missing retries_left parameter 2019-11-15 10:47:23 +01:00
Antoine R. Dumont (@ardumont)
d251201251
debian.models: Migrate tests from storage to debian lister model
Related to bb5d405
2019-11-14 10:28:15 +01:00
Nicolas Dandrimont
b2e5ce32a9 Fix bogus NotImplementedError on Area.index_uris 2019-11-13 13:51:46 +01:00
Nicolas Dandrimont
773cd337f1 indexing lister: Avoid generating empty or duplicate ranges when partitioning 2019-11-12 17:54:51 +01:00
Nicolas Dandrimont
2c5528ef59 indexing lister: Force bounds of integer ranges to be integers 2019-11-12 17:52:13 +01:00
Nicolas Dandrimont
1e7a905a13 Support zero-based indexes in indexing_lister 2019-11-12 17:50:32 +01:00
Nicolas Dandrimont
e23960edc6 Add tests for IndexingLister.db_partition_indices function 2019-11-12 17:48:43 +01:00
Valentin Lorentz
8550c5607f Register lister tasks even if they do not derive from SWHTask.
This happens if swh.scheduler.celery_backend.config isn't imported
before the CLI runs.
2019-11-12 16:06:23 +01:00
Antoine R. Dumont (@ardumont)
ea7a08d05d
lister.debian: Actually use the db_engine passed to the hook function 2019-11-08 10:51:33 +01:00
Antoine R. Dumont (@ardumont)
eebbc859fc
lister.cli: Clarify configuration loading step 2019-11-08 10:50:51 +01:00
Antoine R. Dumont (@ardumont)
e8a67a7650
swh.lister: Completely remove references to swh.storage.schemata
Related to 56d7cff
2019-11-06 15:46:04 +01:00
Antoine R. Dumont (@ardumont)
81a31f3c06
tests: Bump dependency on latest swh-core
This also renames the test dataset files to url-decoded filenames, as this is
what the latest pytest plugin requires.
2019-11-06 15:01:05 +01:00
Antoine R. Dumont (@ardumont)
e0dbca759c
lister.debian: Move run method parameters to constructor 2019-11-05 17:44:45 +01:00
Antoine R. Dumont (@ardumont)
b745c5a735
lister.debian: Default to run a listing on debian distribution
That fixes the `swh lister run --lister debian` cli entrypoint.
2019-11-05 10:35:51 +01:00
Antoine R. Dumont (@ardumont)
a60e0bbc41
lister.debian: Fix task creation
By adding a `retries_left` parameter
2019-11-05 10:35:51 +01:00
Antoine R. Dumont (@ardumont)
f872792407
debian.lister: Send origin url as load-debian task parameter
Instead of the old origin dict. That's what the debian loaders (old and new)
expect.
2019-11-05 10:35:51 +01:00