6 commits

Author SHA1 Message Date
Antoine Lambert
6e7bc49ec7 Harmonize listers parameters and add test to check mandatory ones
Ensure that all lister classes have the same set of mandatory parameters
in their constructors, notably: scheduler, url, instance and credentials.

Add a new test checking that lister classes declare the mandatory parameters
in their constructors. The purpose is to avoid deployment issues in staging
or production environments, as Celery tasks can fail to execute if mandatory
parameters are not handled by listers.
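
A minimal sketch of how such a check could look, assuming constructor signatures are inspected with the standard `inspect` module. `GoodLister`, `BadLister`, and `missing_parameters` are hypothetical names for illustration, not the classes or test from the repository:

```python
import inspect

# Parameter names taken from the commit message above.
MANDATORY_PARAMETERS = {"scheduler", "url", "instance", "credentials"}


class GoodLister:
    """Hypothetical lister declaring every mandatory parameter."""

    def __init__(self, scheduler, url=None, instance=None, credentials=None):
        self.scheduler = scheduler


class BadLister:
    """Hypothetical lister that forgot 'instance' and 'credentials'."""

    def __init__(self, scheduler, url=None):
        self.scheduler = scheduler


def missing_parameters(lister_class):
    """Return the mandatory parameter names absent from the constructor."""
    declared = inspect.signature(lister_class.__init__).parameters
    return MANDATORY_PARAMETERS - set(declared)
```

A test could then iterate over all lister classes and assert that `missing_parameters` returns an empty set for each of them.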

Related to swh/infra/sysadm-environment#5030.
2023-09-06 11:55:34 +02:00
Nicolas Dandrimont
e785e67315 Hook up recently introduced options to all listers
Hopefully one day we'll be able to replace all of this mess with PEP 692
TypedDict kwargs, but that's only on track for Python 3.12.
2022-12-05 16:33:45 +01:00
Antoine Lambert
4f6b3f3f09 conda: Yield listed origins after all artifacts in a page are processed
swh-scheduler will deduplicate listed origins according to their URL
and visit type but not according to their extra loader arguments.

Previously, listed origins were yielded after each processed artifact
in a page so we could lose some package version info due to the
deduplication process.

So ensure listed origins are yielded only once all artifacts in a page
have been processed.
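
The idea can be sketched as follows, assuming a page is a list of `(origin_url, artifact)` pairs. That shape and the `origins_from_page` helper are assumptions made for illustration, not the lister's actual data model:

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Artifact = Dict[str, str]


def origins_from_page(
    page: List[Tuple[str, Artifact]]
) -> Iterator[Tuple[str, List[Artifact]]]:
    """Yield one (url, artifacts) pair per origin, after the whole page is read."""
    artifacts_by_url: Dict[str, List[Artifact]] = defaultdict(list)
    # First pass: collect every artifact of the page under its origin URL.
    for url, artifact in page:
        artifacts_by_url[url].append(artifact)
    # Only now yield, so each origin carries all versions seen in the page.
    for url, artifacts in artifacts_by_url.items():
        yield url, artifacts


page = [
    ("https://example.org/pkg-a", {"version": "1.0"}),
    ("https://example.org/pkg-b", {"version": "0.1"}),
    ("https://example.org/pkg-a", {"version": "2.0"}),
]
origins = list(origins_from_page(page))
```

Yielding inside the first loop instead would emit `pkg-a` twice with one version each, and deduplication on URL and visit type would then keep only one of those records.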
2022-10-25 10:49:52 +02:00
Franck Bret
6f40d2c1a5 Conda: switch artifacts from dict to list
'artifacts' extra_loader_arguments should be a list
2022-09-30 15:55:53 +02:00
Antoine Lambert
8d85b2e4e8 pattern: Ensure accurate origin counts returned by run method
Previously, the run method was returning the total count of ListedOrigin
objects sent to scheduler database.

However, some listers can send multiple ListedOrigin objects for a given
origin URL during the listing process, for instance when an origin is
contained in multiple pages (e.g. gogs listing) or when the listing
is gathering multiple versions of an origin spread across multiple
pages (e.g. maven listing).

This change ensures an accurate count of listed origins by maintaining
a set of origin URLs associated with the sent ListedOrigin objects.
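
A minimal sketch of the counting approach, assuming each page is reduced to a list of origin URLs. `count_origins` is a hypothetical helper, not the actual `run` method:

```python
from typing import Iterable, List, Set, Tuple


def count_origins(pages: Iterable[List[str]]) -> Tuple[int, int]:
    """Return (objects sent, distinct origin URLs) over all pages."""
    seen_urls: Set[str] = set()
    sent = 0
    for page in pages:
        for url in page:
            sent += 1           # every object is still sent to the scheduler
            seen_urls.add(url)  # but a URL is only counted once
    return sent, len(seen_urls)


# The same origin appears in two pages, as can happen with paginated listings.
pages = [["https://example.org/a", "https://example.org/b"],
         ["https://example.org/b", "https://example.org/c"]]
sent, unique = count_origins(pages)
```

Here four objects are sent but only three distinct origins are listed, and the second number is the one worth reporting.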
2022-09-29 11:14:08 +02:00
Franck Bret
8ff418fbc2 Conda: List origins for Anaconda, the package manager that provides tooling for datascience
Related T4547
2022-09-27 14:17:26 +02:00