Commit graph

30 commits

Author SHA1 Message Date
Pierre-Yves David
08fda328be Migration to psycopg3 2025-03-21 17:05:07 +01:00
Antoine Lambert
fc98bc1035 cli: Replace scheduler temporary backend by memory one
Scheduler temporary backend has been removed in favor of a
more efficient memory backend.
2024-12-11 12:04:45 +01:00
Antoine Lambert
7609ebf7e1 pattern: Store termination date to scheduler database at end of listing
This makes it possible to track the last lister execution date, which will be
used to schedule first visits with high priority for listed origins.

Related to swh/devel/swh-scheduler#4687.
2024-10-14 15:03:28 +02:00
Nicolas Dandrimont
f7abfafffe GitHub: record whether the origin is a fork
For now this information is not used downstream, but it can be useful
for specific analysis or one-shot scheduling.
2024-07-18 10:45:06 +02:00
Antoine Lambert
aaae1a6b0b launchpad, npm: Port code to updated swh-scheduler API
The oldest part of the scheduler API was updated to use model classes
(based on the attr package) instead of dictionaries in order to improve
typing.
2024-05-22 17:44:00 +02:00
Nicolas Dandrimont
4bcf4a4147 swh-core's github extra isn't needed anymore 2023-11-14 19:25:13 +01:00
Antoine Lambert
7092e4e4ac cli: Use temporary scheduler as fallback when no configuration detected
In order to simplify the testing of listers, allow calling the run command
of the swh-lister CLI without a scheduler configuration. In that case a temporary
scheduler instance with a postgresql backend is created and used.

This makes it easy to test a lister with the following command:

$ swh -l DEBUG lister run <lister_name> url=<forge_url>
2023-11-07 19:00:53 +01:00
Antoine Lambert
4f57e84450 Use http_retry decorator from swh.core.retry module
The http_retry decorator has been moved to the swh-core package in order
to ease its reuse across swh packages.
2023-04-13 14:19:57 +02:00
Valentin Lorentz
a681f2f405 packagist: Canonicalize github origins
In particular, there seems to be a negligible number of origins
using SSH instead of HTTPS, which the git loader cannot deal with.
2022-10-13 17:14:58 +02:00
Antoine R. Dumont (@ardumont)
fbfdf88ea4
nixguix: Add lister
Related to T3781
2022-10-03 18:26:36 +02:00
Antoine R. Dumont (@ardumont)
263db667d0
Adapt maven lister to list canonical gh urls if any
That means detected github urls {https,git,http}://github.com/${user_repo}(.git) are
canonicalized to https://github.com/${user_repo} format.

This avoids duplication of origins.

Related to T4232
2022-05-23 14:47:11 +02:00
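
The canonicalization described in commit 263db667d0 can be sketched roughly as follows. This is only an illustration of the URL rewrite stated in the commit message, not the code from the maven lister: the helper name `canonicalize_github_url` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical pattern (not taken from swh-lister): matches
# {https,git,http}://github.com/<user>/<repo> with an optional ".git" suffix
# and an optional trailing slash.
_GITHUB_URL = re.compile(
    r"^(?:https|http|git)://github\.com/(?P<user_repo>[^/]+/[^/]+?)(?:\.git)?/?$"
)


def canonicalize_github_url(url: str) -> str:
    """Return the https://github.com/<user>/<repo> form of a GitHub URL.

    URLs that do not match the pattern are returned unchanged.
    """
    match = _GITHUB_URL.match(url)
    if match is None:
        return url
    return f"https://github.com/{match.group('user_repo')}"
```

Mapping every variant to a single form means that two URLs pointing at the same repository yield one origin instead of two, which is the duplication the commit message mentions.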
Antoine R. Dumont (@ardumont)
2ffe9c2aea
Use swh.core.github.pytest_plugin in github tests
Related to T4232
2022-05-20 16:06:11 +02:00
Antoine R. Dumont (@ardumont)
d2f4781669
requirements: Rework dependencies
Without these changes and the new swh.scheduler bump, some dependencies were
no longer resolved properly.

Related to T2746
2020-11-23 15:39:51 +01:00
Antoine R. Dumont (@ardumont)
d7d38090f5
lister.config: Adapt scheduler configuration structure 2020-10-19 09:42:16 +02:00
David Douard
d10f78d80c Adapt cli declaration entrypoint to swh.core 0.3 2020-09-23 17:42:00 +02:00
Antoine R. Dumont (@ardumont)
81a31f3c06
tests: Bump dependency on latest swh-core
This also renames the test dataset files to use URL-decoded filenames, as this
is what the latest pytest plugin requires.
2019-11-06 15:01:05 +01:00
Antoine R. Dumont (@ardumont)
56d7cff6e1
debian/model: Install lister model within the lister repository
This is no longer shared between the new debian loader and the lister.

The swh.storage.schemata module is still part of the swh.storage module though,
as it is still a dependency of the current swh.loader.debian production
loader. This will be cleaned up later.

Related to D2135
2019-11-04 10:00:54 +01:00
Antoine R. Dumont (@ardumont)
a8cde12d72
tests: Update pytest_plugin according to latest version change 2019-10-14 18:20:15 +02:00
Antoine R. Dumont (@ardumont)
394658e53b
cgit.tests: Check the tasks from the scheduler 2019-10-09 17:57:57 +02:00
David Douard
e3c0ea9d90 implement listers as plugins
Listers are declared as plugins via the `swh.workers` entry_point.

As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:

- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state
  (typically, create/initialize the database for this lister).
  If not set, the default implementation creates database tables (after
  optionally having deleted existing ones) according to models declared in
  the `models` register field.

There is no need to explicitly add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).

Also refactor the cli tools a bit:
- add support for the standard --config-file option at the 'lister' group
  level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
  initializing (especially with --drop-tables) the database for a single
  lister is unreliable, since all tables are created using a single MetaData
  (in the same namespace).
2019-09-03 15:02:24 +02:00
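
As a minimal sketch of the registry function described in commit e3c0ea9d90, a lister plugin might return something like the dict below. Only the field names (`task_modules`, `lister`, `models`, `init`) and the `swh.workers` entry point group come from the commit message. The class names, the module path `example_lister.tasks`, and the function name `register` are hypothetical placeholders.

```python
from typing import Any, Dict


class ExampleModel:
    """Placeholder standing in for a SQLAlchemy model class (hypothetical)."""


class ExampleLister:
    """Placeholder standing in for a lister class (hypothetical)."""


def register() -> Dict[str, Any]:
    """Registry function meant to be exposed through the `swh.workers` entry point.

    The optional `init` field is left out here, so the default
    implementation described in the commit message would apply.
    """
    return {
        "lister": ExampleLister,                   # the lister class
        "models": [ExampleModel],                  # models used by this lister
        "task_modules": ["example_lister.tasks"],  # hypothetical task module path
    }
```

Adding an `init` callable to the returned dict would replace the default table creation for that lister.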
David Douard
c1a3c10c5c Bump dependencies 2019-02-06 15:38:11 +01:00
David Douard
6957f3c435 update dep on swh-scheduler>0.0.39 and pytest<4 (tests)
pytest<4 because of https://github.com/pytest-dev/pytest/issues/4641
2019-01-16 16:39:03 +01:00
David Douard
94c1eaf402 Fix: TaskType has been removed from scheduler 0.0.38 2018-12-20 15:06:19 +01:00
Antoine R. Dumont (@ardumont)
8b2ee221ac
core.lister_base: Batch create origins (storage) & tasks (scheduler) 2018-07-27 17:24:49 +02:00
Antoine R. Dumont (@ardumont)
b272a36237
core.lister_base: Refactor task creation 2018-07-27 17:24:49 +02:00
Antoine R. Dumont (@ardumont)
91f05745ac
core/lister: Make the listers' scheduler configuration adaptable
Related to T1138
2018-07-18 12:19:21 +02:00
Nicolas Dandrimont
082f415952 swh.storage is the requirement with the schemata stuff 2017-10-30 17:06:02 +01:00
Sushant Sushant
83ebb95705 Add a lister for Debian-like package archives
This work is based on Sushant's internship and D229.
2017-10-04 12:43:09 +02:00
Nicolas Dandrimont
af60301f3a tasks: update to new swh.scheduler API 2017-06-12 15:40:28 +02:00
Antoine Pietri
ede9e5048c requirements: split internal and external requirements in two separate files 2017-02-09 14:32:02 +01:00