Commit graph

11 commits

Author SHA1 Message Date
Pierre-Yves David
08fda328be Migration to psycopg3 2025-03-21 17:05:07 +01:00
Antoine Lambert
3771a411ae
tests: Remove no longer needed pytest custom marker named db
This was used at the time we were building Debian packages for
swh components but we no longer do that.
2025-02-17 16:29:09 +01:00
Antoine Lambert
41407e0eff Use beautifulsoup4 CSS selectors to simplify code and type checking
Because the types-beautifulsoup4 package is installed in the swh virtualenv
(it is a swh-scanner test dependency), mypy reported some errors related to
beautifulsoup4 typing.

The find method of bs4 returns the union Tag | NavigableString | None, so
isinstance calls are needed to narrow the type, which is cumbersome.

Prefer the select_one method instead: it returns Optional[Tag], so a simple
None check is enough to satisfy the type checker. Similarly, replace uses of
the find_all method with the select method.

This also simplifies the code.
2024-04-16 11:22:51 +02:00
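The typing difference described in commit 41407e0eff can be sketched as follows. This is a minimal illustration and not the lister's actual code: the HTML snippet, the CSS selectors, and the helper names are assumptions made for the example.

```python
from typing import List, Optional

from bs4 import BeautifulSoup

# Hypothetical markup, only here to give the selectors something to match.
HTML = (
    '<div id="content">'
    '<a class="pkg" href="/a">alpha</a>'
    '<a class="pkg" href="/b">beta</a>'
    "</div>"
)


def first_package_name(html: str) -> Optional[str]:
    soup = BeautifulSoup(html, "html.parser")
    # select_one returns Optional[Tag], so a single None check narrows the type.
    link = soup.select_one("div#content a.pkg")
    if link is None:
        return None
    return link.get_text()


def all_package_names(html: str) -> List[str]:
    soup = BeautifulSoup(html, "html.parser")
    # select yields Tag objects only, so no isinstance filtering is needed.
    return [link.get_text() for link in soup.select("a.pkg")]
```

With find, the same lookup would need an isinstance(link, Tag) check before any Tag method could be called without a mypy error.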
Franck Bret
ebba50882f Revert "Remove spurious space"
This reverts commit c9e2339af9
2023-09-21 07:14:44 +00:00
Franck Bret
c9e2339af9 Remove spurious space 2023-09-20 17:01:35 +02:00
Antoine Lambert
6e7bc49ec7 Harmonize listers parameters and add test to check mandatory ones
Ensure that all lister classes have the same set of mandatory parameters
in their constructors, notably: scheduler, url, instance and credentials.

Add a new test checking that lister classes declare the mandatory parameters
in their constructors. The purpose is to avoid deployment issues in staging
or production environments, as celery tasks can fail to execute if mandatory
parameters are not handled by listers.

Related to swh/infra/sysadm-environment#5030.
2023-09-06 11:55:34 +02:00
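A check like the one commit 6e7bc49ec7 describes could be sketched with the standard inspect module. The two lister classes and the helper name below are hypothetical and exist only for illustration; the actual test in the repository may be organized differently.

```python
import inspect
from typing import Set

# Parameter names taken from the commit message.
MANDATORY_PARAMETERS = {"scheduler", "url", "instance", "credentials"}


def missing_mandatory_parameters(lister_class: type) -> Set[str]:
    """Return the mandatory parameters the constructor does not declare."""
    declared = inspect.signature(lister_class.__init__).parameters
    return MANDATORY_PARAMETERS - set(declared)


class CompleteLister:
    # Hypothetical lister declaring every mandatory parameter.
    def __init__(self, scheduler, url=None, instance=None, credentials=None):
        self.scheduler = scheduler


class IncompleteLister:
    # Hypothetical lister that forgot two of them.
    def __init__(self, scheduler, url=None):
        self.scheduler = scheduler
```

A test could then loop over every lister class and assert that the returned set is empty. A constructor that only accepts `**kwargs` is reported as incomplete by this sketch, since it declares none of the names explicitly.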
Nicolas Dandrimont
e785e67315 Hook up recently introduced options to all listers
Hopefully one day we'll be able to replace all of this mess with PEP692
TypedDict kwargs, but that's only on track for Python 3.12.
2022-12-05 16:33:45 +01:00
Nicolas Dandrimont
a66e24bfa2 Ignore psqlrc when loading the rubygems database dump
The SQL dump contains ownership instructions that can't be run if you
don't have the right users in your database clusters. When someone has a
psqlrc with ON_ERROR_STOP, this fails the load of the dump.

Use the opportunity to trigger an exception when psql returns a non-zero
exit code, rather than continue with an empty/inconsistent database.
2022-12-05 13:52:23 +01:00
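The two changes in commit a66e24bfa2 can be illustrated with a short sketch, assuming psql is invoked through subprocess. The function names, the database name, and the dump path are hypothetical; `--no-psqlrc`, `--dbname`, and `--file` are standard psql options.

```python
import subprocess
from typing import List


def psql_load_command(dump_path: str, dbname: str) -> List[str]:
    """Build a psql command line that skips any user or system psqlrc."""
    return [
        "psql",
        "--no-psqlrc",  # do not read ~/.psqlrc or the system-wide psqlrc
        "--dbname", dbname,
        "--file", dump_path,
    ]


def run_checked(command: List[str]) -> None:
    # check=True raises subprocess.CalledProcessError on a non-zero exit code,
    # so a failed load is not silently followed by further processing.
    subprocess.run(command, check=True)
```

Calling `run_checked(psql_load_command("dump.sql", "rubygems"))` would then either load the dump or raise, instead of continuing with an empty or inconsistent database.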
Antoine Lambert
82b936a277 rubygems: Fix debug log 2022-10-13 16:40:35 +02:00
Antoine Lambert
108816f232 rubygems: Use gems database dump to improve listing output
Instead of using an undocumented rubygems HTTP endpoint that only
gives us the names of the gems, prefer to exploit the daily PostgreSQL
dump of the rubygems.org database.

This makes it possible to list all gems, as well as all versions of each gem
and their release artifacts. For each release artifact, the following
information is extracted: version, download URL, sha256 checksum, release date
plus some extra metadata.

The lister now sets the list of artifacts and the list of metadata as extra
loader arguments when sending a listed origin to the scheduler database.
A last_update date is also computed, which should ensure that loading tasks
for rubygems are scheduled only when new releases have become available since
the last load.

Note that the lister spawns a temporary postgres instance, so the initdb
executable from a postgres server installation must be available in the
execution environment.

Related to T1777
2022-10-07 16:54:48 +02:00
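The per-artifact information and the last_update computation described in commit 108816f232 might be modelled as in the sketch below. The dictionary keys, the example URLs, the placeholder checksums, and the helper name are assumptions made for illustration; the commit message only states which pieces of information are extracted, and taking the most recent release date is one plausible way to derive last_update.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

# Hypothetical artifacts for one gem; checksums are placeholders.
ARTIFACTS: List[Dict[str, Any]] = [
    {
        "version": "1.0.0",
        "url": "https://example.org/gems/foo-1.0.0.gem",
        "sha256": "0" * 64,
        "date": datetime(2022, 3, 1, tzinfo=timezone.utc),
    },
    {
        "version": "1.1.0",
        "url": "https://example.org/gems/foo-1.1.0.gem",
        "sha256": "1" * 64,
        "date": datetime(2022, 9, 1, tzinfo=timezone.utc),
    },
]


def compute_last_update(artifacts: List[Dict[str, Any]]) -> Optional[datetime]:
    """Return the most recent release date, or None when there are no releases."""
    dates = [artifact["date"] for artifact in artifacts]
    return max(dates) if dates else None
```

A scheduler can compare such a value with the date of the last load to decide whether a new loading task is worth scheduling.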
Franck Bret
52ccf49e11 RubyGems: List origins from https://rubygems.org
Related to T1777
2022-09-29 14:19:06 +02:00