Commit graph

16 commits

Author SHA1 Message Date
Antoine Lambert
323e277482 gitea, gogs: Ensure query parameters are not duplicated in API URLs
The Gitea API returns the next pagination link with all the query parameters
provided to the API request.

As we were also passing a dict of fixed query parameters to the page_request
method, some query parameters ended up having multiple instances in the URL
for fetching a new page of repositories data. So each time a new page was
requested, new instances of these parameters were appended to the URL which
could result in a really long URL if the number of pages to retrieve is high
and make the request fail.

Also remove a debug log that is already emitted by the http_request method.
2024-06-05 15:27:58 +02:00
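The duplication described in this commit can be reproduced with the standard library alone. The sketch below is not the lister's code: `append_params`, `merge_params` and the example URL are hypothetical names chosen for illustration. It only shows why re-appending fixed parameters to a next-page link makes the URL grow, and how skipping parameters that are already present avoids it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def append_params(url: str, params: dict) -> str:
    """Naive behaviour: always append params, even if the URL already has them."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query) + list(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


def merge_params(url: str, params: dict) -> str:
    """Only add the params that the URL does not already carry."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    for key, value in params.items():
        query.setdefault(key, value)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical next-page link, with the original query parameters echoed back.
next_link = "https://gitea.example.org/api/v1/repos/search?sort=id&limit=50&page=2"
fixed = {"sort": "id", "limit": "50"}

print(append_params(next_link, fixed))  # sort and limit now appear twice
print(merge_params(next_link, fixed))   # same URL as next_link
```

Applied once per page, the naive variant adds one more copy of each fixed parameter every time, which matches the ever-growing URL described above.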
David Douard
714fccc3c7 python: Fix black formatting after bump to 23.1.0 in pre-commit 2023-12-05 10:33:07 +01:00
Jérémy Bobbio (Lunar)
7344d264e7 Ensure HTTPError.response is not None
The implementation of `HTTPError` in `requests` does not guarantee that
the `response` property will always be set. So we need to ensure it is
not `None` before looking for the return code, for example.

This also makes mypy checks pass again, as `types-requests` was updated
in 2.31.0.9 to better match this particular aspect. See:
https://github.com/python/typeshed/pull/10875
2023-10-18 10:41:57 +02:00
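A minimal sketch of the guard this commit describes. `status_code_of` is a hypothetical helper, not a function from the lister, and the stub classes exist only so the example runs without `requests` or a network call. The same check applies to a real `requests.HTTPError`, whose `response` attribute may be `None`.

```python
from typing import Optional


def status_code_of(error: Exception) -> Optional[int]:
    """Return the HTTP status code carried by an error, or None if there is none."""
    response = getattr(error, "response", None)
    if response is None:
        # No response attached: there is no status code to look at.
        return None
    return response.status_code


# Stand-ins used only to exercise the helper offline (hypothetical classes).
class StubResponse:
    status_code = 503


class StubHTTPError(Exception):
    def __init__(self, response=None):
        super().__init__("stub HTTP error")
        self.response = response


print(status_code_of(StubHTTPError()))                # None
print(status_code_of(StubHTTPError(StubResponse())))  # 503
```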
Antoine Lambert
b9815ed577 gogs: Ensure to list all repositories
Contrary to the gitea listing, which does not require the q query
parameter, the gogs listing requires it.

After reading the gogs source code, the /repos/search endpoint generates
a SQL query of the form: "SELECT * FROM repos WHERE name LIKE '%{q}%'".
Setting the q parameter value to "_" makes the LIKE pattern match any
name, since "_" matches any single character, so all repositories are
returned.

Fixes #4698.
2023-06-26 15:16:48 +00:00
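The effect of `q=_` can be checked against any SQL engine. The sketch below uses an in-memory SQLite database as a stand-in, which is an assumption: a Gogs instance may use a different database, and the table and the `search` helper here are hypothetical. It only illustrates that `_` in a LIKE pattern matches any single character, so `%_%` matches every non-empty name.

```python
import sqlite3

# In-memory stand-in for the repository table (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (name TEXT)")
conn.executemany(
    "INSERT INTO repos VALUES (?)",
    [("a",), ("swh-lister",), ("x_y",)],
)


def search(q: str) -> list:
    """Mimic a query of the form: name LIKE '%{q}%'."""
    pattern = f"%{q}%"
    rows = conn.execute(
        "SELECT name FROM repos WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [name for (name,) in rows]


print(search("_"))       # every repository, including names without an underscore
print(search("lister"))  # only names containing "lister"
```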
Antoine R. Dumont (@ardumont)
19bdeefb14 lister: Allow lister to build url out of the instance parameter
This moves the rather elementary URL-building logic into the lister's scope. It will
simplify and unify the CLI calls between the lister and scheduler CLIs. It will also
reduce erroneous operations, which can happen for example in add-forge-now.

With the following, we will only have to provide the type and the instance, then
everything will be scheduled properly.

Refs. swh/devel/swh-lister#4693
2023-05-19 15:03:49 +02:00
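A sketch of what building the URL from the instance could look like. `resolve_api_url` is a hypothetical helper, and both the `https` scheme and the `/api/v1/` path are assumptions made for illustration, not values confirmed by this commit.

```python
from typing import Optional


def resolve_api_url(url: Optional[str] = None, instance: Optional[str] = None) -> str:
    """Return the explicit url if given, else derive one from the instance name."""
    if url is not None:
        return url
    if instance is None:
        raise ValueError("either url or instance must be provided")
    # Assumed default location of the API on a forge instance.
    return f"https://{instance}/api/v1/"


print(resolve_api_url(instance="git.example.org"))
print(resolve_api_url(url="https://git.example.org/custom/api/"))
```

With a helper of this kind, a caller only has to supply the lister type and the instance name, and an explicit URL is needed only for instances that serve the API elsewhere.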
Antoine R. Dumont (@ardumont)
b3b5639e9a gogs, gitea: Fix task execution to pass along extra kwargs
Related to https://gitlab.softwareheritage.org/infra/sysadm-environment/-/issues/4684
2022-12-14 16:09:56 +01:00
Nicolas Dandrimont
e785e67315 Hook up recently introduced options to all listers
Hopefully one day we'll be able to replace all of this mess with PEP692
TypedDict kwargs, but that's only on track for Python 3.12.
2022-12-05 16:33:45 +01:00
Antoine R. Dumont (@ardumont)
8a82bbf95f gogs/lister: Allow public gogs instance listing
Prior to this commit, the lister assumed authentication was required. There are public
gogs instances which do not require it.

This also updates the documentation to mention the usual API location. This is useful
when people want to trigger a listing as a pre-flight check.

This drops repetitive instruction in the gitea lister as well.

Co-authored with Antoine Lambert (@anlambert) <anlambert@softwareheritage.org>.

Related to infra/sysadm-environment#4644
2022-10-21 18:21:18 +02:00
Antoine Lambert
db6ce12e9e Refactor and deduplicate HTTP requests code in listers
Numerous listers were using the same page_request method or an equivalent
in their implementation, so deduplicate that code by adding
an http_request method in the base lister class: swh.lister.pattern.Lister.

That method simply wraps a call to requests.Session.request and logs
some useful info for debugging and error reporting. An HTTPError
is raised if a request ends with an error.

All listers using that new method now benefit from request retries when
an HTTP error occurs, thanks to the use of the http_retry decorator.
2022-09-26 10:48:40 +02:00
Antoine Lambert
9c55acd286 Use generic HTTP retry policy by default and rename dedicated decorator
Instead of retrying HTTP requests only for the 429 status code by default,
prefer the generic retry policy, which also retries on status
codes >= 500 and on ConnectionError exceptions.

Rename throttling_retry decorator to http_retry to reflect this change.
2022-09-26 10:48:40 +02:00
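The retry conditions named in this commit (429, status codes >= 500, connection errors) can be summarised as a predicate. This is a sketch, not the `http_retry` decorator itself: `should_retry` is a hypothetical name, and Python's builtin `ConnectionError` is used as a stand-in for the one raised by `requests`.

```python
from typing import Optional


def should_retry(
    status_code: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> bool:
    """Decide whether a failed HTTP request is worth retrying."""
    if isinstance(error, ConnectionError):
        # Network-level failure: no status code, but a retry may succeed.
        return True
    if status_code is None:
        return False
    # 429 is rate limiting, >= 500 covers server-side errors.
    return status_code == 429 or status_code >= 500


print(should_retry(status_code=429))
print(should_retry(status_code=404))
print(should_retry(error=ConnectionError("connection reset")))
```

Client errors such as 404 are not retried because repeating the same request would be expected to fail the same way.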
KShivendu
bd35d54398 gogs: Skip pages with error 500
This also affects the gitea lister
2022-09-20 19:05:20 +05:30
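One way to skip failing pages is sketched below, assuming pages are addressed by number. `PageError`, `iter_pages` and `fake_fetch` are hypothetical names for illustration, and the real lister may handle the 500 case differently.

```python
from typing import Callable, Iterator, List


class PageError(Exception):
    """Hypothetical error carrying the HTTP status of a failed page request."""

    def __init__(self, status_code: int):
        super().__init__(f"page request failed with status {status_code}")
        self.status_code = status_code


def iter_pages(
    fetch: Callable[[int], List[str]], first: int, last: int
) -> Iterator[List[str]]:
    """Yield each page of results, skipping pages that fail with a 500."""
    for page in range(first, last + 1):
        try:
            repos = fetch(page)
        except PageError as error:
            if error.status_code == 500:
                continue  # skip the broken page and keep listing
            raise  # any other error still aborts the listing
        yield repos


def fake_fetch(page: int) -> List[str]:
    """Stand-in for an HTTP call: page 2 always fails with a 500."""
    if page == 2:
        raise PageError(500)
    return [f"repo-{page}"]


print(list(iter_pages(fake_fetch, 1, 3)))  # [['repo-1'], ['repo-3']]
```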
Valentin Lorentz
b7ec6cb120 tests: Simplify origin comparison and improve pytest diff on failure
By using a single equality instead of checking len() then zip()
to check one by one, pytest can find the common/missing elements
and print them nicely when the two lists are unequal.
2022-08-24 17:21:24 +02:00
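The two comparison styles contrasted in this commit look roughly as follows. The function names are hypothetical; only the shape of the assertions matters.

```python
def compare_pairwise(actual: list, expected: list) -> None:
    """Older style: check the length, then compare elements one by one."""
    assert len(actual) == len(expected)
    for got, want in zip(actual, expected):
        assert got == want


def compare_whole(actual: list, expected: list) -> None:
    """Newer style: a single equality on the whole lists."""
    assert actual == expected


origins = ["https://example.org/a", "https://example.org/b"]
compare_pairwise(origins, list(origins))
compare_whole(origins, list(origins))
```

Both accept and reject the same inputs. The difference is in the failure report: with the single equality, pytest sees both lists at once and can show which elements differ or are missing, whereas the pairwise version stops at the length check or at the first mismatching pair.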
Valentin Lorentz
31c44330e8 gogs: Lower unnecessarily verbose logging statement 2022-08-23 13:40:19 +02:00
Valentin Lorentz
17a219ece0 gitea: Inherit from Gogs lister
This removes code and adds support for incremental pagination.

While both are essentially the same lister now, it still makes sense to
keep the Gitea lister separate, in order to:

1. display them in different categories on https://archive.softwareheritage.org/
2. support possible divergence of APIs in the future
2022-08-23 13:38:32 +02:00
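The inheritance arrangement can be pictured with two stand-in classes. These are simplified, hypothetical classes, not the actual lister implementations: the attribute names and the URL layout are assumptions for illustration.

```python
class GogsLister:
    """Stand-in base class holding the shared listing logic."""

    LISTER_NAME = "gogs"
    REPO_LIST_PATH = "repos/search"

    def __init__(self, instance: str):
        self.instance = instance

    def search_url(self) -> str:
        return f"https://{self.instance}/api/v1/{self.REPO_LIST_PATH}"


class GiteaLister(GogsLister):
    """Same behaviour, kept separate so it can be named and evolved on its own."""

    LISTER_NAME = "gitea"


lister = GiteaLister("gitea.example.org")
print(lister.LISTER_NAME, lister.search_url())
```

Keeping a distinct subclass costs almost nothing while the APIs match, and leaves a place to override a method if they diverge later.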
KShivendu
6a53a6ad06 feat: Make the Gogs lister incremental 2022-08-17 15:01:32 +05:30
KShivendu
d34a6232a6 gogs: Introduce Gogs lister 2022-08-03 16:22:06 +05:30