The implementation of `HTTPError` in `requests` does not guarantee that
the `response` property will always be set. So we need to ensure it is
not `None` before reading its status code, for example.
This also makes mypy checks pass again, as `types-requests` was updated
in 2.31.0.9 to better match this particular aspect. See:
https://github.com/python/typeshed/pull/10875
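A minimal sketch of the guard described above. The helper name `status_code_of` is hypothetical, and it is duck-typed (it only assumes an optional `response` attribute carrying a `status_code`) so that it runs without `requests` installed. The explicit `is None` comparison matters because a `requests` response object is falsy when it carries an error status, so a plain truthiness check would skip the very responses we want to inspect.

```python
from typing import Optional


def status_code_of(error: Exception) -> Optional[int]:
    """Return the HTTP status code attached to an HTTPError-like exception.

    Returns None when the exception carries no response at all.
    """
    # Hypothetical helper: the response may be missing or explicitly None.
    response = getattr(error, "response", None)
    # Compare against None instead of relying on truthiness.
    if response is None:
        return None
    return response.status_code
```

In an `except requests.HTTPError as e:` block, one could then write `if status_code_of(e) == 404:` without tripping the type checker.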
Unlike the Gitea listing, which does not require the q query
parameter, the Gogs listing does require it.
According to the Gogs source code, the /repos/search endpoint generates
an SQL query of the form: "SELECT * FROM repos WHERE name LIKE '%{q}%'".
In a LIKE pattern, "_" matches any single character, so setting the q
parameter to "_" matches every non-empty name and all repositories are returned.
Fixes #4698.
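The effect of the "_" value can be reproduced with a small in-memory SQLite table. This is only an illustration of LIKE semantics under assumed names: the `repos` table layout, the sample rows and the `search_repos` helper are made up here and are not taken from the Gogs code base.

```python
import sqlite3


def search_repos(conn: sqlite3.Connection, q: str) -> list:
    """Mimic a query of the form: name LIKE '%{q}%' (hypothetical helper)."""
    pattern = f"%{q}%"
    rows = conn.execute(
        "SELECT name FROM repos WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [row[0] for row in rows]


# Sample data, invented for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (name TEXT)")
conn.executemany(
    "INSERT INTO repos VALUES (?)", [("swh-lister",), ("gogs",), ("a",)]
)

# "_" matches any single character, so '%_%' matches every non-empty name.
all_names = search_repos(conn, "_")
# A regular search term only matches names containing it.
only_gogs = search_repos(conn, "gog")
```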
This moves this rather elementary logic into the lister's scope. This will simplify
and unify CLI calls between the lister and scheduler CLIs. It will also help reduce
erroneous operations, which can happen for example in add-forge-now.
With this change, we only have to provide the type and the instance, then
everything will be scheduled properly.
Refs. swh/devel/swh-lister#4693
Prior to this commit, the lister assumed authentication was required. There are public
Gogs instances which do not require it.
This also updates the documentation to mention the usual API location. This is useful when
people want to actually trigger a listing as a pre-flight check.
This drops repetitive instruction in the gitea lister as well.
Co-authored with Antoine Lambert (@anlambert) <anlambert@softwareheritage.org>.
Related to infra/sysadm-environment#4644
Numerous listers were using the same page_request method or an equivalent
in their implementation, so deduplicate that code by adding
an http_request method to the base lister class: swh.lister.pattern.Lister.
That method simply wraps a call to requests.Session.request and logs
some useful info for debugging and error reporting. An HTTPError
is raised if a request ends up with an error.
All listers using that new method now benefit from request retries when
an HTTP error occurs, thanks to the use of the http_retry decorator.
Instead of retrying HTTP requests only for the 429 status code by default,
use the generic retry policy, which also retries on status
codes >= 500 and on ConnectionError exceptions.
Rename throttling_retry decorator to http_retry to reflect this change.
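The retry policy described above can be sketched with the standard library alone. This is not the actual http_retry implementation: `FakeHTTPError`, `is_retryable` and `retry_http` are hypothetical names, the built-in `ConnectionError` stands in for the one raised by requests, and the fixed wait is a simplification.

```python
import functools
import time


class FakeHTTPError(Exception):
    """Stand-in for an HTTP error carrying a status code (assumption)."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(exc: Exception) -> bool:
    """Retry on connection errors, on 429 and on status codes >= 500."""
    if isinstance(exc, ConnectionError):
        return True
    if isinstance(exc, FakeHTTPError):
        return exc.status_code == 429 or exc.status_code >= 500
    return False


def retry_http(max_attempts: int = 3, wait_seconds: float = 0.0):
    """Decorator retrying the wrapped call while the error is retryable."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on non-retryable errors.
                    if attempt == max_attempts or not is_retryable(exc):
                        raise
                    time.sleep(wait_seconds)

        return wrapper

    return decorator
```

With this policy, a 503 or a dropped connection is retried up to `max_attempts` times, while a 404 is raised immediately.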
By using a single equality instead of checking len() then zip()
to check one by one, pytest can find the common/missing elements
and print them nicely when the two lists are unequal.
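The two assertion styles compared above might look as follows. The function names and the list contents are illustrative only. Both checks accept and reject the same inputs; the difference is that with the single equality, pytest gets both lists at once and can report their differences.

```python
def check_one_by_one(actual: list, expected: list) -> None:
    """Old style: compare lengths, then compare elements pairwise."""
    assert len(actual) == len(expected)
    for got, wanted in zip(actual, expected):
        assert got == wanted


def check_with_equality(actual: list, expected: list) -> None:
    """New style: one equality on the whole lists."""
    assert actual == expected
```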
This removes code and adds support for incremental pagination.
While both are essentially the same lister now, it still makes sense to
keep the Gitea lister separate, in order to:
1. display them in different categories on https://archive.softwareheritage.org/
2. support possible divergence of APIs in the future