swh-lister/swh/lister
Antoine Lambert 35871896b2 pattern: Improve handling of max_origins_per_page parameter
Instead of fully consuming the get_origins_from_page generator into
a list and truncating it, consume the generator origin by origin
and stop once the maximum number of origins per page is reached.

Indeed, some non-trivial listers such as the cgit one can perform costly
processing, an HTTP request for instance, for each origin in a page.
It is therefore better not to consume the full generator in one go,
which avoids such side effects.
2023-03-21 16:56:48 +01:00
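
The difference described in the commit message above can be illustrated with a minimal sketch. This is not the actual code from pattern.py: only the names get_origins_from_page and max_origins_per_page come from the commit message, while the page contents, the `calls` list, the example URL and the two helper functions are hypothetical stand-ins. The sketch uses `itertools.islice` to stop pulling from the generator once the limit is reached; the real change may be implemented differently (for example with an explicit loop and counter).

```python
from itertools import islice
from typing import Iterator, List, Optional

# Records which entries triggered the (simulated) costly per-origin work.
calls: List[int] = []


def get_origins_from_page(page: List[int]) -> Iterator[str]:
    """Hypothetical stand-in for a lister method: one costly step per origin."""
    for entry in page:
        calls.append(entry)  # stands in for e.g. an HTTP request
        yield f"https://example.org/repo/{entry}"


def origins_eager(page: List[int], max_origins_per_page: Optional[int]) -> List[str]:
    """Old approach: consume the whole generator, then truncate the list."""
    origins = list(get_origins_from_page(page))
    return origins[:max_origins_per_page]


def origins_lazy(page: List[int], max_origins_per_page: Optional[int]) -> List[str]:
    """New approach: pull origins one at a time and stop at the limit.

    islice does not advance the generator past the limit, so the costly
    step never runs for the origins that would be discarded anyway.
    A limit of None means no truncation.
    """
    return list(islice(get_origins_from_page(page), max_origins_per_page))


page = [1, 2, 3, 4, 5]

calls.clear()
eager_result = origins_eager(page, 2)
eager_calls = list(calls)  # costly step ran for every entry in the page

calls.clear()
lazy_result = origins_lazy(page, 2)
lazy_calls = list(calls)  # costly step ran only for the kept origins
```

Both functions return the same two origins, but only the lazy variant avoids performing the per-origin work for entries beyond the limit.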
..
arch Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
aur Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
bitbucket bitbucket: Skip buggy page when listing 2023-03-10 15:37:44 +01:00
bower Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
cgit cgit/tasks: Allow passing extra parameters to task 2023-03-21 12:22:07 +01:00
conda Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
cpan Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
cran Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
crates Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
debian Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
fedora Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
gitea gogs, gitea: Fix task execution to pass along extra kwargs 2022-12-14 16:09:56 +01:00
github github: Fix fixtures use in tests 2023-01-02 18:06:26 +01:00
gitlab mypy: Bump to 1.0.1 and fix new typing errors 2023-02-17 17:56:07 +01:00
gnu Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
gogs gogs, gitea: Fix task execution to pass along extra kwargs 2022-12-14 16:09:56 +01:00
golang Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
hackage Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
hex fix(hex): Use page_size for stopping condition 2023-03-14 17:59:46 +00:00
launchpad Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
maven mypy: Bump to 1.0.1 and fix new typing errors 2023-02-17 17:56:07 +01:00
nixguix mypy: Bump to 1.0.1 and fix new typing errors 2023-02-17 17:56:07 +01:00
npm Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
nuget Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
opam Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
packagist Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
phabricator Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
pubdev Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
puppet Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
pypi Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
rubygems Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
sourceforge Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
tests feat: Add Hex.pm lister 2023-03-14 17:59:46 +00:00
tuleap Hook up recently introduced options to all listers 2022-12-05 16:33:45 +01:00
__init__.py Add support for more tarball recognition based on extensions 2022-10-25 09:50:31 +02:00
cli.py python: Reformat code with black 22.3.0 2022-04-08 15:15:09 +02:00
pattern.py pattern: Improve handling of max_origins_per_page parameter 2023-03-21 16:56:48 +01:00
py.typed typing: minimal changes to make a no-op mypy run pass 2019-10-28 15:35:21 +01:00
utils.py Validate origin URLs before sending to the scheduler 2022-11-04 15:58:45 +01:00