swh-lister/swh/lister
Raphaël Gomès 3baf1d0999 Make the SourceForge lister incremental
SourceForge's sitemaps (1 main one + many sharded) give us a "last
modified" date for every subsitemap and project, allowing us to perform
an incremental listing.

We store the subsitemaps' "last modified" dates in the lister state, as
well as those of the empty projects (projects which don't have any VCS
registered), and the rest comes from the already visited origins from
the database.
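
As a rough illustration of the incremental scheme described above, the sketch below compares "last modified" dates against previously stored ones. It is not the lister's actual code: `ListerState`, `changed_subsitemaps`, `project_needs_visit`, and the `visited_origins` mapping are hypothetical names, and the real state layout may differ.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Iterable, List, Tuple


@dataclass
class ListerState:
    """Hypothetical state kept between runs (field names are assumptions)."""

    # "last modified" date recorded for each subsitemap URL
    subsitemap_last_modified: Dict[str, date] = field(default_factory=dict)
    # "last modified" date recorded for each project without any VCS
    empty_projects: Dict[str, date] = field(default_factory=dict)


def changed_subsitemaps(
    main_sitemap: Iterable[Tuple[str, date]], state: ListerState
) -> List[str]:
    """Return subsitemap URLs that are new or newer than the stored date,
    recording the new date in the state as we go."""
    changed = []
    for url, last_modified in main_sitemap:
        previous = state.subsitemap_last_modified.get(url)
        if previous is None or last_modified > previous:
            state.subsitemap_last_modified[url] = last_modified
            changed.append(url)
    return changed


def project_needs_visit(
    url: str,
    last_modified: date,
    state: ListerState,
    visited_origins: Dict[str, date],
) -> bool:
    """Decide whether a project must be listed again.

    `visited_origins` stands in for the dates of already visited origins
    that would come from the database (an assumption for this sketch).
    """
    previous = state.empty_projects.get(url, visited_origins.get(url))
    return previous is None or last_modified > previous
```

With such helpers, a run would only fetch the subsitemaps returned by `changed_subsitemaps`, and within those only the projects for which `project_needs_visit` is true.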

The tests try to cover the possible cases of a subsitemap that has
changed, one that hasn't, a project that has changed, one that hasn't,
and the same for an empty project.
2021-05-06 10:28:27 +02:00
bitbucket s/REST( API)?/API/ 2021-04-27 18:13:13 +02:00
cgit cgit: remove the repository urls's trailing / 2021-02-01 17:31:08 +01:00
cran cran: Prevent multiple listing of an origin 2021-02-05 14:34:37 +01:00
debian debian: Update archive mirror URL templates to process 2021-02-08 14:01:59 +01:00
gitea Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
github GitHub: handle edge cases with empty responses 2021-03-19 16:53:52 +01:00
gitlab gitlab: Deal with missing or trailing / in url input 2021-01-28 10:46:58 +01:00
gnu gnu: Remove dependency on pytz 2021-02-02 13:19:04 +01:00
launchpad launchpad: Remove call to dataclasses.asdict on lister state 2021-01-28 19:17:58 +01:00
npm Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
packagist packagist: Reimplement lister using new Lister API 2021-02-02 14:48:47 +01:00
phabricator Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
pypi pypi: Use BeautifulSoup for parsing HTML instead of xmltodict 2021-02-05 14:23:11 +01:00
sourceforge Make the SourceForge lister incremental 2021-05-06 10:28:27 +02:00
tests Remove no longer used legacy Lister API and update CLI options 2021-02-02 15:54:55 +01:00
__init__.py Hook up listers implemented with the new pattern to the CLI 2021-01-11 11:00:29 +01:00
cli.py Remove no longer used legacy Lister API and update CLI options 2021-02-02 15:54:55 +01:00
pattern.py Fix various Sphinx warnings 2021-04-13 21:56:08 +02:00
py.typed typing: minimal changes to make a no-op mypy run pass 2019-10-28 15:35:21 +01:00
utils.py lister: Add utility decorator to ease HTTP requests rate limit handling 2021-01-18 11:28:51 +01:00