swh-lister/swh/lister
Raphaël Gomès f7b27c6930 Add a non-incremental sourceforge lister
Following zack's work on T735, this change introduces an actual SWH lister for
SourceForge.

SourceForge provides a main sitemap that lists sharded sub-sitemaps,
which in turn list pages. Each page belongs to a project (or, rarely,
a sub-project). Querying a REST API for that project returns the list
of every VCS it uses. Both sitemaps and pages carry a "last modified"
timestamp, which a future patch will use to implement incremental
listing.

More precise information can be found as inline comments or docstrings.
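
The sitemap traversal described above can be sketched roughly as follows. This is a minimal illustration, not the lister's actual code: it assumes the sitemaps follow the standard sitemaps.org XML format, uses made-up sample documents on a placeholder host (`forge.example`) instead of real HTTP responses, and assumes page URLs look like `/projects/<name>/...` or `/p/<name>/...`. The names `iter_entries` and `project_names` are hypothetical. The REST API call that lists each project's VCS is left out.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterator, Optional, Set, Tuple

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed page URL shape, for illustration only.
PROJECT_RE = re.compile(r"^https?://[^/]+/(?:projects|p)/(?P<name>[^/]+)/")

# Made-up sample documents standing in for real HTTP responses.
MAIN_SITEMAP = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://forge.example/sitemap-0.xml</loc>
    <lastmod>2021-03-18</lastmod>
  </sitemap>
</sitemapindex>
"""

SUB_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://forge.example/projects/alpha/</loc>
    <lastmod>2021-03-01</lastmod>
  </url>
  <url>
    <loc>https://forge.example/projects/alpha/files/</loc>
    <lastmod>2021-03-02</lastmod>
  </url>
  <url>
    <loc>https://forge.example/p/beta/wiki/</loc>
    <lastmod>2021-02-11</lastmod>
  </url>
  <url>
    <loc>https://forge.example/about</loc>
  </url>
</urlset>
"""


def iter_entries(xml_text: str, tag: str) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (location, last-modified) pairs for each <tag> child of the root.

    Use tag="sitemap" for the main sitemap and tag="url" for a sub-sitemap.
    The last-modified value is None when the entry has no <lastmod>.
    """
    root = ET.fromstring(xml_text)
    for entry in root.findall(f"sm:{tag}", NS):
        loc = entry.findtext("sm:loc", namespaces=NS)
        lastmod = entry.findtext("sm:lastmod", namespaces=NS)
        if loc:
            yield loc.strip(), lastmod


def project_names(sub_sitemap_xml: str) -> Set[str]:
    """Collect the distinct project names that the listed pages belong to."""
    names = set()
    for loc, _lastmod in iter_entries(sub_sitemap_xml, "url"):
        match = PROJECT_RE.match(loc)
        if match:  # pages outside a project are skipped
            names.add(match.group("name"))
    return names


if __name__ == "__main__":
    for loc, lastmod in iter_entries(MAIN_SITEMAP, "sitemap"):
        print("sub-sitemap:", loc, "last modified:", lastmod)
    print("projects:", sorted(project_names(SUB_SITEMAP)))
```

Collecting names into a set reflects the fact that several pages may belong to the same project, so each project would only need to be queried once.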
2021-03-23 18:40:21 +01:00
bitbucket Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
cgit cgit: remove the repository urls's trailing / 2021-02-01 17:31:08 +01:00
cran cran: Prevent multiple listing of an origin 2021-02-05 14:34:37 +01:00
debian debian: Update archive mirror URL templates to process 2021-02-08 14:01:59 +01:00
gitea Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
github GitHub: handle edge cases with empty responses 2021-03-19 16:53:52 +01:00
gitlab gitlab: Deal with missing or trailing / in url input 2021-01-28 10:46:58 +01:00
gnu gnu: Remove dependency on pytz 2021-02-02 13:19:04 +01:00
launchpad launchpad: Remove call to dataclasses.asdict on lister state 2021-01-28 19:17:58 +01:00
npm Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
packagist packagist: Reimplement lister using new Lister API 2021-02-02 14:48:47 +01:00
phabricator Remove no longer used models field in dict returned by register 2021-02-02 16:33:52 +01:00
pypi pypi: Use BeautifulSoup for parsing HTML instead of xmltodict 2021-02-05 14:23:11 +01:00
sourceforge Add a non-incremental sourceforge lister 2021-03-23 18:40:21 +01:00
tests Remove no longer used legacy Lister API and update CLI options 2021-02-02 15:54:55 +01:00
__init__.py Hook up listers implemented with the new pattern to the CLI 2021-01-11 11:00:29 +01:00
cli.py Remove no longer used legacy Lister API and update CLI options 2021-02-02 15:54:55 +01:00
pattern.py pattern: Bump packet split to chunk of 1000 records 2021-01-29 16:55:29 +01:00
py.typed typing: minimal changes to make a no-op mypy run pass 2019-10-28 15:35:21 +01:00
utils.py lister: Add utility decorator to ease HTTP requests rate limit handling 2021-01-18 11:28:51 +01:00