Commit graph

585 commits

Author SHA1 Message Date
Antoine R. Dumont (@ardumont)
8c5329ce7c
core.simple_lister: Move import instruction to top 2018-09-06 12:18:56 +02:00
Antoine R. Dumont (@ardumont)
3bb12649ec
pypi.lister: Shuffle the package list 2018-09-06 12:18:37 +02:00
Antoine R. Dumont (@ardumont)
86cfac9277
pypi.lister: Improve parameter names 2018-09-06 09:56:48 +02:00
Antoine R. Dumont (@ardumont)
267ce9b463
swh.lister.pypi: Adapt task creation dict according to latest dev 2018-08-01 14:52:35 +02:00
Antoine R. Dumont (@ardumont)
34eddb1fd7
core/simple_lister: Use bigger batch of data 2018-08-01 10:25:28 +02:00
Antoine R. Dumont (@ardumont)
3a65fbb4c8
swh.lister.pypi: Use pypi's legacy html based api to list packages
The xmlrpc api is marked as deprecated [1]. Even if it is not going away
right now, the legacy api is not marked as deprecated, so moving towards
that one sounds more reasonable [2].

[1] https://warehouse.readthedocs.io/api-reference/xml-rpc/#pypi-s-xml-rpc-methods

[2] https://warehouse.readthedocs.io/api-reference/legacy/#simple-project-api

Related T422
2018-08-01 10:25:28 +02:00
Antoine R. Dumont (@ardumont)
6ff3b90859
swh.lister.pypi: Add a pypi lister implementation using xmlrpc api
Based solely on pypi's deprecated xmlrpc api [1].  No other way of listing
pypi.org is referenced (except for parsing an html page through a
legacy api [2])

[1] https://warehouse.readthedocs.io/api-reference/xml-rpc/#pypi-s-xml-rpc-methods

[2] https://pypi.python.org/simple/

Related T422
2018-08-01 10:25:21 +02:00
Antoine R. Dumont (@ardumont)
8b2ee221ac
core.lister_base: Batch create origins (storage) & tasks (scheduler) 2018-07-27 17:24:49 +02:00
Antoine R. Dumont (@ardumont)
b272a36237
core.lister_base: Refactor task creation 2018-07-27 17:24:49 +02:00
Antoine R. Dumont (@ardumont)
3e3b441646
core.lister_transports: Do not use bare except 2018-07-27 17:24:47 +02:00
Antoine R. Dumont (@ardumont)
b6b588dbbb
lister.cli: Insert optional flag to permit post insert data
That's needed for example for having the minimum necessary to make the
debian lister run.
2018-07-27 11:13:13 +02:00
Antoine R. Dumont (@ardumont)
2c69b586bc
Revert "lister.cli: Insert optional flag to permit post insert data"
This reverts commit e61512afc7.

This commit contains one section not supposed to be there yet (pypi is
not ready)
2018-07-27 11:12:01 +02:00
Antoine R. Dumont (@ardumont)
e61512afc7
lister.cli: Insert optional flag to permit post insert data
That's needed for example for having the minimum necessary to make the
debian lister run.
2018-07-27 11:06:22 +02:00
Antoine R. Dumont (@ardumont)
726d45b182
swh.lister.cli: Factorize supported listers 2018-07-27 10:21:38 +02:00
Antoine R. Dumont (@ardumont)
6c54b64a8f
swh.lister.cli: Add debian lister to the list of supported listers 2018-07-27 10:19:48 +02:00
Antoine R. Dumont (@ardumont)
364786a2da
lister/gitlab: Allow to define the per page elements to read 2018-07-20 13:41:25 +02:00
Antoine R. Dumont (@ardumont)
ff3afe391c
lister/core: Fix missing use case about no response from api server
UnboundLocalError could happen otherwise
2018-07-20 13:33:36 +02:00
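The failure mode named in the commit above can be illustrated with a minimal sketch. It is not the lister's actual code: `fetch_with_retries`, `send_request`, and the two stand-in servers are hypothetical names, and the retry loop is an assumed shape for the request logic. The point is only that a name assigned solely inside a `try` block stays unbound when every attempt fails, so it should be bound before the loop.

```python
from typing import Callable, Optional


def fetch_with_retries(send_request: Callable[[], dict], attempts: int = 3) -> Optional[dict]:
    """Return the first successful response, or None if the server never answers."""
    # Bind the name before the loop. Without this line, a run in which every
    # attempt fails would reach `return response` with the name still unbound
    # and raise UnboundLocalError.
    response = None
    for _ in range(attempts):
        try:
            response = send_request()
            break
        except ConnectionError:
            continue  # no answer from the server, try again
    return response


# Hypothetical stand-ins for an API server that never answers and one that does.
def unreachable_server() -> dict:
    raise ConnectionError("no response from api server")


def healthy_server() -> dict:
    return {"status": 200, "body": []}


no_answer = fetch_with_retries(unreachable_server)
answer = fetch_with_retries(healthy_server)
```

Callers can then treat `None` as "no response" explicitly instead of crashing on an unbound local.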
Antoine R. Dumont (@ardumont)
bbef4bdeae
swh.lister.gitlab.tasks: Use gitlab as instance name for gitlab.com 2018-07-19 11:28:51 +02:00
Antoine R. Dumont (@ardumont)
30e14677e7
core/lister_base: Remove unused import 2018-07-18 16:05:16 +02:00
Antoine R. Dumont (@ardumont)
91f05745ac
core/lister: Make the listers' scheduler configuration adaptable
Related T1138
2018-07-18 12:19:21 +02:00
Antoine R. Dumont (@ardumont)
63cff5b337
lister.cli: Fix broken imports 2018-07-17 15:48:48 +02:00
Antoine R. Dumont (@ardumont)
d88f1b60c9
core/lister: Make the tasks take an explicit lister_args argument
Avoid eating *all* arbitrary arguments and passing them along to the
new_lister method.
2018-07-17 15:48:48 +02:00
Antoine R. Dumont (@ardumont)
d08ab241f5
gitlab/lister: Remove unused import 2018-07-17 11:45:07 +02:00
Antoine R. Dumont (@ardumont)
0292bd8cd4
core/lister: Rename module paging_lister to page_by_page_lister 2018-07-17 11:40:21 +02:00
Antoine R. Dumont (@ardumont)
5003a0475b
gitlab.lister: Fix buggy call to self._get_int
Was missing the dict to read from
2018-07-17 11:37:53 +02:00
Antoine R. Dumont (@ardumont)
e24cf1d4f1
gitlab.lister: Simplify retrieving headers information
Response headers' keys are case-insensitive and requests does the
aggregation magic [1].

[1] http://docs.python-requests.org/en/master/user/quickstart/#response-headers
2018-07-16 13:57:34 +02:00
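As a rough illustration of the case-insensitive lookup the commit above relies on, here is a small sketch. It uses the standard library's `email.message.Message` as a stand-in for the header mapping that requests exposes, so that it runs without third-party packages. `get_int` is a hypothetical helper (loosely echoing the `_get_int` mentioned in a later commit), and the header values are made up.

```python
from email.message import Message
from typing import Optional


def get_int(headers, key: str) -> Optional[int]:
    """Read an integer header, returning None when it is missing or empty."""
    value = headers.get(key)
    if value is None or value.strip() == "":
        return None
    return int(value)


# Stand-in for a response's headers: lookups ignore the case of the key.
headers = Message()
headers["X-Total-Pages"] = "12"

# Same header, different spellings of the key.
total_pages = get_int(headers, "x-total-pages")
same_total = get_int(headers, "X-TOTAL-PAGES")

# A header the server did not send, as can happen on the last page.
next_page = get_int(headers, "X-Next-Page")
```

With a mapping like this, the calling code does not need to normalise header names itself before reading pagination values.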
Antoine R. Dumont (@ardumont)
ec6968e31a
swh.lister.core.paging_lister: Fix page initialization 2018-07-12 14:34:25 +02:00
Antoine R. Dumont (@ardumont)
81fd5f9c5d
swh.lister.gitlab.tasks: Fix range computations 2018-07-12 14:23:14 +02:00
Antoine R. Dumont (@ardumont)
a69e576c85
swh.lister.gitlab: Fix the total pages reading instruction 2018-07-12 13:52:27 +02:00
Antoine R. Dumont (@ardumont)
cd98af7705
swh.lister.gitlab: Change uid format 2018-07-12 13:46:03 +02:00
Antoine R. Dumont (@ardumont)
4db15aaf16
swh.lister.gitlab: Remove indexable column from gitlab lister 2018-07-12 13:41:47 +02:00
Antoine R. Dumont (@ardumont)
2648f1ae2e
swh.lister.gitlab: Read next page from headers 2018-07-12 12:23:46 +02:00
Antoine R. Dumont (@ardumont)
d640fdcc96
swh.lister.gitlab.tests: Separate properly tests per lister 2018-07-12 12:23:46 +02:00
Antoine R. Dumont (@ardumont)
b9544c77f4
swh.lister.gitlab: Add the presence check for the incremental lister 2018-07-11 19:15:54 +02:00
Antoine R. Dumont (@ardumont)
d520891547
swh.lister.core.paging_lister: Adding comments 2018-07-11 18:29:38 +02:00
Antoine R. Dumont (@ardumont)
74d8375261
swh.lister.gitlab.tasks: Remove spurious comma 2018-07-11 18:10:35 +02:00
Antoine R. Dumont (@ardumont)
13bb7aca58
swh.lister.gitlab: Improve headers extraction 2018-07-11 18:09:18 +02:00
Antoine R. Dumont (@ardumont)
847a8d341a
swh.lister.gitlab: Add Incremental lister behavior
Related T989
2018-07-11 17:43:41 +02:00
Antoine R. Dumont (@ardumont)
ccd0525c9b
swh.lister: Do not hardcode the index notion into parameter names 2018-07-11 17:43:41 +02:00
Antoine R. Dumont (@ardumont)
b6c5865ab1
swh.lister.paging_lister: Improve lister's base class name
Also drop the SWH prefix as this is redundant.
2018-07-11 17:43:41 +02:00
Antoine R. Dumont (@ardumont)
4c4aa0ead2
swh.lister: Make LISTER_NAME a class attribute
swh.lister.gitlab: make the 'instance' a constructor parameter
2018-07-11 17:43:41 +02:00
Antoine R. Dumont (@ardumont)
581028cfc5
swh.lister.cli: Fix cli docstring 2018-07-11 15:56:33 +02:00
Antoine R. Dumont (@ardumont)
a51c36194e
swh.lister.gitlab: Add full gitlab lister
Related T989
2018-07-11 15:56:32 +02:00
Antoine R. Dumont (@ardumont)
7954e03627
swh.lister: Document swh.lister.tasks's intent
And remove unneeded indexing name from the RangeListerTask
2018-07-11 15:56:32 +02:00
Antoine R. Dumont (@ardumont)
ba146376d6
swh.lister: Add tests around the gitlab lister
Related T989
2018-07-11 15:56:32 +02:00
Antoine R. Dumont (@ardumont)
e1a460caa5
swh.lister.gitlab: Improve docstring 2018-07-11 15:56:32 +02:00
Antoine R. Dumont (@ardumont)
3ca566776f
swh.lister.gitlab: Make rate limit check optional
Samples:
- https://0xacab.org/api/v4/projects/
- https://framagit.org/api/v4/projects/
- https://salsa.debian.org/api/v4/projects/
- https://gitlab.com/api/v4/projects/
- https://gitlab.freedesktop.org/api/v4/projects/
- https://gitlab.gnome.org/api/v4/projects/
- https://gitlab.inria.fr/api/v4/projects/

Related T989
2018-07-11 11:26:19 +02:00
Antoine R. Dumont (@ardumont)
79cd00737f
swh.lister.gitlab: Remove TODO about the 403 response code
Multiple issues wish for the api to converge on 429 but nothing is
clear nor documented yet:
- https://gitlab.com/gitlab-com/infrastructure/issues/348
- https://gitlab.com/gitlab-org/gitlab-ce/issues/41309
- https://gitlab.com/gitlab-org/gitlab-ce/issues/46522

The only response code mentioned in the documentation is
403 (https://docs.gitlab.com/ee/api/README.html#status-codes).
2018-07-11 11:26:19 +02:00
Antoine R. Dumont (@ardumont)
935b9cd24f
swh.lister.core: Make gitlab lister a paging lister instance
Related T989
2018-07-11 11:26:19 +02:00
Antoine R. Dumont (@ardumont)
db36c499fe
swh.lister.gitlab: Do not store information we cannot have 2018-07-11 11:26:18 +02:00