47 commits

Author SHA1 Message Date
Valentin Lorentz
596e8c6c40 Fix crash of 'swh lister run' when called without -l
```
$ swh lister run
Traceback (most recent call last):
  File "/home/dev/.local/bin/swh", line 33, in <module>
    sys.exit(load_entry_point('swh.core', 'console_scripts', 'swh')())
  File "/home/dev/swh-environment/swh-core/swh/core/cli/__init__.py", line 144, in main
    return swh(auto_envvar_prefix="SWH")
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/dev/.local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/dev/swh-environment/swh-lister/swh/lister/cli.py", line 68, in run
    get_lister(lister, **config).run()
  File "/home/dev/swh-environment/swh-lister/swh/lister/__init__.py", line 75, in get_lister
    raise ValueError(
ValueError: Invalid lister None: only supported listers are ['arch', 'aur', 'bitbucket', 'bower', 'cgit', 'conda', 'cpan', 'cran', 'crates', 'debian', 'fedora', 'gitea', 'github', 'gitlab', 'gnu', 'gogs', 'golang', 'hackage', 'hex', 'launchpad', 'maven', 'nixguix', 'npm', 'nuget', 'opam', 'packagist', 'phabricator', 'pubdev', 'puppet', 'pypi', 'rubygems', 'sourceforge', 'tuleap']
```
2023-05-10 10:19:26 +02:00
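The traceback above shows `None` reaching `get_lister()` when no lister name is given. A minimal sketch of one way to guard against that follows. It assumes nothing about the actual fix in 596e8c6c40: `SUPPORTED_LISTERS`, `UsageError` and the body of `run()` are hypothetical stand-ins, not swh.lister code.

```python
# Hypothetical names throughout; this is not the swh.lister implementation.
SUPPORTED_LISTERS = {"cgit": object, "npm": object}


class UsageError(Exception):
    """Raised for bad command-line input, so no traceback needs to be shown."""


def get_lister(name, **config):
    # Same behaviour as seen in the traceback: unknown names raise ValueError.
    if name not in SUPPORTED_LISTERS:
        raise ValueError(
            f"Invalid lister {name}: only supported listers are "
            f"{sorted(SUPPORTED_LISTERS)}"
        )
    return SUPPORTED_LISTERS[name](**config)


def run(lister=None, **config):
    # Check for the missing option before handing None to get_lister().
    if lister is None:
        raise UsageError("Missing lister name: pass it with -l")
    return get_lister(lister, **config)
```

With this guard, calling `run()` without a name fails early with a short usage message instead of an "Invalid lister None" error.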
Antoine Lambert
d38e05cff7 python: Reformat code with black 22.3.0
Related to T3922
2022-04-08 15:15:09 +02:00
Antoine Lambert
8933544521 Remove no longer used legacy Lister API and update CLI options
Legacy Lister classes from the swh.lister.core module are no longer
used in the swh-lister codebase, so it is time to remove them.

Also remove lister CLI options related to legacy Lister API.

As a consequence, the following requirements are no longer needed:
arrow, SQLAlchemy, sqlalchemy-stubs and testing.postgresql.

Closes T2442
2021-02-02 15:54:55 +01:00
Antoine R. Dumont (@ardumont)
17b0e7af26
cli: Make cli work with new lister
while allowing legacy lister to still run (with --legacy)
2021-01-28 09:12:56 +01:00
Nicolas Dandrimont
5e4bb28398 Run isort after the CLI import changes 2020-09-25 14:19:21 +02:00
David Douard
d10f78d80c Adapt cli declaration entrypoint to swh.core 0.3 2020-09-23 17:42:00 +02:00
Antoine Lambert
22f7181294 python: Reorder imports with isort
Related to T2610
2020-09-17 17:48:27 +02:00
David Douard
a00f151462 cli: speedup the swh cli command startup time
by moving import statements in functions.

Related to T2575.
2020-09-10 17:15:38 +02:00
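The idea behind a00f151462 (moving import statements into functions) can be illustrated with a small sketch. The function name `render` and the choice of `json` as the deferred module are assumptions made for the example only.

```python
def render(data):
    # Importing here, instead of at module level, means the cost of loading
    # the module is only paid when this command actually runs, not every
    # time the CLI starts up.
    import json

    return json.dumps(data, sort_keys=True)
```

The trade-off is that import errors surface when the command runs rather than at startup.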
David Douard
93a4d8b784 Enable black
- blackify all the python files,
- enable black in pre-commit,
- add a black tox environment.
2020-04-08 16:31:22 +02:00
Antoine R. Dumont (@ardumont)
484377cc13
lister.cli: Remove task type register cli
It's now defined in swh.scheduler
2019-11-18 10:41:46 +01:00
Valentin Lorentz
8550c5607f Register lister tasks even if they do not derive from SWHTask.
This happens if swh.scheduler.celery_backend.config isn't imported
before the CLI runs.
2019-11-12 16:06:23 +01:00
Antoine R. Dumont (@ardumont)
eebbc859fc
lister.cli: Clarify configuration loading step 2019-11-08 10:50:51 +01:00
Antoine R. Dumont (@ardumont)
394658e53b
cgit.tests: Check the tasks from the scheduler 2019-10-09 17:57:57 +02:00
David Douard
b810876ef8 tasks: normalize the url argument name of most lister
Since all the listing tasks accept a URL as first argument (whatever the
argument name is), it makes sense to use a simple common argument name for
it. I've chosen 'url' instead of api_baseurl/forge_url/url.

Also kill the now-useless `new_lister()` functions.
2019-09-04 15:38:01 +02:00
David Douard
8d9deeb8f8 plugins: add support for scheduler's task-type declaration
Add a new register-task-types cli that will create missing task-type entries in the
scheduler according to:

- only create missing task-types (do not update them), but check that the
  backend_name field is consistent,
- each SWHTask-based task declared in a module listed in the 'task_modules'
  plugin registry field will be checked and added if needed; tasks whose names
  start with an underscore will not be added,
- added task-type will have:
  - the 'type' field is derived from the task's function name (with underscores
    replaced with dashes),
  - the description field is the first line of that function's docstring,
  - default values as provided by the swh.lister.cli.DEFAULT_TASK_TYPE (with
    a simple pattern matching to have decent default values for full/incremental
    tasks),
  - these default values can be overloaded via the 'task_type' plugin registry
    entry.

For this, we had to rename all task names (e.g. `cran_lister` -> `list_cran`).

Comes with some tests.
2019-09-04 15:36:08 +02:00
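The derivation rules listed in 8d9deeb8f8 can be sketched as follows. This is an illustration under assumptions: the contents of `DEFAULT_TASK_TYPE` and the helper name `task_type_for` are hypothetical, and the substring match is only one possible reading of "simple pattern matching".

```python
# Assumed default values; the real ones live in swh.lister.cli.DEFAULT_TASK_TYPE.
DEFAULT_TASK_TYPE = {
    "full": {"default_interval": "90 days"},
    "*": {"default_interval": "1 day"},
}


def task_type_for(func, overrides=None):
    """Build a task-type entry for a task function, or None if it is private."""
    name = func.__name__
    if name.startswith("_"):
        # Tasks whose names start with an underscore are not registered.
        return None
    # The type is the function name with underscores replaced by dashes.
    type_name = name.replace("_", "-")
    # The description is the first line of the docstring.
    doc = (func.__doc__ or "").strip()
    description = doc.splitlines()[0] if doc else ""
    # Simple pattern matching on the type name to pick the defaults.
    key = next(
        (k for k in DEFAULT_TASK_TYPE if k != "*" and k in type_name), "*"
    )
    entry = {"type": type_name, "description": description}
    entry.update(DEFAULT_TASK_TYPE[key])
    # Plugin-provided values take precedence over the defaults.
    entry.update(overrides or {})
    return entry
```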
David Douard
e3c0ea9d90 implement listers as plugins
Listers are declared as plugins via the `swh.workers` entry_point.

As such, the registry function is expected to return a dict with the
`task_modules` field (as for generic worker plugins), plus:

- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state
  (typically, create/initialize the database for this lister).
  If not set, the default implementation creates database tables (after
  optionally having deleted existing ones) according to models declared in
  the `models` register field.

There is no need to explicitly add lister task modules in the main
`conftest` module, but any new/extra lister to be tested must be registered
(the tested lister module must be properly installed in the test environment).

Also refactor a bit the cli tools:
- add support for the standard --config-file option at the 'lister' group
  level,
- move the --db-url to the 'lister' group,
- drop the --lister option for the `swh lister db-init` cli tool:
  initializing (especially with --drop-tables) the database for a single
  lister is unreliable, since all tables are created using a single MetaData
  (in the same namespace).
2019-09-03 15:02:24 +02:00
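The registry dict described in e3c0ea9d90 might look like the sketch below. All class and module names are hypothetical, the `init` hook signature is an assumption, and the default initialization only records the actions it would take instead of calling SQLAlchemy.

```python
class ExampleModel:
    """Stand-in for a SQLAlchemy model (hypothetical)."""

    __tablename__ = "example_repo"


class ExampleLister:
    """Stand-in for a lister class (hypothetical)."""

    LISTER_NAME = "example"


def register():
    """Registry function, as would be referenced from a `swh.workers` entry point."""
    return {
        "lister": ExampleLister,
        "models": [ExampleModel],
        "task_modules": ["example_pkg.tasks"],  # hypothetical module path
        # "init" is optional and deliberately left out here.
    }


def default_init(models, drop_tables=False):
    """Record what a default initialization would do, without touching a database."""
    actions = []
    for model in models:
        if drop_tables:
            actions.append(f"drop {model.__tablename__}")
        actions.append(f"create {model.__tablename__}")
    return actions


def init_lister(entry, drop_tables=False):
    """Use the plugin's own `init` hook if present, else fall back to the default."""
    init = entry.get("init")
    if init is not None:
        return init(drop_tables=drop_tables)  # hook signature is an assumption
    return default_init(entry["models"], drop_tables=drop_tables)
```

Calling `init_lister(register(), drop_tables=True)` returns the drop and create actions for the declared models, in that order.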
David Douard
c67a926f26 npm: make NpmVisitModel use the main declarative base class from core.models
This is needed by the (refactored) db init mechanism, since the latter uses
the main declarative base class (thus the main MetaData instance) to gather
tables to be created/dropped.
2019-09-03 15:02:24 +02:00
David Douard
3816b4d3bf cgit: rewrite the CGit lister
Simplify the code:
- only inherit from ListerBase
- implement HTTP queries directly using requests
- get rid of convoluted code

Gather the origin_url from the git repo's "project" page instead of
building it from the 'url_prefix' hack. Now, the lister WILL make substantially
more requests, since it will make one request per listed git repo, but
the provided origin_url should be pretty reliable now.

When several clonable URLs are provided, choose the http/https one first;
otherwise, choose the first one in the list.

Add proper tests for the cgit lister.

Also, get rid of the 'time_updated' column in the model.
2019-09-02 12:29:31 +02:00
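The clone URL selection rule from 3816b4d3bf (prefer http/https, otherwise take the first URL) can be sketched as below. The function name `pick_origin_url` and the handling of an empty list are assumptions, not taken from the cgit lister itself.

```python
def pick_origin_url(clone_urls):
    """Return the preferred clone URL, or None when the list is empty."""
    if not clone_urls:
        return None  # assumption: the commit message does not cover this case
    for url in clone_urls:
        # Prefer the first http or https URL, in listing order.
        if url.startswith(("http://", "https://")):
            return url
    # No http/https URL available: fall back to the first one listed.
    return clone_urls[0]
```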
Antoine R. Dumont (@ardumont)
4b2ab0488a
cli: Unify new_lister method name to get_lister 2019-08-28 16:29:26 +02:00
Antoine R. Dumont (@ardumont)
e0664c10cd
lister.cli: Allow to list forges with policy and priority
Example use case:

swh lister run --lister gitlab \
               --priority high \
               --policy oneshot \
               --db-url postgresql://postgres@localhost:5432/swh-listers \
               api_baseurl=https://gitlab.ow2.org/api/v4/

Related T1919
2019-08-28 16:29:26 +02:00
Archit Agrawal
5727f15cf3 swh.lister.packagist
Implement a packagist lister to list the
names and metadata url of all the
packages.

Closes 1776
2019-07-19 19:59:30 +05:30
Archit Agrawal
0bf24469b7 swh.lister.cgit: Remove repo page visit step
Remove the need to visit every page and extract the
origin url by introducing a parameter url_prefix.
The origin url is in the format <prefix>/<repo_name>, where
the prefix is the same for all the repos of a particular
cgit instance.
2019-06-28 20:02:07 +05:30
Archit Agrawal
b972a2a88d swh.lister.cgit
Implemented a lister to list the repos for a given CGit instance.

Closes T1659
2019-06-28 19:27:25 +05:30
Archit Agrawal
a9a37a85bf swh.lister.cran
Add a lister to list all the CRAN packages.
It uses the built-in API of the R language to list the packages
and get their metadata.

Closes T1709
2019-06-11 21:26:31 +05:30
Archit Agrawal
151f6cd223 swh.lister.gnu
Implement first pass of gnu lister to list all the
packages present in https://ftp.gnu.org/
Add GNU lister in README and cli.py

Closes T1722
2019-06-08 21:56:00 +05:30
David Douard
d6169c7141 cli: register the 'lister' cli subcommand
also add a cli group named 'lister' for the sake of consistency with
other swh packages and rename the command as 'db-init', like:

  swh lister db-init LISTER [...]
2019-05-22 13:36:57 +02:00
archit
fedfd73c8e swh.lister.phabricator
Add a lister of all hosted repositories on a Phabricator instance
Closes T808
2019-05-15 19:54:33 +05:30
Antoine Lambert
977d2459c3 Remove references to no more used lister_db_url conf entry 2019-05-13 15:18:56 +02:00
Antoine Lambert
8ffd8dadef cli: Fix initialization for all listers
Prior to this commit, initializing all listers was failing after the
debian lister processing because of global insert_minimum_data init

Related T1629
2019-04-10 11:33:55 +02:00
Antoine R. Dumont (@ardumont)
262c297a5e
lister.cli: Fix spelling typo 2019-02-14 10:47:28 +01:00
David Douard
0720c8e12e Kill (useless) --create-tables and --with-data cli command options 2019-02-06 15:38:11 +01:00
David Douard
4f2580a97c Make the --lister option of the cli tool a variadic argument
and add an 'all' possible value for it, so that one can initialize the
databases for all listers at once.
2019-02-06 10:19:26 +01:00
David Douard
72658ff720 cli: fix debian lister so it also uses config overrides 2018-12-20 15:01:18 +01:00
Antoine Lambert
ffe4ac9a3c swh.lister.npm: Add an incremental npm lister
This new lister makes it possible to get only new or updated npm packages
since the last listing operation.

Related T1378
Closes T1398
2018-12-03 17:58:27 +01:00
Antoine Lambert
605a67f51d swh.lister.npm : Add a lister of all available packages in the npm registry
Related T1378
Closes T1380
2018-11-26 11:04:13 +01:00
Antoine R. Dumont (@ardumont)
ed64d24634
pypi.lister: Normalize pypi name to PyPI
Related T422
2018-09-14 13:24:48 +02:00
Antoine R. Dumont (@ardumont)
6ff3b90859
swh.lister.pypi: Add a pypi lister implementation using xmlrpc api
Based solely on pypi's deprecated xmlrpc api [1].  No other way of listing
pypi.org is referenced (except for parsing an html page through a
legacy api [2])

[1] https://warehouse.readthedocs.io/api-reference/xml-rpc/#pypi-s-xml-rpc-methods

[2] https://pypi.python.org/simple/

Related T422
2018-08-01 10:25:21 +02:00
Antoine R. Dumont (@ardumont)
b6b588dbbb
lister.cli: Insert optional flag to permit post insert data
That's needed for example for having the minimum necessary to make the
debian lister run.
2018-07-27 11:13:13 +02:00
Antoine R. Dumont (@ardumont)
2c69b586bc
Revert "lister.cli: Insert optional flag to permit post insert data"
This reverts commit e61512afc7.

This commit contains one section not supposed to be there yet (pypi is
not ready)
2018-07-27 11:12:01 +02:00
Antoine R. Dumont (@ardumont)
e61512afc7
lister.cli: Insert optional flag to permit post insert data
That's needed for example for having the minimum necessary to make the
debian lister run.
2018-07-27 11:06:22 +02:00
Antoine R. Dumont (@ardumont)
726d45b182
swh.lister.cli: Factorize supported listers 2018-07-27 10:21:38 +02:00
Antoine R. Dumont (@ardumont)
6c54b64a8f
swh.lister.cli: Add debian lister to the list of supported listers 2018-07-27 10:19:48 +02:00
Antoine R. Dumont (@ardumont)
63cff5b337
lister.cli: Fix broken imports 2018-07-17 15:48:48 +02:00
Antoine R. Dumont (@ardumont)
4c4aa0ead2
swh.lister: Make LISTER_NAME a class attribute
swh.lister.gitlab: make the 'instance' a constructor parameter
2018-07-11 17:43:41 +02:00
Antoine R. Dumont (@ardumont)
581028cfc5
swh.lister.cli: Fix cli docstring 2018-07-11 15:56:33 +02:00
Antoine R. Dumont (@ardumont)
3e62bc867e
swh.lister.cli: Simplify cli 2018-07-11 09:45:51 +02:00
Antoine R. Dumont (@ardumont)
afcd6997c4
swh.lister.cli: Add a basic cli to deal with create/drop db actions 2018-07-03 15:49:52 +02:00