dds_cli's People

Contributors

aanil, alneberg, dependabot[bot], erikdanielsson, ewels, github-actions[bot], i-oden, matthiaszepper, rv0lt, senthil10, snyk-bot, talavis, valyo, worukan

dds_cli's Issues

Add flags and extra options to config file

If we're going to keep a "config" file, i.e. the one that the username etc. are currently saved to (.dds-cli.json), there was a suggestion to allow the flags (e.g. the destination) and --source to be entered in it as well. This reduces long lines in the terminal and makes it easier for the user to check and change the options in some cases.
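A minimal sketch of how config-file values and command-line options could be combined, assuming the file is JSON and that options left out on the command line arrive as None. The function names and the keys used here are illustrative, not taken from the dds_cli code.

```python
import json
from pathlib import Path


def load_config(path):
    """Return the options stored in a JSON config file, or {} if it is missing."""
    path = Path(path)
    if not path.is_file():
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def merge_options(config, cli_options):
    """Combine config-file values with command-line values.

    Anything given on the command line (i.e. not None) overrides the file.
    """
    merged = dict(config)
    merged.update({key: value for key, value in cli_options.items() if value is not None})
    return merged
```

With this ordering the file supplies defaults and the terminal always wins, so a user can still override a saved destination for a single run.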

Listing folder globs gives SQL error

Testing the dds ls command line tool I managed to get this:

$ dds ls fac002 */_old

 ︵ (  )   ︵
(  ) ) (  (  )   SciLifeLab Data Delivery System
 ︶  (  ) ) (    https://www.scilifelab.se/data
      ︶ (  )    Version 0.2.0


INFO     Listing files for project 'fac002'                                                                                                                  data_lister.py:124
INFO     Showing files in folder '*/_old'                                                                                                                    data_lister.py:126
ERROR    Failed to get list of files: (pymysql.err.OperationalError) (1139, "Regex error 'quantifier does not follow a repeatable item at offset 1'")           __main__.py:265
         [SQL: SELECT DISTINCT files.subpath AS files_subpath
         FROM files
         WHERE files.project_id = binary(%(binary_1)s) AND (files.subpath regexp %(subpath_1)s)]
         +)+$'}]
         (Background on this error at: http://sqlalche.me/e/14/e3q8)

I haven't dug into the code to figure out what this really means yet, but getting a SQL error is slightly scary (I'm assuming that little Bobby Tables will be safe here..? 👀 ).

Anyway, would be nice to enable file path globbing when listing files if possible, or catching this type of error if not (and just returning a files-not-found response).
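One possible way to support globbing without handing user input to the regex engine is to translate the glob into an SQL LIKE pattern and pass it as a bound parameter. The sketch below is an assumption about how that could look, not how the server currently builds the query; glob_to_like is a hypothetical helper. Note that % also matches "/", so "*" here spans folder levels, unlike in a shell.

```python
def glob_to_like(pattern, escape="\\"):
    """Translate a shell-style glob into an SQL LIKE pattern.

    '*' becomes '%', '?' becomes '_', and characters that are special
    in LIKE ('%', '_' and the escape character) are escaped.
    """
    out = []
    for char in pattern:
        if char == "*":
            out.append("%")
        elif char == "?":
            out.append("_")
        elif char in ("%", "_", escape):
            out.append(escape + char)
        else:
            out.append(char)
    return "".join(out)
```

The result would be used as the value of a bound parameter in a `subpath LIKE ...` condition, so a pattern that matches nothing simply gives an empty result instead of a database error.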

`ls` sort option

Add an option to sort the listed projects by any of the fields?
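A small sketch of what such a sort could do, assuming the projects are available as a list of dictionaries. The field names in the usage example are made up for illustration.

```python
def sort_projects(projects, sort_by, reverse=False):
    """Return the projects sorted by the given field.

    Raises KeyError if any project lacks the field, so a mistyped
    field name is reported instead of silently ignored.
    """
    if any(sort_by not in project for project in projects):
        raise KeyError(f"Cannot sort by unknown field '{sort_by}'")
    return sorted(projects, key=lambda project: project[sort_by], reverse=reverse)
```

For example, `sort_projects(projects, "size", reverse=True)` would list the largest projects first.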

List whole directory structure

At the moment the CLI command dds ls [projectID] lists only the root-level folders and files. Add the possibility of listing the entire directory structure, using the pagination functionality.
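As a rough illustration, a flat list of file paths can be turned into a nested structure and rendered as indented lines, which could then be handed to whatever pagination is already in place. Both helpers below are hypothetical and assume "/"-separated paths.

```python
def build_tree(paths):
    """Build a nested dict from '/'-separated paths (files map to empty dicts)."""
    tree = {}
    for path in paths:
        node = tree
        for part in (p for p in path.split("/") if p):
            node = node.setdefault(part, {})
    return tree


def render_tree(tree, indent=0):
    """Return one indented line per entry, sorted by name within each level."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * indent + name)
        lines.extend(render_tree(tree[name], indent + 1))
    return lines
```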

Stream processing and upload

User story: As a user, I want to upload my protected data, without first having to use space on my local computer.

Connected to #121 as well.

Change delivery report format

Currently the delivery report is saved to a json file, with important information if something should fail. Change format from json perhaps? Can be discussed at some point.

Log version

It's a good idea to always log the version of the tool that's running. Also to have a --version flag that does only this.

As an added bonus, you can also attempt to fetch the git hash if the script is in a repository and print that (I do that for MultiQC - it's pretty useful when multiple people are working with a dev version).
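A sketch of the git hash lookup, assuming git is on the PATH and the directory to check is known. It falls back to the plain version whenever the lookup fails for any reason. The function names are illustrative and this is not the MultiQC implementation.

```python
import subprocess


def git_short_hash(repo_dir):
    """Return the short commit hash for repo_dir, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
            timeout=5,
        )
    except (OSError, subprocess.SubprocessError):
        # git missing, directory missing, not a repository, or timed out
        return None
    return result.stdout.strip() or None


def version_string(version, repo_dir=None):
    """Return e.g. '0.2.0 (abc1234)' inside a repository, else just '0.2.0'."""
    git_hash = git_short_hash(repo_dir) if repo_dir else None
    return f"{version} ({git_hash})" if git_hash else version
```

The returned string could be logged once at startup and printed by the --version flag.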

Weird `--break-on-fail` bug

When using --break-on-fail and an error occurs, the cursor disappears in the terminal. Need to use top to get the cursor back, or restart the terminal. Why?

Continue failed upload

User story: As a user, I want to be able to continue a failed upload if anything goes wrong.

Log errors to file straight away

At the moment the CLI waits until the end of the upload/download to add error information to the error log file. The errors and important information should be saved to the file straight away, when the error occurs.
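One way to do this is to append each failure to the file as a single JSON line at the moment it happens, so the information survives even if the process dies later. The sketch below assumes a line-per-entry format and made-up field names; the actual error log format is not specified here.

```python
import json
import os
from datetime import datetime, timezone


def record_failure(log_path, file_name, message):
    """Append one failure entry to the log file and force it to disk."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "file": file_name,
        "error": message,
    }
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
        handle.flush()
        os.fsync(handle.fileno())  # do not leave the entry in OS buffers
```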

Fix glitchy progress bars

Since I messed around with the logging code, the progress bar used for uploads and downloads now jumps whenever a log message is printed, instead of elegantly dropping down below each new message.

This is because the log handler and the progress bar are using different rich Console objects. It should be possible to fix by sharing the same Console between both. See Textualize/rich#1317 (comment) for an example.

I think that to do this, it maybe makes sense to create a Console object that can be returned from a utils module or something, for easy reuse between disparate parts of the codebase. But whatever makes sense really.
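A minimal sketch of the shared Console idea, assuming the rich package is installed. get_console, setup_logging and make_progress are hypothetical names for helpers that could live in a utils module. They are not existing dds_cli functions.

```python
import logging
from functools import lru_cache

from rich.console import Console
from rich.logging import RichHandler
from rich.progress import Progress


@lru_cache(maxsize=None)
def get_console():
    """Return the single Console shared by logging and progress bars."""
    return Console()


def setup_logging(level=logging.INFO):
    """Send log records through a RichHandler bound to the shared Console."""
    handler = RichHandler(console=get_console())
    logging.basicConfig(level=level, format="%(message)s", handlers=[handler], force=True)


def make_progress():
    """Create a progress bar that renders on the same Console as the log handler."""
    return Progress(console=get_console())
```

Because both the handler and the progress bar write through the same Console, rich can redraw the bar below each new log line instead of the two outputs interfering with each other.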

Make a decision on 2FA or MFA to offer to users

Two or multi factor authentication helps strengthen the security of the system by means of additional factors to identify users. They also help with the problem of strong passwords and maintaining/remembering them. We would like to improve both the security and the user experience in the system.

DDS_METHODS not used?

DDS_METHODS = ["ls"]

Just doing a bit of code review to get into the code. I saw this was added recently in 1d99e6f, but it doesn't seem to be used anywhere? I'm not sure it's needed either, I guess click should already keep track of which commands are allowed? But maybe I'm wrong?

Python best practices - system exits & logging

System exits

Doing a hard system exit is an absolute last resort. Normally you want to be raising an exception instead, then probably having a single location right at the top of the code to catch this and do the system exit with non-zero code.

This is especially important if other code packages are importing and using the functions. A system exit raises SystemExit, which an ordinary except Exception handler does not catch, so for example if used in the dds_web code these can crash the server. Exceptions can be caught and handled differently depending on where the function is being used.

In most cases here I think it will make sense to create your own exception types. You can then pass the error message in the raise statement and log it before exiting downstream (eg. as done here).

I use this pattern a lot in nf-core and typically do the exit call in the command-line handling code. eg. here. So then if you have 5 subcommands you'll probably have a maximum of 5 exit calls. You can be pretty sure that no-one else will be importing and reusing your cli handling code.

In some cases you have an exit code 0 because it's something like just not having any files to show etc - in other words, normal behaviour. Here you should probably just use return to drop out of the function execution early without an exception.
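The pattern described above can be sketched as follows. The exception classes, list_files and the hard-coded project data are all made up for illustration and are not the dds_cli or nf-core code.

```python
import sys


class DDSCLIError(Exception):
    """Hypothetical base class for errors raised by library code."""


class ProjectNotFoundError(DDSCLIError):
    """Raised when the requested project does not exist."""


def list_files(project, files_by_project):
    """Library function: raises on real errors and never exits the process."""
    if project not in files_by_project:
        raise ProjectNotFoundError(f"Project '{project}' not found")
    files = files_by_project[project]
    if not files:
        return []  # nothing to show is normal behaviour: return early, no exception
    return sorted(files)


def main(argv=None):
    """Command-line handling: the only place that turns errors into exit codes."""
    argv = sys.argv[1:] if argv is None else argv
    projects = {"fac002": ["b.txt", "a.txt"], "empty": []}  # stand-in data
    if not argv:
        print("usage: ls PROJECT", file=sys.stderr)
        return 2
    try:
        for name in list_files(argv[0], projects):
            print(name)
    except DDSCLIError as err:
        print(f"ERROR: {err}", file=sys.stderr)
        return 1
    return 0

# The script entry point would then call: sys.exit(main())
```

Code that imports list_files, such as a web server, can catch DDSCLIError itself and respond however it likes, because nothing below main ever exits the process.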

Logging

You almost never want to print to the console using a rich Console.

One reason is that this prints to standard out - but most of the current usage is log / status messages. Normally, these should be going to standard error and only the real "results" (eg. the list of projects etc) should go to standard out. This means that command line users can split the output types for downstream use.

Much like the system exits / exceptions, the console calls are not useful for other tools importing the functions. It's better to use the logging library instead - then the log messages can be assigned to a namespace and their output customised by any tool using the function (eg. only showing errors or being channeled to a web server log).

Rich has a logging handler so you can keep the command line outputs looking identical. Here's how I use it in nf-core. My implementation is slightly more complex, as I give the option of also logging to a file, without rich. I also enable highlighting / rich syntax and have a function that basically makes the colours show up in GitHub Actions CI tests.
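To illustrate the split between status messages and results with only the standard library, here is a small sketch. The logger namespace "dds_cli" and the function names are assumptions. In the real tool the plain StreamHandler could be swapped for rich's logging handler to keep the current look.

```python
import logging
import sys

# Namespaced logger: tools importing this module can configure or silence it
log = logging.getLogger("dds_cli.data_lister")


def configure_cli_logging(level=logging.INFO, stream=None):
    """Attach a handler to the package logger; status messages go to stderr."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)-8s %(message)s"))
    package_logger = logging.getLogger("dds_cli")
    package_logger.handlers[:] = [handler]
    package_logger.setLevel(level)
    return package_logger


def list_projects(projects, out=None):
    """Log a status message, then write the actual results to stdout."""
    out = sys.stdout if out is None else out
    log.info("Listing %d projects", len(projects))
    for project in projects:
        print(project, file=out)
```

With this split, redirecting standard out to a file captures only the project list, while the INFO line still appears in the terminal.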

Create parsable log

Some analytics may be wanted from the log files. For this feature we first need to decide on what info should be saved to those files.

File link handling

How to resolve and not resolve file links within and outside of specified folders. Includes saving link information to database and creating links on download.

Download files to already existing directory

Currently need to specify a new directory name when downloading. Ability to append to an already existing directory should be added. Make sure only the recently downloaded files can be deleted though.
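A sketch of how the "only delete what was just downloaded" requirement could be met: keep a record of every file the current run creates and limit cleanup to that record. DownloadSession is a hypothetical class, and refusing to overwrite existing files is an assumption made for this example.

```python
from pathlib import Path


class DownloadSession:
    """Track files written during one download so cleanup never touches older ones."""

    def __init__(self, destination):
        self.destination = Path(destination)
        self.destination.mkdir(parents=True, exist_ok=True)  # existing directory is fine
        self.created = []

    def save(self, relative_path, data):
        """Write a new file; refuse to overwrite anything already present."""
        target = self.destination / relative_path
        if target.exists():
            raise FileExistsError(f"Refusing to overwrite {target}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        self.created.append(target)
        return target

    def rollback(self):
        """Delete only the files created by this session."""
        for path in reversed(self.created):
            path.unlink(missing_ok=True)
        self.created.clear()
```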

Add upload --destination option

Add the --destination option to dds put so that the end user can specify which remote existing or new folder the items should be placed in during upload.

`rm` throws error about non-existing `warn_if_many` if file does not exist

Tried to remove a file which did not exist and got:

data_lister.DataLister.warn_if_many(count=len(not_exists) + len(delete_failed))
AttributeError: type object 'DataLister' has no attribute 'warn_if_many'

The rm command previously listed the files which were not successfully removed within the system, but warned if there were too many. Pagination should be added here in the same way as in data_lister.py.

Change project size update

@inaod568 commented on Mon Jun 14 2021

At first - project size updated after each uploaded file. This produced deadlock issues when uploading a lot of small files since it tried to update the same project table field at the same time in multiple requests.

Now - Updates the project size at the end of the upload. This means that if an error occurs during the upload, the project size is not updated.

Fix: Either add a queue (for example) to the API and update the db after each file, or add the project size update to the cleanup after failed upload.
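The second option could look roughly like this: accumulate the uploaded bytes locally and send a single size update from a finally block, so it runs after both successful and failed uploads. upload_one and update_project_size are hypothetical callables standing in for the real upload and API calls.

```python
def upload_all(files, upload_one, update_project_size):
    """Upload (name, size) pairs and report the total uploaded size once.

    The update runs in `finally`, so files uploaded before a failure
    are still counted, and only one size update is made per run.
    """
    uploaded_bytes = 0
    try:
        for name, size in files:
            upload_one(name)
            uploaded_bytes += size
    finally:
        if uploaded_bytes:
            update_project_size(uploaded_bytes)
    return uploaded_bytes
```

Since the update happens once per run rather than once per file, it avoids the concurrent writes to the same project field that caused the deadlocks.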

Create project command

As a unit admin and personnel, I want to create a project via the CLI.

This involves the CLI and the endpoints.

Check username before password prompt

When specifying the wrong username and no password, the CLI still asks for the password. Suggestion to check if the username is correct before prompting.
