forward's Introduction

Rawr! 👋

forward's People

Contributors

akkornel, eigenstate, mckenziephagen, neutrinonerd3333, raphtown, salcc, sohams-mass, vsoch, zqfang

forward's Issues

add check for previous notebooks

It can get confusing if the user has already submitted a notebook (or other job) with the same job name. I'm going to add a check that advises the user to end or resume the session if one is already running.
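
A minimal sketch of what such a check could look like, assuming job names can be listed with squeue over the sherlock ssh alias. The function name advise_on_existing_job and its messages are hypothetical and not taken from the repository.

```shell
#!/bin/sh
# Hypothetical helper: decide what to tell the user based on a list of
# running job names read from stdin (one per line), for example from:
#   ssh sherlock squeue --user="$USER" --noheader --format=%j
# Argument: the job name that is about to be submitted.
advise_on_existing_job() {
    local name="$1"
    # -F: fixed string, -x: match the whole line, -q: quiet
    if grep -Fxq -- "$name"; then
        echo "Found existing job '$name'. End it with end.sh or reattach with resume.sh."
        return 1
    fi
    echo "No existing $name jobs found, continuing..."
    return 0
}
```

The caller would stop before submitting a new sbatch when the function returns non-zero.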

slow sherlock response leads to port forwarding failure

Over the past day or so, a couple of colleagues and I have had a difficult time connecting to Sherlock using forward. When the script gets to setup_port_forwarding, it fails with either this error:

mux_client_forward: forwarding request failed: Port forwarding failed
muxclient: master forward request failed

or this one:

Access denied by pam_slurm_adopt: you have no active jobs on this node
Authentication failed.

I believe this is because Sherlock seems to be connecting slowly, meaning that the port isn't ready when the script gets to that line. Adding sleep 30 to the start.sh script right before setup_port_forwarding seems to have fixed it.

Should this be a part of the main script, or could it be added to the Debugging part of the README? It's not always necessary, only when Sherlock is "acting up", and it adds time to the setup process, which isn't ideal when Sherlock isn't lagging.
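
One alternative to a fixed sleep 30 is to retry with a growing delay, so the wait is only paid when Sherlock is actually slow. The sketch below is an assumption about how that might be wired in, not the script's current behavior; retry_with_backoff and the MACHINE variable are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds, doubling the wait
# between attempts (1, 2, 4, ... seconds), for at most <max_attempts> tries.
# Usage: retry_with_backoff <max_attempts> <command> [args...]
retry_with_backoff() {
    local max_attempts="$1"
    shift
    local delay=1
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only wait if another attempt is still to come.
        if [ "$attempt" -lt "$max_attempts" ]; then
            echo "Attempt ${attempt} failed... retrying in ${delay}.." >&2
            sleep "$delay"
            delay=$(( delay * 2 ))
        fi
        attempt=$(( attempt + 1 ))
    done
    return 1
}

# Example (not run here): probe the compute node before forwarding, assuming
# MACHINE holds the allocated node name printed by start.sh.
# retry_with_backoff 5 ssh sherlock ssh "$MACHINE" true && setup_port_forwarding
```

When the node is ready immediately, the first attempt succeeds and no delay is added.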

Allow for hosts other than sherlock

The proposed addition of the hosts directory in #8 is making me think we could pretty straightforwardly add the ability to use hosts other than sherlock. This would make this utility applicable to any cluster using the SLURM scheduler.

julia+jupyter notebook

We have a request from a user for a Jupyter notebook running Julia. The setup (if done natively) is pretty annoying, so I'm going to give this a shot with a container. This might also be a good opportunity to test adding another resource (farmshare), although I'm not certain yet. Feel free to assign me to this.

resume.sh issue with ssh usage

After attempting to resume a session using

$ bash resume.sh sherlock/py3-jupyter

I am prompted for my password and after two-step verification this appears:

$ ssh [-1246AaCfGgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-E log_file] [-e escape_char]
           [-F configfile] [-I pkcs11] [-i identity_file]
           [-J [user@]host[:port]] [-L address] [-l login_name] [-m mac_spec]
           [-O ctl_cmd] [-o option] [-p port] [-Q query_option] [-R address]
           [-S ctl_path] [-W host:port] [-w local_tun[:remote_tun]]
           [user@]hostname [command]

then prompts me for an input.

argument parsing for sbatch scripts

The scripts are getting interesting enough that the user should be able to provide actual arguments so we can be more specific. E.g., instead of:

bash start.sh <sbatch> <container> <directory>

they should be able to do:

bash start.sh <sbatch> --image=<container> --notebook-dir=<directory>

That way, ordering doesn't matter either, and we can have more optional arguments.
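
A rough sketch of how such parsing might look, using only the two option names proposed above. The function name parse_start_args and the variables it sets (SBATCH_NAME, IMAGE, NOTEBOOK_DIR) are hypothetical.

```shell
#!/bin/sh
# Hypothetical parser for:
#   start.sh <sbatch> [--image=<container>] [--notebook-dir=<directory>]
# Sets SBATCH_NAME, IMAGE and NOTEBOOK_DIR; options may come in any order.
parse_start_args() {
    SBATCH_NAME=""
    IMAGE=""
    NOTEBOOK_DIR=""
    local arg
    for arg in "$@"; do
        case "$arg" in
            --image=*)        IMAGE="${arg#--image=}" ;;
            --notebook-dir=*) NOTEBOOK_DIR="${arg#--notebook-dir=}" ;;
            --*)              echo "Unknown option: $arg" >&2; return 2 ;;
            *)
                # The single positional argument is the sbatch name.
                if [ -n "$SBATCH_NAME" ]; then
                    echo "Unexpected extra argument: $arg" >&2
                    return 2
                fi
                SBATCH_NAME="$arg" ;;
        esac
    done
    if [ -z "$SBATCH_NAME" ]; then
        echo "Usage: bash start.sh <sbatch> [--image=<container>] [--notebook-dir=<directory>]" >&2
        return 2
    fi
}
```

Calling parse_start_args "$@" at the top of the script would leave the optional values empty when they are not given, so defaults can be applied afterwards.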

Running notebook w/ python 3

Hello! I wrote a notebook on my local machine in Python 3 and want to run it on Sherlock. Once the notebook is running on Sherlock, I don't see a way to change the kernel (I tried ml python=3.6.1, but it did not work). What do you suggest? Thank you in advance!

Recent start.sh issues

Lately, when I run start.sh, it starts up as normal and then gives me the following output:

== Waiting for job to start, using exponential backoff ==
Attempt 0: not ready yet... retrying in 1..
Attempt 1: not ready yet... retrying in 2..
Attempt 2: resources allocated to sh-28-06!..
sh-28-06
sh-28-06
notebook running on sh-28-06

== Setting up port forwarding ==
ssh -L 56432:localhost:56432 sherlock ssh -L 56432:localhost:56432 -N sh-28-06 &
mux_client_forward: forwarding request failed: Port forwarding failed
muxclient: master forward request failed
[email protected]'s password: == Connecting to notebook ==
# automatically continues which doesn't allow me to input my password


== View logs in separate terminal ==
ssh sherlock cat /home/users/rzawadzk/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/rzawadzk/forward-util/py3-jupyter.sbatch.err

== Instructions ==
1. Password, output, and error printed to this terminal? Look at logs (see instruction above)
2. Browser: http://sh-02-21.int:56432/ -> http://localhost:56432/...
3. To end session: bash end.sh sherlock/py3-jupyter

DN525eol:forward royzawadzki$ Permission denied, please try again.
[email protected]'s password: 
# automatically continues which doesn't allow me to input my password
Permission denied, please try again.
[email protected]'s password: 
# automatically continues which doesn't allow me to input my password
[email protected]: Permission denied (gssapi-with-mic,password).

At the moment, when I get to this message and I try refreshing the link in the browser, the link is dead. If I run resume.sh however, the webpage becomes active.

open source license

I would like to suggest adding an open source license (MIT or BSD3, or something in the GPL family) before we open this up to contributions from others (and myself!). Here is a list of options --> https://opensource.org/licenses. If you let me know which you prefer, I'd be glad to add it in a PR.

resource for sherlock!

hey @raphtown ! I saw your email on the list, and wanted to say how great this is. This is a really cool resource, and I'd like to propose that I can test it out (notice I'm part of Research Computing at Stanford!) and then write up a little instructional on our lessons page. I'm thinking I might be able to containerize some of the steps to make it easier, but I haven't given it a try yet. If we are able to get everything working, I can PR to this repo with updates, and then do a little writeup (and I'd ask for your contribution if you are willing!) Let me know your thoughts - looking forward to trying it out.

USERNAME variable not changing

Hi, when attempting to log in to a Sherlock node, the script seems to use my local USERNAME instead of the one set in the params.sh file. I'm on macOS Catalina 10.15. I tried editing start.sh to hard-code my Sherlock username, but it still references my local USERNAME.

[sbatch] tensorflow

I'm going to give a go at trying this out for tensorflow with gpu - I think this would be very useful for like, everyone, lol.

ControlPath can be too long

I was using forward to run Jupyter notebooks on Sherlock with the default ssh config in hosts/sherlock_ssh.sh, and recently encountered an SSH error: ControlPath too long.

I discovered that it was because the default ssh config includes the full hostname in the ControlPath for the Sherlock connection. I was at SLAC, and the hostname my machine was assigned was surprisingly long. I fixed this by replacing the ControlPath name %l%r@%h:%p with %C, which gives a hash of %l%h%p%r (see ssh_config manpage).

(So… PR maybe?)
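
For illustration, a config generator using %C might look like the sketch below. This is an assumption about the shape of the change, not the contents of hosts/sherlock_ssh.sh; print_ssh_config is a hypothetical name, and the username and login hostname are passed in by the caller. The %C token requires a reasonably recent OpenSSH release.

```shell
#!/bin/sh
# Hypothetical generator for an ssh config stanza whose ControlPath uses %C
# (a hash of %l%h%p%r), so the socket path stays short regardless of how
# long the local hostname is.
# Arguments: <username> <login-hostname>
print_ssh_config() {
    local user="$1"
    local host="$2"
    cat <<EOF
Host sherlock
    User ${user}
    Hostname ${host}
    ControlMaster auto
    ControlPersist yes
    ControlPath ~/.ssh/%C
EOF
}
```

The output can be reviewed and then appended to ~/.ssh/config by hand.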

Access denied by pam_slurm_adopt: you have no active jobs on this node

Hi Vanessa,
Thank you for creating this tool on Sherlock. I was following the instructions at https://vsoch.github.io/lessons/sherlock-jupyter/ and ran into the following issue. I'm hoping you can help me understand the problem. When I run start.sh I get the errors below, and when I try to open the notebook in my browser (using the address shown) it fails, even though the job is running.

[tdaley@sh-ln08 login /scratch/PI/whwong/tdaley/programs/forward]$ bash start.sh py3-jupyter /scratch/PI/whwong/tdaley/sgRNA/CRISPRa-sgRNA-determinants/deepLearningMixtureRegression/
== Finding Script ==
Looking for sbatches/sherlock/py3-jupyter.sbatch
Script sbatches/sherlock/py3-jupyter.sbatch

== Checking for previous notebook ==
No existing py3-jupyter jobs found, continuing...

== Getting destination directory ==

== Uploading sbatch script ==
py3-jupyter.sbatch 100% 146 29.6KB/s 00:00

== Submitting sbatch ==
sbatch --job-name=py3-jupyter --partition=whwong --output=/home/users/tdaley/forward-util/py3-jupyter.sbatch.out --error=/home/users/tdaley/forward-util/py3-jupyter.sbatch.err --mem=20G --time=8:00:00 /home/users/tdaley/forward-util/py3-jupyter.sbatch 58668 "/scratch/PI/whwong/tdaley/sgRNA/CRISPRa-sgRNA-determinants/deepLearningMixtureRegression/"
Submitted batch job 34562816

== View logs in separate terminal ==
ssh sherlock cat /home/users/tdaley/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/tdaley/forward-util/py3-jupyter.sbatch.err

== Waiting for job to start, using exponential backoff ==
Attempt 0: not ready yet... retrying in 1..
Attempt 1: not ready yet... retrying in 2..
Attempt 2: not ready yet... retrying in 4..
Attempt 3: not ready yet... retrying in 8..
Attempt 4: not ready yet... retrying in 16..
Attempt 5: not ready yet... retrying in 32..
Attempt 6: resources allocated to sh-08-13!..
sh-08-13
sh-08-13
notebook running on sh-08-13

== Setting up port forwarding ==
ssh -L 58668:localhost:58668 sherlock ssh -L 58668:localhost:58668 -N sh-08-13 &
Access denied by pam_slurm_adopt: you have no active jobs on this node
Authentication failed.
== Connecting to notebook ==
[I 18:10:27.968 NotebookApp] Writing notebook server cookie secret to /tmp/jupyter/notebook_cookie_secret
[I 18:10:29.512 NotebookApp] Serving notebooks from local directory: /scratch/groups/whwong/tdaley/sgRNA/CRISPRa-sgRNA-determinants/deepLearningMixtureRegression
[I 18:10:29.512 NotebookApp] 0 active kernels
[I 18:10:29.512 NotebookApp] The Jupyter Notebook is running at: http://localhost:58667/
[I 18:10:29.512 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
slurmstepd: error: *** JOB 34562525 ON sh-08-13 CANCELLED AT 2018-12-22T18:14:40 ***

== View logs in separate terminal ==
ssh sherlock cat /home/users/tdaley/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/tdaley/forward-util/py3-jupyter.sbatch.err

== Instructions ==

  1. Password, output, and error printed to this terminal? Look at logs (see instruction above)
  2. Browser: http://sh-02-21.int:58668/ -> http://localhost:58668/...
  3. To end session: bash end.sh py3-jupyter

[tdaley@sh-ln08 login /scratch/PI/whwong/tdaley/programs/forward]$ jobs
34562816 whwong py3-jupy tdaley R 2:51 1 sh-08-13

Thank you for your help and I apologize if I missed something super obvious.

Add input argument to control custom python envs

First off, this is great and much better than what I had made for myself.

Second, something that I thought was nice was the option to specify a conda environment to activate before running the notebook. I've implemented something basic, but I don't know if it is good for general use.

Trouble with Command Line Arguments to Launch Jupyter Notebooks into the Proper Directory

From here, the way to launch Jupyter notebooks from your local computer would be to run something like bash start.sh <software> <path>. I've been playing around with the arguments, but all my permutations lead me to a page with only one folder, forward-util, containing three files: py3-jupyter.sbatch, py3-jupyter.sbatch.err, and py3-jupyter.sbatch.out.

My situation is that I have a directory on my local computer with two subdirectories: the cloned directory (forward) and another directory with my .ipynb files called BPA. I cd into forward to start up sherlock with the following commands and outcomes:

  • bash start.sh sherlock/py3-jupyter ../BPA with the message directory not found and a page with the forward-util folder

  • the same thing above also happens when I put the absolute path

  • After moving the BPA directory into the forward directory and running bash start.sh sherlock/py3-jupyter BPA no error about the directory not being found this time, but still launches into the `

What is the proper syntax to get the files I want onto the jupyter notebooks page? Is it that the path is the path on the actual server?

Running a notebook: "ssh: Could not resolve hostname"

I have followed all the steps detailed here. When I get to the step where you start a notebook using bash start.sh jupyter /path/to/dir it doesn't connect, giving me a notice that it "could not resolve hostname." Here's what I'm running and the output (for context my SUNetID is rzawadzk):

DN52ekoj:forward royzawadzki$ bash start.sh sherlock/py3-jupyter ../BPA
== Finding Script ==
Looking for sbatches/rzawadzk/sherlock/py3-jupyter.sbatch
Looking for sbatches/sherlock/py3-jupyter.sbatch
Script      sbatches/sherlock/py3-jupyter.sbatch

== Checking for previous notebook ==
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known
No existing sherlock/py3-jupyter jobs found, continuing...

== Getting destination directory ==
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known

== Uploading sbatch script ==
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known
lost connection

== Submitting sbatch ==
rzawadzk sbatch --job-name=sherlock/py3-jupyter --partition=normal --output=/forward-util/py3-jupyter.sbatch.out --error=/forward-util/py3-jupyter.sbatch.err --mem=20G --time=8:00:00 /forward-util/py3-jupyter.sbatch 59258 "../BPA"
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known

== View logs in separate terminal ==
ssh rzawadzk cat /forward-util/py3-jupyter.sbatch.out
ssh rzawadzk cat /forward-util/py3-jupyter.sbatch.err

== Waiting for job to start, using exponential backoff ==
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known
Attempt 0: not ready yet... retrying in 1..
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known
Attempt 1: not ready yet... retrying in 2..
ssh: Could not resolve hostname rzawadzk: nodename nor servname provided, or not known
Attempt 2: not ready yet... retrying in 4..

And so on with the attempts. I'm not sure what's going on here.

Contributing guidelines

In the spirit of the license, we should add a CONTRIBUTING.md. I can take a first shot at this and show you a suggestion.

remove port specification

I don't see any logic for asking the user to pre-generate a custom port, and then giving an error message if/when another notebook (by the same user) is opened. It would make more sense to generate a port on the fly, and risk that another user is using it.

I'll add this tweak to my current PR.
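
As a sketch of what generating a port on the fly could look like: the helper below picks a random port from a range and does not check whether the port is already in use, matching the accepted risk described above. The name random_port and the default range are assumptions, not the PR's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: print a random port in [min, max].
# Defaults cover the dynamic/private port range 49152-65535.
# Reads two random bytes from /dev/urandom (0..65535) and maps them into
# the range; the slight modulo bias does not matter for this purpose.
random_port() {
    local min="${1:-49152}"
    local max="${2:-65535}"
    local span=$(( max - min + 1 ))
    local raw
    raw="$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')"
    echo $(( raw % span + min ))
}
```

A script could then set PORT="$(random_port)" before submitting the sbatch job.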

Add a Changelog

To the extent that we can keep track of changes (tags/releases seem like overkill for now, but could eventually be added), we should at minimum have a list of changes, so that the associated GitHub commits can be traced, at least roughly, to a version.

default partition

Feature request: I'm a Sherlock newbie but was immediately drawn to this repo because I use Jupyter a lot. I got Jupyter working on Sherlock by using these scripts, which is awesome. (Thank you!) The only real hiccup was that it took me a long time to find out that I should set my default partition to normal, instead of rdror or drorlab or whatever the default value is. Would there be a way to change the default, or at least print out that normal is an option for users whose PIs don't buy a dedicated partition?

(A more correct view of my issue may be that the problem is with Sherlock documentation instead of this repo but I figured I would start here first.)

Trouble logging in to launch jupyter notebook

Hi,

I'm attempting to run a jupyter notebook through sherlock on my local machine (Windows 10), following the instructions here:
https://vsoch.github.io/lessons/sherlock-jupyter/

I believe that I have successfully cloned the forward repo, generated the parameters file, and created the ssh credentials (all on my local computer). I then ran into a couple issues:

1. Issue creating password for jupyter notebook
I tried to create a password for the notebook using the following code from the $HOME folder:

$ sdev
$ ml python/3.6.1
$ ml py-jupyter/1.0.0_py36
$ which jupyter /share/software/user/open/py-jupyter/1.0.0_py36/bin/jupyter
$ jupyter notebook password

which resulted in the following error message:

$ which jupyter /share/software/user/open/py-jupyter/1.0.0_py36/bin/jupyter
/share/software/user/open/py-jupyter/1.0.0_py36/bin/jupyter
/share/software/user/open/py-jupyter/1.0.0_py36/bin/jupyter
$ jupyter notebook password
Enter password:
Verify password:
Traceback (most recent call last):
  File "/share/software/user/open/py-jupyter/1.0.0_py36/bin/jupyter-notebook", line 11, in <module>
    sys.exit(main())
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/jupyter_core/application.py", line 267, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/notebook/notebookapp.py", line 1362, in start
    super(NotebookApp, self).start()
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/jupyter_core/application.py", line 256, in start
    self.subapp.start()
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/notebook/notebookapp.py", line 345, in start
    set_password(config_file=self.config_file)
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/notebook/auth/security.py", line 148, in set_password
    config.NotebookApp.password = hashed_password
  File "/share/software/user/open/python/3.6.1/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/notebook/auth/security.py", line 131, in persist_config
    with io.open(config_file, 'w', encoding='utf8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/users/ilow/.jupyter/jupyter_notebook_config.json'

I was able to get around this issue (I think) with the following:
$ mkdir ~/.jupyter

2. Issue logging in to launch notebook
I next tried to launch a notebook, following the instructions in Part 3 of the above tutorial:
$ bash start.sh py3-jupyter /home/users/ilow
which resulted in a strange loop where I kept entering my password, 2-factor authenticating, and then it would ask for my password/2FA again:

== Finding Script ==
Looking for sbatches/sherlock/py3-jupyter.sbatch
Script      sbatches/sherlock/py3-jupyter.sbatch

== Checking for previous notebook ==
mux_client_request_session: read from master failed: Connection reset by peer
[email protected]'s password:
Duo two-factor login for ilow

Enter a passcode or select one of the following options:

 1. Duo Push to XXX-XXX-5751
 2. Phone call to XXX-XXX-5751
 3. SMS passcodes to XXX-XXX-5751

Passcode or option (1-3): 1
ControlSocket /c/Users/ilow1/.ssh/[email protected]:22 already exists, disabling multiplexing
No existing py3-jupyter jobs found, continuing...

== Getting destination directory ==
mux_client_request_session: read from master failed: Connection reset by peer
[email protected]'s password:

Any idea why this would happen or how to remedy? I'm new to Sherlock and any advice would be much appreciated! Thanks!
