
nbgitpuller's Introduction


nbgitpuller lets you distribute content in a git repository to your students by having them click a simple link. Automatic merging ensures that your students are never exposed to git directly. It is primarily used with a JupyterHub, but can also work on students' local computers.

See the documentation for more information.

Installation

pip install nbgitpuller

Example

This example shows how to use the nbgitpuller link generator to create an nbgitpuller link, which a user then clicks.

  1. The nbgitpuller link generator GUI is used to create a link.

  2. This link is clicked, and the content is pulled into a live Jupyter session.

nbgitpuller's People

Contributors

a3626a, albertmichaelj, brian-rose, carreau, choldgraf, consideratio, cristiklein, danlester, debbieyuster, dependabot[bot], fperez, fvd, georgianaelena, jdmansour, jgwerner, jrdnbradford, manics, mathbunnyru, minrk, ogrisel, parmentelat, pre-commit-ci[bot], ryanlovett, saladraider, sean-morris, sigurdurb, snozzberries, yuvipanda, znicholls


nbgitpuller's Issues

More examples required when using branches

I found that when I switched to a different branch in the repo I was using, I needed to specify it as:

origin/<branch name>

because there was a Git error if I just used <branch name>:

'gitpuller https://github.com/<repo details> <branch> materials' exited with 1: fatal: ambiguous argument '<branch>..origin/<branch>': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

Blank page: fails to load xterm.js

I installed nbgitpuller following the instructions in the README, and when I access an nbgitpuller URL, I just end up at a blank page. Looking at the browser console shows that xterm.js is failing to load (both the JS and the CSS). My guess is that this is related to the updated version of xterm.js included with jupyterhub/notebook v5.5.0.

I can get it to load the JS by changing index.js:4 to 'static/components/xterm.js/index.js'.

Then there is an error in the vendored _fitTerm() function due to changes in xterm.js. I changed it to:

    GitSyncView.prototype._fitTerm = function() {
        // Vendored in from the xterm.js fit addon
        // Because I can't for the life of me get the addon to be
        // actually included here as an require.js thing.
        // And life is too short.
        var term = this.term;
        if (!term.element.parentElement) {
            return null;
        }
        var parentElementStyle = window.getComputedStyle(term.element.parentElement),
            parentElementHeight = parseInt(parentElementStyle.getPropertyValue('height')),
            parentElementWidth = Math.max(0, parseInt(parentElementStyle.getPropertyValue('width')) - 17),
            elementStyle = window.getComputedStyle(term.element),
            elementPaddingVer = parseInt(elementStyle.getPropertyValue('padding-top')) + parseInt(elementStyle.getPropertyValue('padding-bottom')),
            elementPaddingHor = parseInt(elementStyle.getPropertyValue('padding-right')) + parseInt(elementStyle.getPropertyValue('padding-left')),
            availableHeight = parentElementHeight - elementPaddingVer,
            availableWidth = parentElementWidth - elementPaddingHor;

        // Declare rows and cols locally so they do not leak as implicit globals.
        var rows = parseInt(availableHeight / term.renderer.dimensions.actualCellHeight);
        var cols = parseInt(availableWidth / term.renderer.dimensions.actualCellWidth);

        term.resize(cols, rows);
    };

I don't know if these are the best fixes, but they seem to work. I haven't corrected the CSS URL, though, as I'm not quite sure where that is now.

Error: undefined caused by opening multiple notebooks simultaneously

Here's a new error that I saw for the first time when trying to open a notebook:

error_undefined

What I did that triggered this error: I went to http://data8.org/sp17/ and double-clicked on three links to three notebooks in rapid succession, causing each one to open in a separate tab. (I opened the lec16, lec17, and lec18 demos notebooks, i.e., Fri 2/24, .) This created three tabs with links to git-pull URLs. The first tab loaded the notebook. The next two tabs triggered the above error message.

This is very minor and low-impact. If I just re-click the link, everything works. But this makes me suspect there is something failing if a user tries to simultaneously do multiple pull/sync operations, and failing in a way that displays an unhelpful error message. So a small improvement could be to display a more helpful error message, or to write the system to wait a random amount of time and try again.
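
The "wait a random amount of time and try again" idea could look roughly like the sketch below. `retry_with_jitter` and its parameters are hypothetical names used for illustration, not part of nbgitpuller, and the `sleep` argument is injectable only so the behaviour can be exercised without real delays.

```python
import random
import time


def retry_with_jitter(operation, attempts=3, max_wait=2.0, sleep=time.sleep):
    """Call operation(); on failure wait a random interval and try again.

    Hypothetical helper: nbgitpuller does not necessarily work this way.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:  # a real version would catch a narrower type
            last_error = error
            if attempt < attempts - 1:
                # Random wait so simultaneous tabs do not retry in lockstep.
                sleep(random.uniform(0, max_wait))
    raise last_error
```

If every attempt fails, the last error is re-raised, so a more helpful message could still be shown at that point.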

Automagically resolve conflicts upon remote deletion/moves

Scenario:

  • Instructor creates repo with file A/b
  • Student nbgitpulls repo
  • Student edits A/b
  • Instructor git rm A/b (variant: git mv A/b A/c)
  • Student nbgitpulls latest version

This currently results in a merge conflict. Instead, in the general spirit of nbgitpuller some opinionated decision should be made.
Presumably:

  • leave local file untouched (tracked by git? not any more?)
  • warn the user: A/b was removed/renamed remotely; left local copy untouched

(thanks btw for this useful tool!)

Regarding a JupyterLab transition - redirections etc...

Goal

To make links redirect sensibly when clicking nbgitpuller links from within notebooks...

  • without consideration of whether JupyterLab or the classic Jupyter Notebook is showing the notebook
    • UPDATE: The environment variable NBGITPULLER_APP can help to control this behavior.
  • without consideration of whether the link sits in a JupyterHub-hosted or a localhost-hosted notebook, and without consideration of the domain name, to avoid hardcoding www.myhub.com in the in-notebook link.
  • without consideration of where the notebook is located in the file system
    • UPDATE: This point was formulated too vaguely for me to understand it myself a year later.

My use case

I'd like to have a document that is always available on my jupyterhub, referencing sources of content to add.

Analysis 1

  • Classic - Opening index.ipynb in the root folder (often /home/jovyan), your URL is updated to mydomain.com/user/username/notebooks/Index.ipynb, but you can also access this by referring to mydomain.com/user/redirect/tree/Index.ipynb
  • Lab - In order to open index.ipynb in the root folder (often /home/jovyan), you should access mydomain.com/user/username/lab/tree/Index.ipynb

Analysis 2

  • The /hub/user-redirect/git-pull? form is relevant only if a JupyterHub is hosting the notebook, I think...
    • This currently does not work since the query params seem to be stripped during the user redirect... (SOLVED NOW!)
  • The /git-pull?... link would work only if the notebook is hosted on localhost, I think...
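
To make the two link shapes from this analysis concrete, here is a small sketch that builds either form. The function name `build_pull_link` is hypothetical, and the `repo` and `subPath` query parameter names are taken from links quoted elsewhere in this tracker, so treat them as assumptions about the handler's query API.

```python
from urllib.parse import urlencode


def build_pull_link(base_url, repo, sub_path=None, on_jupyterhub=True):
    """Build a git-pull link for a JupyterHub or a local Jupyter server.

    The two path prefixes come from the analysis above; the query
    parameter names are assumptions, not a confirmed API.
    """
    prefix = "/hub/user-redirect/git-pull" if on_jupyterhub else "/git-pull"
    params = {"repo": repo}
    if sub_path:
        params["subPath"] = sub_path
    # urlencode percent-encodes the repo URL so it survives as one parameter.
    return base_url.rstrip("/") + prefix + "?" + urlencode(params)
```

This still needs the base URL as input, so it does not by itself solve the "no hardcoded domain" goal; it only shows how the hub and localhost variants differ.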

TODO

Currently my best known option is...

Currently, the best solution I have is to assume that a JupyterHub is hosting the notebook and to use a link like this, which ends up in the classic view.

Let app be specified by user

link_generator.ipynb lets people choose between notebook and lab while other nbgitpuller options are free form. We'd like to enable people to specify their own app, e.g. rstudio. Apparently ipywidgets doesn't have a select+other widget so in its absence, could the app selection be free form like all the others? Perhaps notebook and lab could descend to documented choices in the UI.

Change to materials-x18 didn't sync to hub.data8x

I added this notebook to git:

https://github.com/data-8/materials-x18/blob/master/lec/8x.1/Example.ipynb

Then I made an EdX button with the appropriate metadata to open it:

["next=/hub/user-redirect/git-sync?repo=git://reposync/materials-x18&subPath=lec/8x.1/Example.ipynb"]

When I click that button, the Hub claims that it is syncing the repo, but the file never appears.

It seems like syncing isn't working (or, more likely, I'm doing something wrong).

`gitautosync` CLI seems to require being called in a git repo

I am getting the following:

$ gitautosync --git-url https://github.com/data-8/gitautosync
2017-08-31 05:54:53,398] INFO -- Pulling into ./ from https://github.com/data-8/gitautosync, branch master
fatal: Not a git repository (or any parent up to mount point /home/neuro)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Traceback (most recent call last):
  File "/opt/conda/envs/neuro/bin/gitautosync", line 11, in <module>
    load_entry_point('gitautosync==0.0.1', 'console_scripts', 'gitautosync')()
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/gitautosync/__init__.py", line 165, in main
    args.repo_dir
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/gitautosync/__init__.py", line 75, in pull_from_remote
    yield from self._update_repo()
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/gitautosync/__init__.py", line 94, in _update_repo
    yield from self._reset_deleted_files()
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/gitautosync/__init__.py", line 106, in _reset_deleted_files
    status = subprocess.check_output(['git', 'status'], cwd=self.repo_dir)
  File "/opt/conda/envs/neuro/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/opt/conda/envs/neuro/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'status']' returned non-zero exit status 128.
$

It looks as though the first thing the gitautosync CLI does is to call git status?
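
One way to avoid calling git status in a directory that is not a repository would be a cheap pre-check like the sketch below. Both function names are hypothetical and this is not how the gitautosync CLI is known to behave; note that the check only looks at the directory itself, whereas git also searches parent directories.

```python
import os


def looks_like_git_repo(path):
    """Return True if `path` itself contains a .git entry."""
    # .git is a directory in a normal clone and a file in worktrees/submodules.
    return os.path.exists(os.path.join(path, ".git"))


def choose_action(repo_dir):
    """Decide what a sync tool could do with repo_dir (hypothetical helper)."""
    if not os.path.exists(repo_dir):
        return "clone"        # nothing there yet: clone into it
    if looks_like_git_repo(repo_dir):
        return "update"       # existing clone: safe to run git commands
    return "not-a-repo"       # exists but is not a repo: report clearly
```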

Links not working in Chrome

@samyag1 was having some issues with the links on Chrome with Ubuntu Linux (16.04). Here is his original message.

I've created an interact link using the url-to-interact web app (http://url-to-interact.herokuapp.com/). The link this creates works in the firefox browser, but not in the Chrome browser. Specifically, I can view the source code of the page and it looks fine, but nothing loads - the page just stays blank. This is on Ubuntu Linux (16.04). I'm not sure if I've got my Chrome browser configured in a funky way though.

Original post: https://piazza.com/class/iydkvfggee3145?cid=45

Any suggestions on how to troubleshoot this? I tried the same link on my computer in Chrome and it worked fine.

Clarify untracked file behavior in the docs

hiya folks

the readme says this

  1. If a file exists locally but is untracked by git (maybe someone uploaded it manually), then rename the file, and pull in remote copy.

I am assuming you actually mean: "If a file exists locally but is untracked by git (maybe someone uploaded it manually), and the updated commit contains that file, then rename the file, and pull in remote copy.", right ?

there would be no reason to mess with local files that appear in neither the previous nor the new commit, or am I missing something here ?

Add a sphinx site and move documentation there

What do folks think about adding a sphinx site and moving much of the documentation there from the README.md? I think it could make the info more discoverable and also more extensible as more use cases are presented.

What do folks think? If people are +1 I can make a simple sphinx PR

Simplify update UX - with a button?

I'm currently developing educational material for my students, and I'm asking them to click an nbgitpuller link over and over as I refine my material.

My goal, without suggesting an optimal solution, is to simplify updates in a way that avoids...

  • the need of utilizing the links again and again in order to update
  • asking the students to use the CLI in some way
  • forcing the author of the educational material to include !gitpuller ... within the repo

I'm thinking it might be nice to have a button to do the same from within a previously nbgitpulled notebook and/or from the tree view of the folder. I remember there is a plugin that can filter the tree view, so I figure it would be possible to make a plugin within the tree view that runs a git update using nbgitpuller.

Trailing commas

Trailing commas seem to make Internet Explorer barf (1, 2, 3).

Commit #50d974 gets rid of some trailing commas, but I noticed a few other places in index.js with trailing commas:

If we suspect that trailing commas might be a problem, maybe it's worth getting rid of those other instances of trailing commas as well?

If I had to guess, I'd guess that the one on line 158 could be a syntax error on any Javascript interpreter, not just IE. The rest seem IE-specific. Sigh. IE.

Pointing to a filepath that does not have an extension just downloads the file

If users create an nbgitpuller link to a file that has no extension (e.g. myfile) then the "redirect" step will trigger the browser to simply download the file, and remain stuck on the nbgitpuller loading page. We should ensure that nbgitpuller at least always tries to open the file with the text editor if it isn't ipynb.
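
A sketch of the fallback described here: pick the redirect target from the file extension and send everything that is not a notebook to the text editor. The function name is hypothetical, and the `notebooks/` and `edit/` prefixes are the classic Notebook URL routes as I understand them, so treat them as assumptions; directories are not handled.

```python
import posixpath


def redirect_path(file_path):
    """Pick a classic-notebook URL path for the file to open after syncing."""
    extension = posixpath.splitext(file_path)[1].lower()
    if extension == ".ipynb":
        return "notebooks/" + file_path
    # Anything else, including files with no extension, opens in the text
    # editor instead of being handed to the browser as a download.
    return "edit/" + file_path
```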

Support updating git submodules

I think support for git submodules would be a useful feature in nbgitpuller.

The support I had in mind was simply calling git submodule init and then git submodule update when updating a repo.

When a repo is updated which uses git submodules, these functions will most likely need to be called. Expecting the not-tech-savvy user to call these functions will cause confusion.
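
The two calls proposed above could be wrapped as in this sketch. `update_submodules` is a hypothetical helper, not existing nbgitpuller code, and the `run` argument defaults to `subprocess.check_call` but can be swapped out for testing.

```python
import subprocess


def update_submodules(repo_dir, run=subprocess.check_call):
    """Run `git submodule init` then `git submodule update` in repo_dir."""
    commands = [
        ["git", "submodule", "init"],
        ["git", "submodule", "update"],
    ]
    for command in commands:
        # check_call raises CalledProcessError if git exits non-zero.
        run(command, cwd=repo_dir)
    return commands
```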

Document how to run this locally

I got this working at some point but can't figure out how to construct the URL now if you're not running on a jupyterhub (this would be useful, e.g., for classes that aren't using a jupyterhub but still want to distribute content with interact-like links.)

LMK if there's a URL that would work in this way and I can make a PR to the docs.

Checkout errors on "did not match any file(s) known to git"

A student is getting the following error:

 68, in pull
    raise e
File "/srv/app/venv/lib/python3.6/site-packages/nbgitautosync/handlers.py", line 62, in pull
    for line in gas.pull_from_remote():
File "/srv/app/venv/lib/python3.6/site-packages/gitautosync/__init__.py", line 75, in pull_from_remote
    yield from self._update_repo()
File "/srv/app/venv/lib/python3.6/site-packages/gitautosync/__init__.py", line 94, in _update_repo
    yield from self._reset_deleted_files()
File "/srv/app/venv/lib/python3.6/site-packages/gitautosync/__init__.py", line 110, in _reset_deleted_files
    yield from execute_cmd(['git', 'checkout', '--', filename], cwd=self.repo_dir)
File "/srv/app/venv/lib/python3.6/site-packages/gitautosync/__init__.py", line 40, in execute_cmd
    raise subprocess.CalledProcessError(ret, cmd)
subprocess.CalledProcessError: Command '['git', 'checkout', '--', 'lec/actors.csv']' returned non-zero exit status 1.

In a terminal:

~/materials-fa17$ git checkout -- lec/actors.csv
error: pathspec 'lec/actors.csv' did not match any file(s) known to git.

I can run git checkout -- . but after the subsequent fetch, the merge fails:

~/materials-fa17$ git merge -Xours origin/master
error: Merging is not possible because you have unmerged files.
hint: Fix them up in the work tree, and then use 'git add/rm <file>'
hint: as appropriate to mark resolution and make a commit.
fatal: Exiting because of an unresolved conflict.

gitpuller errors

I followed the instructions and installed gitpuller locally.
I created a repo here: https://github.com/miramar-labs/jlab-user-env.git
I then created a local test folder on my laptop:
mkdir test1
cd test1
then I tried:
gitpuller https://github.com/miramar-labs/jlab-user-env.git master .

got errors:
gitpuller https://github.com/miramar-labs/jlab-user-env.git master .
$ git fetch

fatal: Not a git repository (or any of the parent directories): .git

Traceback (most recent call last):
  File "/Users/acody/miniconda3/bin/gitpuller", line 11, in <module>
    load_entry_point('nbgitpuller', 'console_scripts', 'gitpuller')()
  File "/Users/acody/nbgitpuller/nbgitpuller/pull.py", line 211, in main
    args.repo_dir
  File "/Users/acody/nbgitpuller/nbgitpuller/pull.py", line 59, in pull
    yield from self.update()
  File "/Users/acody/nbgitpuller/nbgitpuller/pull.py", line 168, in update
    yield from self.update_remotes()
  File "/Users/acody/nbgitpuller/nbgitpuller/pull.py", line 105, in update_remotes
    yield from execute_cmd(['git', 'fetch'], cwd=self.repo_dir)
  File "/Users/acody/nbgitpuller/nbgitpuller/pull.py", line 41, in execute_cmd
    raise subprocess.CalledProcessError(ret, cmd)
subprocess.CalledProcessError: Command '['git', 'fetch']' returned non-zero exit status 128.

any idea what I'm doing wrong?
also tried it from my jupyter singleuser pod .. same thing....

`repo_dir = repo.split('/')[-1]`

my understanding is that if repo is e.g. https://github.com/org/math-course, then the current code assumes all the local files sit under math-course/ in the jupyter server space

my use case is that the root of the repo contents matches the root of the jupyter server (i.e. repo_dir = '.') - (this is closer to how binder works, IIUC)

is it OK to define yet another variable that the URL can define (e.g. toplevel=.) ?
or is there already a provision in the current code for my use case, that I failed to spot ?

thanks !
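
To illustrate the proposal, a sketch of how an optional override could sit next to the current default. `resolve_repo_dir` is a hypothetical helper and `toplevel` is only the parameter name suggested above, not an existing option.

```python
def resolve_repo_dir(repo, toplevel=None):
    """Return the directory to sync into.

    `toplevel` is the hypothetical override proposed above.
    """
    if toplevel is not None:
        return toplevel
    # Behaviour as described in the title: last path component of the repo URL.
    return repo.split('/')[-1]
```

With no override, https://github.com/org/math-course maps to math-course; with toplevel='.', the repo contents would land at the server root.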

Support for shallow clones

I'd be interested in nbgitpuller supporting git's shallow-clone (git clone --depth) implementation, which seems to be pretty reliable these days.

(Ref: https://git-scm.com/docs/git-clone#git-clone---depthltdepthgt )

In my use case in the Callysto project we have lots of users on a shared infrastructure, and only so much storage space to go around. We're finding that a slightly more mixed-media medium like Jupyter encourages things like large animations, images, and datasets to get added to the repository. Git by its nature doesn't allow tweaking/editing of those large objects to actually remove the original versions from git history, therefore (without shallow clones) git repositories storing Jupyter content grow monotonically, and reasonably quickly.

A shallow clone would allow us to clone the repository, but at a limited depth, thus saving users from consuming a huge amount of disk space to store git history they're not especially interested in.

A couple questions for the project:

  1. Is this a feature you'd be interested in having implemented?
  2. Is there a use case for this to be an optional/configurable feature, or would it make sense for nbgitpuller to make shallow clones by default?
    • If yes by default, what depth? Would a --depth 1 argument be okay?
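
As a sketch of the optional/configurable variant, the clone command could take a depth only when one is requested. `clone_command` is a hypothetical helper; `--depth` and `--branch` are standard `git clone` options, but how nbgitpuller actually assembles its clone command is not shown here.

```python
def clone_command(repo_url, target_dir, branch="master", depth=None):
    """Build a `git clone` argument list, shallow only when depth is given."""
    command = ["git", "clone", "--branch", branch]
    if depth is not None:
        if depth < 1:
            raise ValueError("depth must be a positive integer")
        # --depth N truncates history to the most recent N commits.
        command += ["--depth", str(depth)]
    command += [repo_url, target_dir]
    return command
```

Leaving `depth=None` as the default keeps full clones unless the caller opts in, which is one possible answer to the second question.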

Support pulling from a private repo

nbgitpuller is presumably intended for use with public repositories, but is there also a way of pulling files down from a private repo?

Presumably this would require adding a key to the repo URL, ideally one set from a read-only account on the repo?

executable file not found error

I see this error when a user logs in and spawns their server on Jupyterhub:

[Warning] Exec lifecycle hook ([gitpuller https://github.com/mcveanlab/tskit-workshop.git master tskit-workshop]) for Container "notebook" in Pod 
"jupyter-geoff4_jhub(3f96c04d-fed4-11e8-833f-fa163ec7319d)" failed - 
error: command 'gitpuller https://github.com/mcveanlab/tskit-workshop.git master tskit-workshop' exited with 126: , message: "OCI runtime exec failed: exec failed: container_linux.go:348: 
starting container process caused \"exec: \\\"gitpuller\\\": executable file not found in $PATH\": unknown\r\n"

Have a --force option on gitpuller

For some use cases it would be nice to be able to specify on the gitpuller command line to pull down the latest version of a notebook from a repo and obliterate any local changes. e.g. a --force option

Should clone relative to directory set by notebook_dir

EDIT: This is not entirely correct. Behavior is more complicated and inconsistent, see below.

I opened (and then closed) an issue recently about cloning into a specific directory. I closed the issue because there is already a pull request that modifies the extension to allow cloning into a directory specified by the URL. Unfortunately, that pull request does not seem to be fully functional at the moment, and even if it is, I'm pretty sure that it would still clone relative to the home directory.

Current behavior (at least on a mac and on Windows) is that nbgitpuller always clones into the home directory. This is probably fine for a jupyterhub server use case, but for a local installation, most times people will not want the cloning to happen relative to the home directory. One problem is that if I start the server in a subdirectory of home, the cloned directory is out of scope and the directory can't even be opened.

Instead, people would either expect the cloning to happen relative to the starting directory for the jupyter notebook server (avoiding the above out of scope problem, and probably the better solution), or the default starting directory for Jupyter given by the c.NotebookApp.notebook_dir parameter in the config file (though you can still have the out of scope problem if someone launches Jupyter from the non-default directory). Another good (maybe best?) option would be to clone into the current working directory if Jupyter is open and not in the starting directory.

There is an environment variable set by Jupyter, JUPYTER_SERVER_ROOT, that reflects the value of c.NotebookApp.notebook_dir (at least on Mac OS), so it seems like it would be easy to modify nbgitpuller/pull.py to use that variable when setting the path for the git pull operation. However, I can't find out how to get the root starting directory (for example, if I launch Jupyter from a terminal in a different directory). I also can't figure out how to get the current working directory, and that may be much harder.
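
A minimal sketch of the first idea, assuming JUPYTER_SERVER_ROOT is set as described above and falling back to the home directory otherwise. `clone_parent_dir` is a hypothetical helper, not code from nbgitpuller/pull.py.

```python
import os


def clone_parent_dir(environ=None):
    """Return the directory that clones should be made relative to."""
    environ = os.environ if environ is None else environ
    # Assumption from the text above: Jupyter exports its root directory here.
    root = environ.get("JUPYTER_SERVER_ROOT")
    if root:
        return root
    # Fall back to the behaviour described in this issue: the home directory.
    return os.path.expanduser("~")
```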

I unfortunately am completely new to Jupyter extensions, so I don't think I can create a useful pull request, but I'm happy to help in any way I can.

need a way to define final redirect url

my use case is to display jupyter classic or lab inside an iframe (surrounding app is only about table of contents)

in that context, the final redirection url is not dependent on any of the other arguments passed to git-pull but is computed by the app; I did try to take advantage of the existing logic for computing that final url, but had a hard time trying to figure what that logic was about, and in the end could not get that to work for me

so I would argue that this should be configurable as-is by the caller, through e.g a redirectUrl http arg.

Interact link doesn't redirect properly

When I click on an interact link to a notebook, this should take me directly to the notebook after syncing. Instead, I often find that it takes me to the root of my directory tree.

Here's the pattern: if it's the first time that I've clicked on the interact link in a while, it syncs and then redirects me to the top directory tree. If I then click on the interact link a second time, it properly redirects me to the notebook the second time. Thus, this sounds like it might be some kind of authentication-related issue. I've observed this on Firefox and can reproduce this reliably.

I reproduced it with the Firefox Web Console open and took a look at the headers, and here's what I see.

  1. I click on the link http://datahub.berkeley.edu/hub/user-redirect/git-sync?repo=https://github.com/data-8/materials-fa17&subPath=materials/fa17/hw/hw03/hw03.ipynb.

  2. It does a sync.

  3. After the sync, it redirects my browser to the notebook URL (http://datahub.berkeley.edu/user/daw/notebooks/materials-fa17/materials/fa17/hw/hw03/hw03.ipynb, in my example).

  4. My browser fetches that link and gets back a 302 response, which redirects to http://datahub.berkeley.edu/hub/api/oauth2/authorize?client_id=user-daw&redirect_uri=%2Fuser%2Fdaw%2Foauth_callback&response_type=code, presumably for authentication. Crucially, note that the redirect_uri part of the URL is incorrect and failed to include the full URL.

  5. After OAuth authentication completes, my browser gets redirected to http://datahub.berkeley.edu/user/daw/, i.e., the top of the directory tree. Thus, the redirection chain failed to preserve the full URL through the OAuth authentication phase.

Attached screenshots show what I saw in the Web Console.

wrongredir1

wrongredir2

Recover from orphaned index.lock

If a user server is shutdown in the middle of certain nbgitpuller git operations, it will leave behind an index.lock file. When the user attempts to use the extension the next time they run the server, git will fail due to the presence of the lock file.

@yuvipanda suggests automatically deleting stale lock files that haven't been touched in ~10 min.
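
The suggestion could be sketched as follows: treat the lock as stale when its modification time is older than the threshold, and only then delete it. The function name and the `now` parameter (there to make the age check testable) are hypothetical.

```python
import os
import time

STALE_AFTER_SECONDS = 10 * 60  # the "~10 min" from the suggestion above


def remove_stale_index_lock(repo_dir, max_age=STALE_AFTER_SECONDS, now=None):
    """Delete .git/index.lock if it has not been modified for max_age seconds.

    Returns True if a lock file was removed.
    """
    lock_path = os.path.join(repo_dir, ".git", "index.lock")
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(lock_path)
    except FileNotFoundError:
        return False  # no lock file, nothing to do
    if age < max_age:
        return False  # may belong to a live git process; leave it alone
    os.remove(lock_path)
    return True
```

A recently touched lock is left in place, so a git process that is genuinely still running is not disturbed.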

Add url parameter to prevent redirection after syncing

hey

I'm playing with the idea of adding a notebook extension that would leverage the current serverextension

so as a very naive / preliminary idea I took a plain notebook and manually added links that exercise the /git-pull handler

my toy notebook is here
https://github.com/parmentelat/nbh-nbgitpuller/blob/master/python3.ipynb
and the raw version is
https://raw.githubusercontent.com/parmentelat/nbh-nbgitpuller/master/python3.ipynb

what i can see is that

  • when I use the broken link, i.e. one that has an invalid repo= setting, clicking opens a new tab where i can see a terminal dump with the git error
  • when I use a proper link however, things also go in a separate tab, where I briefly see a progress bar, but then that tab gets redirected to the classic view for my directory;

I take it that if I had mentioned a specific notebook I would have been redirected there, and I get the idea; however in my use case, because I'd like users to be able to check by themselves the details of what happened, I'd like the second case to stop short and to not redirect anywhere; ending up in a terminal session like the first case would be just fine

would that be an acceptable addition to the url API ? like adding redirect=false or similar

Automatically set user.name & user.email if necessary

my first tests of nbgitpuller were made in a docker container from docker-stacks/dockerhub

I just wanted to outline that in this context, git is not configured - specifically wrt user.email and user.name, and that caused my calls to git to hang

I would suggest to check for that in the code of nbgitpuller with idioms similar to

git config user.name  || git config --global user.name "nbgitpuller user"
git config user.email || git config --global user.email "[email protected]"

which would spare some users the time it takes to figure that one out

turning an existing dir into a git repo

current code assumes that the target student dir

  • either does not exist
  • or it is a git repo

in my use case, primarily for transitioning purposes, I have to consider the case where the directory exists but is not yet a git repo; typically the students have started to work on notebooks obtained via email or other distribution schemes

adding code for taking care of this situation looks easy, and it seems easy to ensure that would not break anything

Branch checkout not working on first click

@yuvipanda
Link I am using: http://datahub.berkeley.edu/user-redirect/interact?account=gunjanbaid&repo=markdown-site-template&branch=gh-pages&path=

The first time I click the link:

  • I should see files from the gh-pages branch, which include README.md and index.md.
  • However, I see only README.md, which corresponds to the master branch. index.md does not show up.
  • Running git branch for the folder in JupyterHub terminal shows master.

The second time I click the link:

  • Both README.md and index.md are present, as expected.
  • Running git branch for the folder in JupyterHub terminal shows gh-pages.

Seems like the checkout to gh-pages is not happening correctly on the first click. Repo that I am cloning: https://github.com/gunjanbaid/markdown-site-template

Block insecure ways of pulling from private repositories by default

See comment #85 (comment) for a lot more detail

Since we don't have good support for pulling from private repositories, folks often put their own personal access tokens in the nbgitpuller URL.

This is extremely dangerous, and the same as sharing your password. We should detect and block this, but only after making sure we have some easy way for folks to pull from private repositories.
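
Detection could start from something like this sketch, which flags repo URLs that carry a username or password in the authority part. `has_embedded_credentials` is a hypothetical helper; it only catches credentials in the `user:password@host` position, not tokens hidden elsewhere in the URL.

```python
from urllib.parse import urlsplit


def has_embedded_credentials(repo_url):
    """Return True if the URL has a user or password before the host."""
    parts = urlsplit(repo_url)
    # urlsplit exposes the userinfo portion as .username and .password.
    return bool(parts.username or parts.password)
```

A handler could refuse to pull when this returns True, once there is a supported alternative for private repositories.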

License missing

Is it on purpose that this repo has no license?

If not, can we add one before the number of contributors increases beyond the point where you can contact them all to ask if they are good with the license?

JupyterLab workspaces

If nbgitpuller is used and the user had something running already, this will happen, and after entering a workspace, all the params etc to open a specific notebook are lost.

image

Embed a jupyterhub profile in nbgitpuller

There may be times when particular kinds of content require a different environment, different hardware, etc. Since JupyterHub has support for user profiles, would it be possible to embed profile information in an nbgitpuller link?

For example, day 1 of a semester the instructor just needs a vanilla analytics environment. They send out nbgitpuller links with profile=basic_profile in the URL.

Mid-way through the course, the instructor starts covering advanced topics in machine learning. They need students to have access to GPUs now (which they weren't using before because of the expense). They send out nbgitpuller links with profile=nvidia in the URL.

Standardize the url query parameters with Binder

There are some url query params that differ between Binder and nbgitpuller, it'd be worth standardizing one or the other. Off the top of my head the main thing I can think of is:

  • subPath=notebooks/path/ in nbgitpuller, and filepath=notebooks/path/ in Binder.

Is there anything else?

Clone into specific directory

Hi,

I'm trying to use nbgitpuller to distribute class material to students running jupyter lab on local machines. The link works fine, but it always clones the repo into my home directory. I have jupyter lab by default open in a different directory, but nbgitpuller always clones into the home directory. Is there something that I'm missing? Is it possible to get nbgitpuller to clone into a different directory, or are there plans to implement this feature?

Thank you.

undefined error leads to hanging git

A user's gitautosync failed with a red status bar reading "Error: undefined" and left behind a git process in 'D' state. This was in their gitautosync terminal:

$ git checkout master
Already on 'master'
A       lec/.ipynb_checkpoints/lec07-checkpoint.ipynb
M       lec/lec07.ipynb
Your branch is ahead of 'origin/master' by 31 commits.
  (use "git push" to publish your local commits)
$ git add -A
$ git config user.email "[email protected]"
$ git config user.name "GitAutoPull"
$ git commit -m WIP

I didn't stop their server, but killed the git process.

jovyan@jupyter-user1:~/materials-fa17$ ps aux | grep git
jovyan       22  0.0  0.0  18312  4448 ?        D    21:02   0:00 git commit -m WIP
jovyan       28  0.0  0.0  12960   936 pts/0    S+   21:08   0:00 grep git
jovyan@jupyter-user1:~/materials-fa17$ kill 22
jovyan@jupyter-user1:~/materials-fa17$ ps aux | grep git
jovyan       22  0.0  0.0  18312  4448 ?        D    21:02   0:00 git commit -m WIP
jovyan@jupyter-user1:~/materials-fa17$ kill -9 22
jovyan@jupyter-user1:~/materials-fa17$ ps aux | grep git
jovyan       32  0.0  0.0  12960   984 pts/0    S+   21:09   0:00 grep git
jovyan@jupyter-user1:~/materials-fa17$ git commit -m WIP
fatal: Unable to create '/home/jovyan/materials-fa17/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
jovyan@jupyter-user1:~/materials-fa17$ ps aux | grep git
jovyan       35  0.0  0.0  12960   984 pts/0    S+   21:09   0:00 grep git
jovyan@jupyter-user1:~/materials-fa17$ mv .git/index.lock .git/index.lock-renamed
jovyan@jupyter-user1:~/materials-fa17$ git commit -m WIP

After the WIP commit succeeded, invoking git-sync on /user/user1 succeeded.

What was the undefined error? There was no error in the user server log until I killed the 'D' git process, after which subprocess just logged that the git commit command died with SIGKILL.

Could git operations other than lock acquisition be timing out? It did take about 10-15 minutes for the WIP commit to complete.
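
The manual recovery above (moving .git/index.lock aside once no git process is left) could be automated. This is only a sketch under assumptions: the function name and the 10-minute threshold are made up for illustration, and removing a lock while a git process is still running is unsafe, so a real implementation would need to rule that out first.

```python
import os
import tempfile
import time


def remove_stale_index_lock(repo_dir, max_age_seconds=600):
    """Delete .git/index.lock if it is older than `max_age_seconds`.

    Returns True if a lock was removed, False otherwise.
    The threshold is an assumed value, not taken from nbgitpuller.
    """
    lock_path = os.path.join(repo_dir, ".git", "index.lock")
    try:
        age = time.time() - os.path.getmtime(lock_path)
    except FileNotFoundError:
        return False  # no lock present, nothing to do
    if age < max_age_seconds:
        return False  # lock is recent; a git process may still own it
    os.remove(lock_path)
    return True


# Demonstration on a throwaway directory that only mimics a repository layout.
with tempfile.TemporaryDirectory() as repo_dir:
    os.makedirs(os.path.join(repo_dir, ".git"))
    lock_path = os.path.join(repo_dir, ".git", "index.lock")
    open(lock_path, "w").close()

    fresh_removed = remove_stale_index_lock(repo_dir)  # new lock is kept

    an_hour_ago = time.time() - 3600
    os.utime(lock_path, (an_hour_ago, an_hour_ago))  # pretend the lock is old
    stale_removed = remove_stale_index_lock(repo_dir)
    lock_still_exists = os.path.exists(lock_path)

print(fresh_removed, stale_removed, lock_still_exists)
```

This only addresses the leftover lock file; it does not explain why the original commit hung for 10-15 minutes.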

Re-architect core pulling logic

The current pulling logic (in pull.py) is mostly ported over from nbpuller. While it works great now, it could be simplified and made more robust.

We should move the architecture of that piece of code to a reconciler. It splits the code into two simple parts:

  1. A state recognizer, that does a bunch of stuff to accurately figure out the state of the remote repository & the local git repository
  2. A state reconciler, that performs actions to bring the state of the two repositories closer together

There's probably a word for this design pattern, but I don't know what it is.

This makes it easy to unit test, easy to understand, and generally easier to reason about.
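
A rough sketch of the split described above, not the actual pull.py code. The recognizer's output is modeled as a plain data object, and the reconciler is a pure function from that object to a list of actions. All names here (RepoState, Action, plan) are hypothetical, and the three state fields are a simplification:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class Action(Enum):
    CLONE = auto()
    COMMIT_LOCAL_CHANGES = auto()
    MERGE_REMOTE = auto()


@dataclass(frozen=True)
class RepoState:
    """What a state recognizer would report after inspecting both repositories."""
    local_exists: bool
    has_uncommitted_changes: bool
    remote_has_new_commits: bool


def plan(state: RepoState) -> List[Action]:
    """Reconciler: decide which actions bring the local copy closer to the remote."""
    if not state.local_exists:
        return [Action.CLONE]
    actions = []
    if state.has_uncommitted_changes:
        # Preserve the user's work before touching the branch.
        actions.append(Action.COMMIT_LOCAL_CHANGES)
    if state.remote_has_new_commits:
        actions.append(Action.MERGE_REMOTE)
    return actions


# Because plan() never runs git, it can be checked with plain data.
first_visit = plan(RepoState(False, False, False))
edited_and_behind = plan(RepoState(True, True, True))
up_to_date = plan(RepoState(True, False, False))
print(first_visit, edited_and_behind, up_to_date)
```

The recognizer (which would actually run git commands to fill in RepoState) and the executor for each Action stay thin, so most of the decision logic can be unit tested without a repository on disk.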

git repository not added to jupyterhub

Hello,
I deployed a jupyterhub on google cloud and used nbgitpuller to pull a git repository to the jupyterhub directory:

postStart:
    exec:
      command: ["gitpuller", "https://github.com/xxxxx/xxxxxxx_notebooks.git", "master", "assessments"]

I expected an assessments directory in the Jupyter root, but it was empty.
Is there a problem with my config?
