Giter Club home page Giter Club logo

Comments (29)

jpeacock29 avatar jpeacock29 commented on July 23, 2024 1

This issue arises in git generally because of how internal and external diff tools are handled. As explored here (for a similar case but with meld rather than nbdime as the external diff driver) external diff drivers compare the working tree file with a smudged repo file, while internal diff drivers compare the cleaned working tree file with the repo file. From that thread, it doesn't appear to be currently possible to override this in git.

A flag in nbdiff would be great, but as a stopgap I'm wrapping nbstripout around git-nbdiffdriver and then using the script as the new diff driver:

#!/bin/bash

# pass arguments through to git-nbdiffdriver, but pass 
# a stripped version of the working tree file
git-nbdiffdriver diff $1 $2 $3 $4 <(cat $1 | nbstripout) $6 $7

Making this file executable, adding it to the path and then adding it as a new diff driver will do the trick.

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024 1

I repeated the steps from my comment above using nbdime 1.0.5 and nbstripout 0.3.4 and the difference is that nbdiff now works straight out of the box. git diff and nbdiff still do show different things which is slightly confusing. For me atm, this is good enough.

from nbstripout.

chris-rands avatar chris-rands commented on July 23, 2024 1

Thanks for all the thoughts on this, did anyone find a solution that means git diff works out the box as expected with both nbdime and nbstripout installed?

from nbstripout.

kynan avatar kynan commented on July 23, 2024

Can you be a bit more specific how you configured nbdime and nbstripout?

I haven't used nbdime in anger yet but did a simple test and found them to interact without any issue:

git init foo && cd foo
nbstripout --install
git-nbdiffdriver config --enable  # without --global
git-nbmergedriver config --enable  # without --global
# Create notebook foo.ipynb
git add foo.ipynb
git diff --cached  # nbstripout kicking in, output stripped as expected
git commit -m init
# Make some changes to the notebook
git diff  # nbstripout kicking in, output stripped as expected
git difftool --tool nbdime  # diff shown in nbdime format, output *not* stripped

Note that in this setup nbstripout's configuration is written to .git/info/attributes whereas nbdime's configuration is written to .gitattributes.

This is the behavior I'd expect. Do you get / expect something different?

from nbstripout.

kynan avatar kynan commented on July 23, 2024

I've also tried written both configs to .gitattributes and they don't seem to conflict. That's what I expect since nbdime installs a diff and merge driver whereas nbstripout installs a clean and smudge filter.

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024

Sorry for being not precise enough here. The behaviour you are describing is what I'd expect, but not what I see. Your git diff # nbstripout kicking in, output stripped as expected does use nbdime I assume?

For me nbstripout isn't working anymore as soon as nbdime is active. git diff shows changes in a notebook using nbdime, but outputs aren't stripped out. git add even puts the output onto the staging area -- and git commit commits them eventually. So that's not what I want.

As this is working for you, apparently it's a config issue on my side. I cannot spot any particularities in my setup though. It's purely repo local with gitattributes looking like this

*.ipynb filter=nbstripout

*.ipynb diff=jupyternotebook

and .git/config looking like this

[...]
[filter "nbstripout"]
	clean = ~/bin/miniconda3/envs/dev/bin/python ~/bin/miniconda3/envs/dev/lib/python3.5/site-packages/nbstripout.py
	smudge = cat
	required = true
[diff "jupyternotebook"]
	command = git-nbdiffdriver diff

Anyway, this being a config issue, feel free to close.

from nbstripout.

kynan avatar kynan commented on July 23, 2024

Your git diff # nbstripout kicking in, output stripped as expected does use nbdime I assume?

Aha, I had forgotten that I have an alias for git diff --no-ext-diff which I normally use (and which shows a "normal" unified diff in this case). A regular git diff does indeed use nbdime.

Regardless, git add should still respect the nbstripout filter and in your commit you should not get the output. Can you confirm this?

To summarise:

  • use git diff to get an nbdime diff with output
  • use git diff --no-ext-diff to get a diff with the output stripped by nbstripout

So what you really want is an nbdime diff with the output stripped by nbstripout?

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024

So what you really want is an nbdime diff with the output stripped by nbstripout?

Yes, that's what I would like to have. It's nice to be able to use them side by side, but in the best case one could use them at the same time.

Regardless, git add should still respect the nbstripout filter and in your commit you should not get the output. Can you confirm this?

Yes I can confirm that, I just tried it. Slightly weird caveat though is that as soon as you have staged some changes, git diff --cached will use nbdime and show the changes including the output. This might be a Git particularity, but it's really confusing, because you would expect that whatever is staged (= whatever is shown by git diff --cached) would be commited. But that's not the case in this setup.

from nbstripout.

kynan avatar kynan commented on July 23, 2024

OK, I'm not sure how to best get nbdime and nbstripout to cooperate nicely. Maybe the nbdime devs have suggestions? @vidartf @martinal @minrk

you would expect that whatever is staged (= whatever is shown by git diff --cached) would be commited

Does git diff --no-ext-diff --cached not work in this case?

from nbstripout.

minrk avatar minrk commented on July 23, 2024

I'd have to look into the relationship of filters and drivers when both are active. nbdiff commands should get a flag for ignoring outputs, metadata, etc. nbshow has such flags, but nbdiff does not.

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

A flag in nbdime is now available (v. 0.3.0) (either use -s/--sources to only show source diff, or use -O/--ignore-outputs to exclude outputs from diff display).

from nbstripout.

jpeacock29 avatar jpeacock29 commented on July 23, 2024

Thanks, vidartf! However, I'm not sure either of these flags would solve OP's problem since neither appears to equivalent to nbstripout, which removes execution counts, output and metadata. For example, even with the -s flag, I'm still getting diffs for execution counts, although this may be a bug. Are these flags documented anywhere? I couldn't find much.

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

I don't think the flag will solve the problem on its own, but it should prevent the need for the stopgap diff driver wrapper.

The execution count is not really either source, metadata or output, so it got left out of the initial fix. It could either be included with metadata flag, or get its own flag (which could also include other uncovered fields like nbformat/nbformat_minor). The only documentation for the flags currently is the CLI --help text.

from nbstripout.

jpeacock29 avatar jpeacock29 commented on July 23, 2024

Maybe I'm missing something, but since nbdime flags aren't equivalent to nbstripout, won't the diff given by nbdime still be incorrect? (That is, not reflective of what will be committed because of nbstripout filter.)

from nbstripout.

kynan avatar kynan commented on July 23, 2024

Any update on this @timtroendle et. al.?

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024

@kynan apologies for being silent for quite a while. I am not using notebooks at the moment, gave it a quick try, but cannot make nbstripout work right away in a fresh install: I actually don't see any diff at the moment. Not sure what's wrong but I don't have the time at the moment to dig deeper.

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

I've just opened a PR on nbdime (jupyter/nbdime#335) which seeks to ease integration with git filters. One problem that cropped up is that git diff drivers / tools cannot reliably determine whether it is diffing a blob or a file from the working tree, i.e. if it should apply the filters or not. Due to this, it needs to be hidden behind a flag (that defaults to off). For nbdiff with git references it should work as expected. See the PR for details. Any feedback is welcome!

from nbstripout.

kynan avatar kynan commented on July 23, 2024

@vidartf I can see your PR was merged.

@timtroendle Does this resolve your issue?

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024

I am afraid to say that this doesn't seem to solve the issue -- at least not with my setup and without manual tweaking. Here is what I do:

  • I have nbdime 1.0.2 and nbstripout 0.3.3 installed
  • I have a fresh Git repo with one commit including a notebook and a change in the notebook
  • git diff shows me execution count and output and no nbdime magic -- as expected
  • I run nbdime config-git --enable and now the diff shows me the nbdime magic, but still including execution count and output -- as expected
  • I run nbstripout --install and now the diff shows me no output or execution count, but no nbdime magic either

Instead, I'd hope to see the nbdime diff but without anything that is stripped out by nbstripout.

from nbstripout.

kynan avatar kynan commented on July 23, 2024

Thanks for the update. I hope to find time to take a look.

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

Instead, I'd hope to see the nbdime diff but without anything that is stripped out by nbstripout.

That is what you should be seeing. This seems to me as a config issue, as nbstripout also configures diff drivers. Is there a difference if you call nbdiff instead of git diff? What if you perform manual config for nbstripout, but skip the diff parts?

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024

Is there a difference if you call nbdiff instead of git diff?

nbdiff throws an error:

 FileNotFoundError: [Errno 2] No such file or directory: '"~/miniconda3/envs/nbdime/bin/python3.6"': '"~/miniconda3/envs/nbdime/bin/python3.6"'

(The file exists.)

What if you perform manual config for nbstripout, but skip the diff parts?

If I skip the last line of the manual config (*.ipynb diff=ipynb), git diff shows me nbdime, but nothing stripped out. If I do add the line, git diff shows me everything stripped out, but not nbdime.

However, if I run nbdiff without that last line, I do see exactly what I was looking for: output is stripped, and the diff is shown as per nbdime. Also, committing the changes does consider nbstripout.


So this is almost what I think would be great, but still has confusion from the difference in what git diff shows and what gets committed eventually (which is what nbdiff shows as well).

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

nbdiff throws an error:

Would you mind including the full stack trace? You can put it in a block like this to make it collapsible (nice github trick):

<details><summary>Full trace</summary>
<pre>
 -- put your trace here --
</pre>
</details>
Full trace
 -- put your trace here -- 

Could you try to add the argument --use-filter to the git diff driver config for nbdime?

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

PS: Please note the caveat in this comment in nbdime for why --use-filter is not enabled by default.

from nbstripout.

timtroendle avatar timtroendle commented on July 23, 2024

Here's the stack trace when running nbdiff in a repo set up automatically as described in my post from August 6.

Full trace Traceback (most recent call last): File "/Users/trtim/miniconda3/envs/nbdime/bin/nbdiff", line 11, in sys.exit(main()) File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/site-packages/nbdime/nbdiffapp.py", line 128, in main return main_diff(arguments) File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/site-packages/nbdime/nbdiffapp.py", line 39, in main_diff for fbase, fremote in changed_notebooks(base, remote, paths): File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/site-packages/nbdime/gitfiles.py", line 160, in changed_notebooks entry.b_path, entry.b_blob, ref_remote, repo_dir) File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/site-packages/nbdime/gitfiles.py", line 110, in _get_diff_entry_stream ret = apply_possible_filter(path) File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/site-packages/nbdime/vcs/git/filter_integration.py", line 80, in apply_possible_filter stderr=STDOUT, shell=USE_SHELL File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/subprocess.py", line 336, in check_output **kwargs).stdout File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/subprocess.py", line 403, in run with Popen(*popenargs, **kwargs) as process: File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/subprocess.py", line 709, in __init__ restore_signals, start_new_session) File "/Users/trtim/miniconda3/envs/nbdime/lib/python3.6/subprocess.py", line 1344, in _execute_child raise child_exception_type(errno_num, err_msg, err_filename) FileNotFoundError: [Errno 2] No such file or directory: '"/Users/trtim/miniconda3/envs/nbdime/bin/python3.6"': '"/Users/trtim/miniconda3/envs/nbdime/bin/python3.6"'

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

@timtroendle I opened a separate issue for the exception here: jupyter/nbdime#415

from nbstripout.

kynan avatar kynan commented on July 23, 2024

@vidartf @timtroendle I can see jupyter/nbdime#415 was fixed in 1.0.3.

Does that solve the issue? Can we close this?

from nbstripout.

vidartf avatar vidartf commented on July 23, 2024

Thanks for following this up. Feel free to open issues on the nbdime repo if there are any remaining actionable steps to take to further improve the cooperation between these tools! :)

from nbstripout.

kynan avatar kynan commented on July 23, 2024

I've not tried this in a while. @chris-rands could you provide reproduction steps i.e. what are you doing? What are you seeing? What would you expect / hope to see?

Ideally file a new issue for that.

from nbstripout.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.