Comments (6)
By its nature, a git filter can only act on the commit about to be made. It's not designed to "spawn" another commit in a different branch (I suppose it could be made to do that with some trickery, but I would rather not implement such a "hack" in nbstripout).
You can however emulate the behaviour you want by installing nbstripout only in a .gitattributes file in your release branch. Assuming your "dirty" notebooks are on master, you can create a clean branch as follows:
Create an orphan release branch and install nbstripout in that branch only:
git checkout --orphan release
nbstripout --install --attributes=.gitattributes
git add .gitattributes
git commit -m 'Install nbstripout'
You can then cherry-pick notebook commits as follows (you will probably have to pick them in their original order to avoid merge conflicts, unless each commit is entirely self-contained):
git cherry-pick --no-commit <commit>
git commit -a --no-edit
From my quick test, those two steps are necessary for the filter to kick in, i.e. if you just do a plain cherry-pick, the filter is not applied.
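For more than a handful of commits, the two-step cherry-pick above could be wrapped in a loop. The following is only a sketch: the function name `replay_filtered` is made up for illustration, and it assumes you are already on the release branch with nbstripout installed via .gitattributes.

```shell
# Hypothetical helper (not part of nbstripout): replay every commit reachable
# from the given ref or range onto the current branch, oldest first, using the
# two-step cherry-pick so that the clean filter gets a chance to run.
replay_filtered() {
    # --reverse lists commits oldest first; --no-merges skips merge commits,
    # which a plain cherry-pick cannot apply without extra options.
    for commit in $(git rev-list --reverse --no-merges "$1"); do
        # Step 1: apply the changes to the index and working tree only.
        git cherry-pick --no-commit "$commit" || return 1
        # Step 2: commit, reusing the message prepared by the cherry-pick.
        git commit -a --no-edit || return 1
    done
}
```

Calling `replay_filtered master` would then replay all of master. The loop stops at the first conflict, and also at a commit that turns out to be empty on the release branch, so you can resolve the problem by hand.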
Hope this helps!
from nbstripout.
Presumably, a variant of nbstripout could also be used to add a git filter that would automatically run a notebook when committing it to a repository to ensure that all its output cells are populated?
Not easily: as mentioned, nbstripout uses a git clean/smudge filter and operates purely on the file level. No cells are ever executed.
You would need to look at a pre-commit hook. However, I expect that's not too easy to set up: you'd need to start a notebook server, run the notebook and deal with failures. This would also take a long time.
If you only want to verify the output is populated, that's easier to do (and you could potentially reuse some of nbstripout's code for that).
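As a rough illustration of the "verify only" idea, a check could look for code cells whose output list is empty. This sketch does not reuse nbstripout's code; the function name `has_all_outputs` is hypothetical, and it assumes the notebook was saved by Jupyter, which writes an empty output list as `"outputs": []`.

```shell
# Hypothetical check (not part of nbstripout): succeed only if the notebook
# contains no cell with an empty outputs list.
# Assumes Jupyter's standard serialisation, where that appears as "outputs": [].
has_all_outputs() {
    # Treat an unreadable or missing file as an error rather than a pass.
    [ -r "$1" ] || return 2
    # grep -q succeeds when an empty outputs list is found, so invert the result.
    ! grep -q '"outputs": \[\]' "$1"
}
```

Note that a code cell which legitimately prints nothing (a bare assignment, say) also has an empty output list, so a check like this is stricter than "the notebook has been run".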
@kynan Thanks for that - will give it a try. git is still a bit voodoo to me; I need to clear some time and try to get a proper understanding of how it works and also clarify in my own mind exactly what sort of process I want to implement.
For generating newly run notebooks, could that be done elsewhere in a GitHub managed repository, e.g. using CI hooks to run something to create the new notebooks? (Apols - this is going off-topic for nbstripout, I'm thinking aloud through my fingers...)
There's another option I didn't think of earlier: you can use the git filter-branch approach described in the README.
By creating "new" notebooks, do you mean creating stripped versions from "full" versions? Or the other way round?
You presumably could use CI hooks to automate either variant, but I don't have anything to suggest since I haven't tried anything of that sort.
If you haven't come across https://mybinder.org before, I wonder if that could be a starting point.
@psychemedia have you found a suitable workflow for your needs?
@kynan I've actually moved to a workflow around jupytext now, which uses a text-based representation for notebooks (no cell outputs).
Reflecting back, I think that a git filter-branch --tree-filter approach would probably work okay for a release: create new branch, run the git filter to clean all the notebooks in it, commit.
Here's another example of that approach: rewriting the contents of a branch as text files using jupytext; in a branch, run:
git filter-branch --tree-filter 'jupytext --to md */*.ipynb && rm -f */*.ipynb' HEAD