Comments (11)
This may well be possible with executablebooks/MyST-Parser#47
from myst-nb.
Having looked at scrapbook, I don't think it does quite what we want. But actually, using the same kind of mechanics, it's super easy to achieve this 'content recording'. Basically paste this code into a notebook, execute and save:
```python
def get_mimetypes(obj):
    if hasattr(obj, "_repr_mimebundle_"):
        return obj._repr_mimebundle_()
    mimebundle = {}
    for mimetype, method in (
        ("text/plain", "__str__"),
        ("text/html", "_repr_html_"),
        ("application/json", "_repr_json_"),
        ("image/jpeg", "_repr_jpeg_"),
        ("image/png", "_repr_png_"),
        ("image/svg+xml", "_repr_svg_"),
        ("text/latex", "_repr_latex_"),
    ):
        if hasattr(obj, method):
            mime_content = getattr(obj, method)()
            if mime_content is not None:
                mimebundle[mimetype] = mime_content
    return mimebundle


def record_outputs(obj, key, metadata=None):
    from IPython.display import display

    mimebundle = get_mimetypes(obj)
    if not mimebundle:
        raise ValueError("No mimebundle available")
    metadata = metadata or {}
    metadata.update({"record_key": key})
    display(
        {"recorded/" + k: v for k, v in mimebundle.items()},
        raw=True,
        metadata=metadata,
    )


a = "abc"
record_outputs(a, "mytext")

import pandas as pd

record_outputs(pd.DataFrame([1, 2, 3]), "mytable")
```
If you look at the notebook JSON, you'll see the outputs have all been saved, along with the keys in the metadata:
```json
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "recorded/text/plain": "abc"
          },
          "metadata": {
            "record_key": "mytext"
          },
          "output_type": "display_data"
        },
        {
          "data": {
            "recorded/text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>0</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n </tr>\n </tbody>\n</table>\n</div>",
            "recorded/text/plain": " 0\n0 1\n1 2\n2 3"
          },
          "metadata": {
            "record_key": "mytable"
          },
          "output_type": "display_data"
        }
      ]
    }
  ]
}
```
From this it's then super easy to query for record keys, given a reference like `` {scrap}`mytext` ``
(at least using an SQL database 😆)
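To illustrate the lookup, here is a minimal sketch over the raw notebook JSON, using plain `json`-style dicts rather than a database; `lookup_scrap` is a hypothetical helper name, not an existing API:

```python
def lookup_scrap(nb_json, key):
    """Find the display_data output recorded under `key` and return its
    mimebundle with the 'recorded/' prefix stripped off (sketch only)."""
    for cell in nb_json.get("cells", []):
        for output in cell.get("outputs", []):
            if (
                output.get("output_type") == "display_data"
                and output.get("metadata", {}).get("record_key") == key
            ):
                return {
                    mimetype[len("recorded/"):]: content
                    for mimetype, content in output["data"].items()
                    if mimetype.startswith("recorded/")
                }
    raise KeyError(key)


# Minimal notebook-shaped dict mirroring the JSON above
nb = {
    "cells": [
        {
            "outputs": [
                {
                    "output_type": "display_data",
                    "data": {"recorded/text/plain": "abc"},
                    "metadata": {"record_key": "mytext"},
                }
            ]
        }
    ]
}
print(lookup_scrap(nb, "mytext"))  # {'text/plain': 'abc'}
```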
from myst-nb.
@choldgraf, following our discussion, I looked at the scrapbook code, and I see how they do it now; combined with my example above, it can literally be condensed to a couple of lines of code:
```python
def glue(obj, name, display=False):
    import IPython
    from IPython.display import display as ipy_display

    mimebundle, metadata = IPython.core.formatters.format_display_data(obj)
    mime_prefix = "" if display else "application/papermill.record/"
    metadata["scrapbook"] = dict(name=name, mime_prefix=mime_prefix)
    ipy_display(
        {mime_prefix + k: v for k, v in mimebundle.items()},
        raw=True,
        metadata=metadata,
    )


a = "abc"
glue(a, "mytext")

import pandas as pd

glue(pd.DataFrame([1, 2, 3]), "mytable", display=True)
```
It just seems a lot of overhead to depend on a separate package for such a simple function.
from myst-nb.
@chrisjsewell yea that makes sense to me...so we only want a tiny fraction of what scrapbook offers (aka, "only try to store the display information of an object, or store the raw object values if it's a text object")
I'm less-concerned with which technical stack that we use, and more concerned with what kinds of behavior we'd ask users to take in order to use our stack. The main reason I was suggesting scrapbook was because it's the only pre-existing API I know of for "store some information inside the notebook to be re-used later" but perhaps we can piggy-back off of that pattern while using a much-simplified stack?
from myst-nb.
Yeh, as I see it the primary functionality is: "store the output mime-bundles of any object with a unique key identifier, in a consistent format for later querying" (text objects here are treated the same as any other object), and the potential secondary functionality is: "store this data without displaying it in the frontend notebook" (this is where you need to add a prefix to the mime type).
You can obviously do that with scrapbook; but then I fear you make the whole of your stack dependent on scrapbook; e.g. you would need to read in all notebooks with the scrapbook reader, rather than the standard nbformat one.
from myst-nb.
@chrisjsewell I've been thinking about this a bit more - do you imagine that this should be in `jupyter-cache`? I think `myst-nb` would need to have the logic of "given a notebook path, and given a role like `{scrap}` (maybe `{paste}`?), return the value according to the `{paste}` key that was given". But the question is: what's the thing that does the gluing in the first place?
from myst-nb.
You could do something like process the stored outputs for these stored scraps, at commit time, then store them in the DB commit record, for fast lookup. However, a complication is that the cache can store multiple versions of a notebook which would lead to key clashes, and the cached notebooks don't necessarily relate to what notebooks are being used in the sphinx build.
Therefore, I think this may be better handled in the MyST-NB parser:

- When we write the notebook here: https://github.com/ExecutableBookProject/MyST-NB/blob/bde00319a669e51bdfa24f30411f1d4052358b25/myst_nb/parser.py#L133, also find and add these scraps to a 'database' file:
  - JSON would be simplest, as long as there are no issues with read/write access that would be better handled by SQL.
  - Before adding, we would wipe any scraps previously associated with that docname.
  - We would also need to check and report on any key clashes.
- Then, when we want to find these scraps (probably in a transform), query this DB file for the required outputs.
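A rough sketch of that write path, assuming a flat JSON file keyed by docname (`update_scrap_db` and the file layout are hypothetical, not existing myst-nb code):

```python
import json
from pathlib import Path


def update_scrap_db(db_path, docname, scraps):
    """Record `scraps` (a dict of key -> mimebundle) for one docname,
    wiping that docname's old entries and reporting key clashes (sketch)."""
    path = Path(db_path)
    db = json.loads(path.read_text()) if path.exists() else {}
    # wipe anything previously associated with this docname
    db.pop(docname, None)
    # check and report any key clashes with other documents
    existing = {key for doc_scraps in db.values() for key in doc_scraps}
    clashes = existing & set(scraps)
    if clashes:
        raise KeyError(f"scrap key clash(es): {sorted(clashes)}")
    db[docname] = scraps
    path.write_text(json.dumps(db, indent=2))
```

A transform would then just `json.loads` the file and index by `docname`/key to pull out the required output bundles.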
from myst-nb.
so do you imagine something like:

```python
# Or maybe `from myst_nb import cache_variable`?
from jupyter_cache import cache_variable

a = 2
cache_variable("var_a", a)

ax = plt.plot([2], [3])
cache_variable("cool_plot", ax.figure)
```

and it'd store the mimebundle outputs in the notebook metadata (maybe `jupyter-cache: {}`)?
For a simplest implementation, we could do something like:

- At parse time, check the `ipynb` file for anything in `jupyter-cache:` and add it to a dictionary that spans all notebooks. Something like:

  ```python
  notebooks_cache = {
      "path_ntbk1": {"key": "val"},
      "path_ntbk2": {"key": "val"},
  }
  ```

- Expose a role like ``The value of my variable is {paste}`path_ntbk1:key` `` that will search for that notebook and retrieve the key. And a shorthand `` {paste}`key` `` that will search all the notebooks and return the proper value only if there is only one instance of the key. In this case it always tries to render the value as a `__repr__`-style output.

- Expose a directive like:

  ````
  ```{paste} path_to_notebook:key
  ```
  ````

  And it inserts a `CellNode` and `CellOutputBundleNode` at this location in the doctree, with the output bundle coming from that key.

Over time, this could be modified to use a smarter caching mechanism than a dictionary, but I'm just imagining a quick working prototype. WDYT?
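The role lookup against such a `notebooks_cache` dictionary could look something like the following (`resolve_paste` is a hypothetical helper, sketched only to show the two reference forms):

```python
def resolve_paste(notebooks_cache, target):
    """Resolve a {paste} reference: either `path:key`, or a bare `key`
    that must be unique across all notebooks (sketch only)."""
    if ":" in target:
        # explicit form, e.g. {paste}`path_ntbk1:key`
        path, key = target.split(":", 1)
        return notebooks_cache[path][key]
    # shorthand form, e.g. {paste}`key` -- only valid for a unique match
    hits = [
        doc_keys[target]
        for doc_keys in notebooks_cache.values()
        if target in doc_keys
    ]
    if len(hits) != 1:
        raise KeyError(f"key {target!r} found in {len(hits)} notebooks")
    return hits[0]


cache = {"path_ntbk1": {"key": "val"}, "path_ntbk2": {"other": "val2"}}
print(resolve_paste(cache, "path_ntbk1:key"))  # val
print(resolve_paste(cache, "other"))           # val2
```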
from myst-nb.
> and it'd store the mimebundle outputs in the notebook metadata

It would probably just store it in the outputs, but with a prefix on the mimetype, so that it's ignored by general renderers (as scrapbook does)
from myst-nb.
Ah yeah, and then as we loop through the cells, check the outputs for a mimetype meant for caching and then store it?
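That scanning loop might look like the following sketch, assuming the scrapbook-style `application/papermill.record/` prefix and `scrapbook` metadata from the `glue` example above (`collect_scraps` is an illustrative name):

```python
MIME_PREFIX = "application/papermill.record/"


def collect_scraps(nb_json):
    """Index prefixed output mimebundles by their scrap name (sketch)."""
    scraps = {}
    for cell in nb_json.get("cells", []):
        for output in cell.get("outputs", []):
            name = output.get("metadata", {}).get("scrapbook", {}).get("name")
            data = {
                mimetype[len(MIME_PREFIX):]: content
                for mimetype, content in output.get("data", {}).items()
                if mimetype.startswith(MIME_PREFIX)
            }
            if name and data:
                scraps[name] = data
    return scraps
```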
from myst-nb.
> For a simplest implementation, we could do something like:

But yeh, that's the general idea
from myst-nb.