kokes / nbviewer.js
Client side rendering of Jupyter notebooks
License: MIT License
There is currently no handling of errors in the GitHub repo. There will be two classes of these:
We might even report what's remaining of the GitHub rate limit (60 requests/hour) to the user; it's reported in the headers of the HTTP response.
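A minimal sketch of reading the remaining quota, assuming the API response carries the X-RateLimit-Remaining and X-RateLimit-Limit headers and that they are reachable through a get(name) method, as on fetch's response.headers. The name rateLimitInfo is hypothetical and not part of the library.

```javascript
// Hypothetical helper (not part of nbviewer.js): extracts rate-limit info
// from response headers. `headers` is anything with a get(name) method.
function rateLimitInfo(headers) {
  const remaining = parseInt(headers.get('X-RateLimit-Remaining'), 10);
  const limit = parseInt(headers.get('X-RateLimit-Limit'), 10);
  if (Number.isNaN(remaining) || Number.isNaN(limit)) {
    return null; // headers missing or unreadable
  }
  return { remaining: remaining, limit: limit, exhausted: remaining === 0 };
}
```

With XMLHttpRequest, the same function could be fed a small adapter such as { get: function (n) { return xhr.getResponseHeader(n); } }.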
Style all the elements in JavaScript, there's only a few directives, that way we can avoid a dependency and make the library into a single file.
I presume the performance penalty for styling each element instead of document-wide application is negligible.
The GitHub viewer should validate inputs, but the rules need to be identical to what GitHub allows username/company and repo names to be.
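A rough sketch of such validation. The rules encoded here are assumptions to be checked against what GitHub actually accepts: owner names of at most 39 alphanumeric characters with single, non-leading, non-trailing hyphens, and repo names of at most 100 characters drawn from letters, digits, dot, hyphen and underscore. Both function names are hypothetical.

```javascript
// Assumed rule: alphanumerics and single hyphens, no leading/trailing hyphen, max 39 chars.
const OWNER_RE = /^[a-z\d](?:[a-z\d]|-(?=[a-z\d])){0,38}$/i;
// Assumed rule: letters, digits, '.', '-', '_', between 1 and 100 chars.
const REPO_RE = /^[A-Za-z0-9._-]{1,100}$/;

function isValidOwner(name) {
  return typeof name === 'string' && OWNER_RE.test(name);
}

function isValidRepo(name) {
  // '.' and '..' match the character class but are rejected explicitly.
  return typeof name === 'string' && REPO_RE.test(name) &&
    name !== '.' && name !== '..';
}
```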
Try 1/0 in a cell and then render it here; it won't work.
We need to handle output type: error.
{
"cell_type": "code",
"execution_count": 180,
"metadata": {},
"outputs": [
{
"ename": "ZeroDivisionError",
"evalue": "division by zero",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mZeroDivisionError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-180-05c9758a9c21>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mZeroDivisionError\u001b[0m: division by zero"
]
}
],
"source": [
"1/0"
]
},
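A minimal sketch of handling such an output, assuming the simplest acceptable rendering is the traceback as plain text with the ANSI colour codes removed (mapping colours to styles would be a separate step). The helper name errorOutputToText is hypothetical.

```javascript
// Matches ANSI SGR colour sequences such as "\u001b[0;31m" seen in the traceback above.
const ANSI_SGR = /\u001b\[[0-9;]*m/g;

// Hypothetical helper: turns an "error" output into plain text,
// e.g. for use as the textContent of a <pre> element.
function errorOutputToText(output) {
  if (output.output_type !== 'error') {
    throw new Error('not an error output: ' + output.output_type);
  }
  const lines = Array.isArray(output.traceback) ? output.traceback : [];
  if (lines.length === 0) {
    // No traceback available, fall back to "ename: evalue".
    return output.ename + ': ' + output.evalue;
  }
  return lines.join('\n').replace(ANSI_SGR, '');
}
```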
I presume those are accessible via the API
This predates (and inspires) the JetBrains dataset I've been using:
We could perhaps improve how tables are rendered, especially if they are DataFrames (alternating colours, white and gray, bold indices).
Originally reported here, but we also don't support this.
I created a reproducer (see branch ansi), but haven't looked into a fix yet.
Microsoft forked this for their Azure offering. Go through their changelog and see if there are things we could merge in.
They kept the file structure mostly intact, so we could just diff and see it right there.
Both Prism and Marked are seriously outdated.
The live demo link on the README.md page shows a 404 page.
Instead of stdout, we sometimes get stderr, which should be handled appropriately.
{
"cell_type": "code",
"execution_count": 946,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
Currently, we emit an error (but continue on) here
// name in v4, stream in v3
if ( (dt.name || dt.stream) != 'stdout' || dt.output_type != 'stream')
console.error('unexpected stream spec');
We should
To reproduce, trigger a non-fatal error in Python, e.g. by using something deprecated (such as writing to a view in pandas).
Example: warning.ipynb.txt
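One possible shape for the check, sketched under the assumption that stderr should be rendered like stdout but with a distinguishing CSS class. The function names and the nb-stdout / nb-stderr class names are hypothetical.

```javascript
// Hypothetical helper: returns 'stdout' or 'stderr' for a stream output,
// or null when the output is not a recognised stream.
function streamKind(dt) {
  if (dt.output_type !== 'stream') return null;
  const name = dt.name || dt.stream; // name in v4, stream in v3
  return (name === 'stdout' || name === 'stderr') ? name : null;
}

// Hypothetical: picks a CSS class so stderr can be styled differently.
function streamClassName(dt) {
  const kind = streamKind(dt);
  return kind === null ? null : 'nb-' + kind;
}
```

Only a null result would then need the console.error path.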
There are a number of elements in the header: instructions, examples, inputs, disclaimers, ... We should separate these out into groups and make sure that the interface is clearer.
I downloaded a sample from this piece of research by JetBrains (link); it will be useful for testing.
It's a public S3 bucket, so it was as easy as aws s3 sync s3://github-notebooks-update1/ data/ (need to Control-C it, there's a lot of data)
Originally posted by @kokes in #48 (comment)
(Updating to Sierra, I managed to destroy my local brew/node toolkits, so I can't set it up now.)
Bokeh uses a very specific MIME type in ipynb, see https://github.com/bokeh/jupyter_bokeh/blob/master/src/renderer.ts#L39
Should we just hotload the JS dependency if we see this MIME type and then eval it?
We'd be starting a whole new era of potentially dangerous JavaScript evaluation.
When both text/plain and text/html are supplied, both are rendered. But it seems that the former is just a fallback format in case the latter can't be rendered.
Check the spec if there are any other substitutes like this.
This is a great tool! I came here when my notebooks weren't rendering on github,
but this is so much quicker! I wonder if this could be turned into a chrome extension?
Is anyone doing this?
The tool is getting some traction, so it might deserve its own domain.
As Github now allows rendering from the master repo, let's unify it for easier maintenance.
I think the current layout is quite awkward and my css knowledge at the time predated all the cool and usable stuff introduced in the past five years
note: check browser usage in GA, maybe we have to support IE/Edge
Cells with text/html are not properly indented (vertically) and the Out [n] indicator is weirdly placed most of the time, the In [n] one as well, but less so.
For the cells, make their box settings common across all to avoid this. For the execution count indicators... improve the absolute positioning.
There seem to be javascript issues with encoding/decoding base64 data with non-latin characters.
There are many things that could go wrong, add UI-visible error reporting:
This one is really broken in nbviewer.js
https://github.com/jupyter/nbformat/blob/master/nbformat/tests/test3.ipynb
This relates to Prism.js (our current highlighter). Shall we hotload language support from a CDN or should we just bundle the most popular languages (Python, Julia, R, Haskell?).
The former makes the viewing tool leaner, but does not allow for offline use and involves a lag if language spec isn't cached. The latter allows for compact definitions and fewer HTTP calls since you can create bundles on the Prism website. But since these are custom, we'd need to vendor this. And there'd be no support for other languages, unless we hotloaded them.
And that might be a good compromise - to bundle the most popular languages and hotload any unsupported ones.
(A good way to survey language popularity in Jupyter would be to search BigQuery, which has all the code and reportedly hundreds of thousands of notebooks.)
Two things to consider.
Our currently used library, marked.js, has an open pull request to support all this.
I found some error when I rendered .ipynb like this.
It should be
In commit 8098772, I replaced
dm.setAttribute('src', 'data:' + fmt + ';base64,' + dt.data[fmt].join(''));
with
dm.setAttribute('src', 'data:' + fmt + ';base64,' + dt.data[fmt]);
and it works fine.
When I print dt.data[fmt], it shows a base64 URL.
I think the original one is right.
If there is something that I've missed, please notify me
Currently dependent on Marked and Prism.js, but ideally modular with respect to both. The Markdown converter should be trivial (assuming it doesn't break potential LaTeX support), the code highlighter might be dependent on some HTML (e.g. Prism needs a language name in the <code> block).
Since it's quite hard to build proper tests (see #33), we can start by having non-automated visual tests. What I envisage the workflow to be:
What we need in terms of infrastructure:
There is a clear explanation of the spec on the nbformat page. There's also a JSON spec for both v3 and v4. There's a few things we're not supporting - collapsed, scrolled, errors, some mime types (especially in v3).
Steps now
eval, or is it?
console.error at least, possibly in the DOM as well.
There is basic wrapping implemented in 1aa9045, but I'd like it to be extended to all pre elements - well, all but code inputs, which should overflow: scroll.
Also, make sure that the current CSS implementation is the most compatible one, there seems to be some contention about that.
There have been multiple requests to support rendering the file upon a double click.
An initial implementation is in /cmd/: it's a Go file that embeds all the necessary JS/HTML/CSS, saves it to a temporary folder alongside the clicked file, and then opens the browser.
One of the reasons for this is that Github's Jupyter rendering does not work on mobile Safari for some reason. There isn't anything preventing us from supporting it, this is more about explicit CSS directives to make it usable there.
Preferably
I had a quick demo in README.md, inlined as a gif. Not sure where it went, but should be reinstated at some point.
When viewing a notebook generated by Literate.jl, only the first cell is displayed. Opening their example notebook using the online preview shows the problem.
This is the same issue as nbviewer-app issue 6, where @tuxu tells me that in "the first output cell, [] nbviewer.js expects an array, but only a string literal is found".
As discovered in #28, we sometimes render multiple cell outputs, but we should always render just one. There is an ordering defined in nbconvert, so we should check each when parsing the data field.
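A sketch of picking a single representation from the data field. The priority list below is illustrative only (richest first) and is an assumption; it should be replaced with the ordering actually defined in nbconvert. The name pickMimeType is hypothetical.

```javascript
// Illustrative priority, richest first. An assumption, not nbconvert's verified list.
const DISPLAY_PRIORITY = [
  'text/html',
  'image/svg+xml',
  'image/png',
  'image/jpeg',
  'text/markdown',
  'text/latex',
  'text/plain'
];

// Hypothetical helper: returns the first MIME type from the priority list
// that is present in `data`, or null if none of them is.
function pickMimeType(data, priority) {
  const order = priority || DISPLAY_PRIORITY;
  for (const mime of order) {
    if (Object.prototype.hasOwnProperty.call(data, mime)) {
      return mime;
    }
  }
  return null;
}
```

With this in place, an output carrying both text/plain and text/html would be rendered once, as HTML, and text/plain would only be used as a fallback.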
Before implementing this
Option in settings. Render TOC based on h1-h6. Do not presume that a given Markdown library creates relevant header IDs.
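A small sketch of the ID part, assuming headings have already been collected as {level, text} pairs. It generates its own IDs rather than relying on whatever the Markdown library emits. The function name buildToc and the slug rules are hypothetical.

```javascript
// Hypothetical helper: builds TOC entries from headings given as {level, text},
// assigning its own ids and de-duplicating repeated titles with a numeric suffix.
function buildToc(headings) {
  const seen = new Map();
  return headings.map(function (h) {
    const base = h.text.trim().toLowerCase()
      .replace(/[^\w\s-]/g, '')   // drop punctuation (ASCII word chars only)
      .replace(/\s+/g, '-') || 'section';
    const count = seen.get(base) || 0;
    seen.set(base, count + 1);
    return {
      level: h.level,
      text: h.text,
      id: count === 0 ? base : base + '-' + count
    };
  });
}
```

In the browser, the input could be gathered with querySelectorAll('h1, h2, h3, h4, h5, h6'), taking the level from the tag name, and each generated id would then be assigned back to its element.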
Track the nbformat changelog to make sure we haven't missed any new features.
I'm not sure what the best course of action here is. Whether we should just care about DOM generation or if we should go full "screenshot testing" - using Selenium or Puppeteer.
The main reason is to test things like overflows in pre and img elements, multiple outputs, legacy versions etc. Things that tend to break with changes to the library code.
Please add this. Priority: high.
There is a reveal-backed support for slides in nbconvert.
I haven't looked into the implementation, not sure how it's built.
Data we get from the Github API is base64 encoded. Sadly, there are issues with b64 in JavaScript, notably with non-latin characters. There are ways of resolving this, but these involve several dependencies that we would like to avoid.
Another solution would be to submit a second AJAX request to the "true" URL of the target file and download it directly.
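For comparison, a dependency-free sketch of decoding the base64 payload as UTF-8. It assumes the environment provides atob and TextDecoder, which rules out older browsers such as IE, so it may not be acceptable depending on the browsers we need to support. The name decodeBase64Utf8 is hypothetical.

```javascript
// Hypothetical helper: decodes a base64 string whose bytes are UTF-8 text.
// Assumes global atob and TextDecoder are available.
function decodeBase64Utf8(b64) {
  // Strip whitespace first, since payloads may be wrapped with newlines.
  const binary = atob(b64.replace(/\s+/g, ''));
  // atob yields one character per byte; copy those bytes out...
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // ...and decode them as UTF-8 rather than Latin-1.
  return new TextDecoder('utf-8').decode(bytes);
}
```

Calling atob alone is what goes wrong with non-latin characters: it returns one character per byte, so multi-byte UTF-8 sequences come out as several garbled characters.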
There are quite a few fields that may be plain strings or arrays of strings - we handle them in various ways - either via typeofs or Array.isArray. We should have a helper that consumes a variable of type string|array[string] and outputs a string. That should solve pretty much all of these cases.
Last time this surfaced was in #44.
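A minimal sketch of such a helper. The name joinText is hypothetical, and treating missing values as an empty string is an assumption about what callers would want.

```javascript
// Hypothetical helper: accepts string | array[string] and always returns a string.
function joinText(value) {
  if (Array.isArray(value)) return value.join('');
  if (typeof value === 'string') return value;
  if (value === undefined || value === null) return ''; // assumption: missing means empty
  throw new TypeError('expected a string or an array of strings');
}
```

Call sites such as dt.data[fmt].join('') would then become joinText(dt.data[fmt]), which works whether the notebook stores the field as a single string or as a list of lines.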