kokes / nbviewer.js

284.0 11.0 35.0 1.92 MB

Client side rendering of Jupyter notebooks

License: MIT License

JavaScript 8.94% HTML 58.96% Go 0.70% Shell 0.25% Jupyter Notebook 31.15%

jupyter-notebook nbviewer

nbviewer.js's Issues

GH: Error handling/reporting

There is currently no error handling in the GitHub repo viewer. There are two classes of errors to handle:

General network errors
Going over the API limit

We might even report what remains of the GitHub limit (60 requests/hour) to the user; it's reported in the headers of the HTTP response.
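A minimal sketch of reading the remaining quota, assuming the X-RateLimit-Limit, X-RateLimit-Remaining and X-RateLimit-Reset response headers of the GitHub REST API. readRateLimit is a hypothetical helper name, not something that exists in nbviewer.js.

```javascript
// Hypothetical helper: pull rate-limit info out of response headers.
// Accepts anything with a get(name) method, e.g. the headers of a fetch() Response.
function readRateLimit(headers) {
    var limit = parseInt(headers.get('X-RateLimit-Limit'), 10);
    var remaining = parseInt(headers.get('X-RateLimit-Remaining'), 10);
    var reset = parseInt(headers.get('X-RateLimit-Reset'), 10); // UTC epoch seconds

    return {
        limit: isNaN(limit) ? null : limit,
        remaining: isNaN(remaining) ? null : remaining,
        resetsAt: isNaN(reset) ? null : new Date(reset * 1000),
        // only true when the header is present and says zero
        exhausted: remaining === 0
    };
}
```

With fetch, this would be called as readRateLimit(response.headers), and an exhausted quota could then be shown to the user together with the reset time.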

Inline CSS

Style all the elements in JavaScript. There are only a few directives, so we can avoid a dependency and make the library a single file.

I presume the performance penalty for styling each element instead of document-wide application is negligible.

GH: validate inputs

The GitHub viewer should validate its inputs, and the rules need to be identical to what GitHub allows for username/company and repo names.
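A rough sketch of what such checks might look like. The rules encoded here are assumptions and would need to be verified against what GitHub actually accepts: owner names of up to 39 alphanumeric characters with single hyphens that neither lead nor trail, and repo names of up to 100 characters drawn from letters, digits, '.', '-' and '_'. Both function names are made up for illustration.

```javascript
// ASSUMED rule: alphanumerics and single hyphens, no leading/trailing hyphen, max 39 chars.
var OWNER_RE = /^[a-z\d](?:[a-z\d]|-(?=[a-z\d])){0,38}$/i;
// ASSUMED rule: letters, digits, dot, hyphen, underscore, max 100 chars.
var REPO_RE = /^[A-Za-z0-9._-]{1,100}$/;

// Hypothetical validator for the username/company part of the input.
function isValidOwner(name) {
    return typeof name === 'string' && OWNER_RE.test(name);
}

// Hypothetical validator for the repository part of the input.
function isValidRepo(name) {
    // '.' and '..' are treated as invalid here since they are path components
    return typeof name === 'string' && REPO_RE.test(name) && name !== '.' && name !== '..';
}
```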

output type: error

Try 1/0 in a cell and then render the notebook here; it won't work.

We need to handle output type: error.

  {
   "cell_type": "code",
   "execution_count": 180,
   "metadata": {},
   "outputs": [
    {
     "ename": "ZeroDivisionError",
     "evalue": "division by zero",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mZeroDivisionError\u001b[0m                         Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-180-05c9758a9c21>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mZeroDivisionError\u001b[0m: division by zero"
     ]
    }
   ],
   "source": [
    "1/0"
   ]
  },
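One possible first step, sketched under the assumption that plain text is acceptable until ANSI colours are supported: join the traceback lines and strip the colour escape sequences. errorOutputToText is a hypothetical name, and the regular expression only covers the colour codes (ESC [ ... m) seen in the sample above.

```javascript
// Matches ANSI colour/style sequences such as "\u001b[0;31m" and "\u001b[0m".
var ANSI_SGR = /\u001b\[[0-9;]*m/g;

// Hypothetical helper: turn an output of type "error" into plain text.
function errorOutputToText(output) {
    if (output.output_type !== 'error') {
        throw new Error('not an error output: ' + output.output_type);
    }
    var lines = Array.isArray(output.traceback) ? output.traceback : [];
    if (lines.length === 0) {
        // no traceback supplied, fall back to the exception name and value
        return output.ename + ': ' + output.evalue;
    }
    return lines.join('\n').replace(ANSI_SGR, '');
}
```

The resulting string could be placed in a pre element, the same way stream output is.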

Github rendering - support gists

I presume those are accessible via the API

use older notebooks archive for testing

This predates (and inspires) the JetBrains dataset I've been using:

https://github.com/activityhistory/jupyter_on_github

Nicer rendering of tables

We could perhaps improve how tables are rendered, especially if they are DataFrames (alternating colours, white and gray, bold indices).

ANSI colours not supported

Originally reported here, but we also don't support this.

I created a reproducer (see branch ansi), but haven't looked into a fix yet.

This is what it should look like eventually:

testing: pin deps the same way we do in viewer.html

make sure our versions match
add package-lock.json

Incorporate changes from Microsoft's fork

Microsoft forked this for their Azure offering. Go through their changelog and see if there are things we could merge in.

They kept the file structure mostly intact, so we could just diff and see it right there.

Update external dependencies

Both Prism and Marked are seriously outdated.

Live demo link throws 404

The live demo link on the README.md page shows a 404 page.

stderr handling

Instead of stdout, we sometimes get stderr, which should be handled appropriately.

{
   "cell_type": "code",
   "execution_count": 946,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [

Currently, we emit an error (but continue on) here:

        // name in v4, stream in v3
        if ( (dt.name || dt.stream) != 'stdout' || dt.output_type != 'stream')
            console.error('unexpected stream spec');

We should

Make sure we don't err here (instead of != 'stdout', we should have ~'not in (stdout, stderr)')
Colour the output in red
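The relaxed check from the first point could look roughly like this. It is a sketch, not the current code: streamName is a hypothetical helper, and dt is the same output object as in the snippet above.

```javascript
var KNOWN_STREAMS = ['stdout', 'stderr'];

// Hypothetical helper: returns 'stdout' or 'stderr' for a valid stream output,
// or null when the stream spec is unexpected.
function streamName(dt) {
    var name = dt.name || dt.stream; // name in v4, stream in v3
    if (dt.output_type !== 'stream' || KNOWN_STREAMS.indexOf(name) === -1) {
        return null;
    }
    return name;
}
```

The caller would keep the console.error for the null case and could add a CSS class when the result is 'stderr', which covers the red colouring.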

To reproduce, trigger a non-fatal error in Python, for instance by doing something deprecated (e.g. writing to a view in pandas).

Example: warning.ipynb.txt

GH: UI

There are a number of elements in the header: instructions, examples, inputs, disclaimers, and so on. We should separate these into groups and make the interface clearer.

Download more samples and test them

I downloaded a sample from this piece of research by JetBrains (link); it will be useful for testing.

It's a public S3 bucket, so it was as easy as aws s3 sync s3://github-notebooks-update1/ data/ (you need to Control-C it, as there's a lot of data).

Originally posted by @kokes in #48 (comment)

Supply a minified version

(Updating to Sierra, I managed to destroy my local brew/node toolkits, so I can't set it up now.)

bokeh support

bokeh has very specific MIME in ipynb, see https://github.com/bokeh/jupyter_bokeh/blob/master/src/renderer.ts#L39

should we just hotload the js dependency if we see this MIME and then eval it?

we'd be starting a whole new era of potentially dangerous javascript evaluation

Duplicate rendering

When both text/plain and text/html are supplied, both are rendered. But it seems that the former is just a fallback format in case the latter can't be rendered.

Check the spec to see whether there are any other substitutes like this.

Chrome extension?

This is a great tool! I came here when my notebooks weren't rendering on github,
but this is so much quicker! I wonder if this could be turned into a chrome extension?
Is anyone doing this?

custom domain?

The tool is getting some traction, so it might deserve its own domain.

Move Pages to the master repo

As Github now allows rendering from the master repo, let's unify it for easier maintenance.

use flexbox/grid for cells and execution counts

I think the current layout is quite awkward. My CSS knowledge at the time predated all the cool and usable stuff introduced in the past five years.

note: check browser usage in GA, maybe we have to support IE/Edge

Margins and paddings

Cells with text/html are not properly indented (vertically) and the Out [n] indicator is weirdly placed most of the time, the In [n] one as well, but less so.

For the cells, make their box settings common across all to avoid this. For the execution count indicators... improve the absolute positioning.

Github rendering - fix non-latin characters

There seem to be javascript issues with encoding/decoding base64 data with non-latin characters.

Github rendering - add error reporting

There are many things that could go wrong, add UI-visible error reporting:

URL does not match what we need (see the regexp)
data is not fetched from the API for some reason
API limits are exhausted

Better support for v3 notebooks

This one is really broken in nbviewer.js

https://github.com/jupyter/nbformat/blob/master/nbformat/tests/test3.ipynb

Find out what languages to implement

This relates to Prism.js (our current highlighter). Shall we hotload language support from a CDN or should we just bundle the most popular languages (Python, Julia, R, Haskell?).

The former makes the viewing tool leaner, but does not allow for offline use and involves a lag if language spec isn't cached. The latter allows for compact definitions and fewer HTTP calls since you can create bundles on the Prism website. But since these are custom, we'd need to vendor this. And there'd be no support for other languages, unless we hotloaded them.

And that might be a good compromise - to bundle the most popular languages and hotload any unsupported ones.

(A good way to survey language popularity in Jupyter would be to search BigQuery, which has all the code and reportedly hundreds of thousands of notebooks.)

LaTeX support

Two things to consider.

Do we go with MathJax or KaTeX or something else entirely?
How does it square with Markdown parsing and rendering? Shall we parse all the maths first and feed it in with the markdown as pure HTML? Or do any markdown parsers allow for custom elements?

Our currently used library, marked.js, has an open pull request to support all this.

Problem when rendering ipynb

I found some error when I rendered .ipynb like this.

It should be

In commit 8098772 I replaced
dm.setAttribute('src', 'data:' + fmt + ';base64,' + dt.data[fmt].join(''));
with
dm.setAttribute('src', 'data:' + fmt + ';base64,' + dt.data[fmt]);

It works fine.
When I print dt.data[fmt], it shows a base64 URL.
I think the original one is right.
If there is something that I've missed, please let me know.

Make it modular wrt the Markdown converter and code highlighter

Currently dependent on Marked and Prism.js, but ideally modular with respect to both. The Markdown converter should be trivial (assuming it doesn't break potential LaTeX support), the code highlighter might be dependent on some HTML (e.g. Prism needs a language name in the <code> block).

Visual testing

Since it's quite hard to build proper tests (see #33), we can start by having non-automated visual tests. What I envisage the workflow to be:

Take screenshots of various notebooks in nbviewer.js
Work on whatever changes you want
Take the same screenshots again
Use something like pixelmatch to diff these

What we need in terms of infrastructure:

sample notebooks, versioned
a way to inject a notebook into the viewer (maybe we'll have a separate HTML file for testing, without the dropzone stuff)
some code to automate the screenshot taking and diffing

Adhere to specs

There is a clear explanation of the spec on the nbformat page. There's also a JSON spec for both v3 and v4. There are a few things we're not supporting: collapsed, scrolled, errors, and some mime types (especially in v3).

Steps now

Skim the specs and find out what is not supported. List it here to check off.
Split out to separate issues if quite big - eg JavaScript support is probably more complicated than eval, or is it?
Make sure we err when not supporting something - console.error at least, possibly in the DOM as well.
Track the changelog to find out what's changed.

More general wrapping

There is basic wrapping implemented in 1aa9045, but I'd like it to be extended to all pre elements - well, all but code inputs, which should overflow: scroll.

Also, make sure that the current CSS implementation is the most compatible one; there seems to be some contention about that.

Render on double click

There have been multiple requests to support rendering the file upon a double click.

An initial implementation is in /cmd/. It's a Go file that embeds all the necessary JS/HTML/CSS, saves it to a temporary folder alongside the clicked file, and then opens the browser.

Rendering on mobile

One of the reasons for this is that GitHub's Jupyter rendering does not work on mobile Safari for some reason. There isn't anything preventing us from supporting it; this is more about explicit CSS directives to make it usable there.

Choose a licence

Preferably

No restrictions on use or packaging (ie the licences of the work that uses this code) in private projects. I'd like to avoid any licensing conflicts.
Some basic attribution
It would be nice if people had to offer their forks to be merged here, to foster contribution and prevent diverging implementations, but I don't want it to be in conflict with point 1

testing: add CI

Screencast missing

I had a quick demo in README.md, inlined as a gif. Not sure where it went, but should be reinstated at some point.

Only the first cell is visible with certain files

When viewing a notebook generated by Literate.jl, only the first cell is displayed. Opening their example notebook using the online preview shows the problem.

This is the same issue as nbviewer-app issue 6, where @tuxu tells me that in "the first output cell, [] nbviewer.js expects an array, but only a string literal is found".

testing: compare to HTML to avoid regressions

Support exactly one cell output

As discovered in #28, we sometimes render multiple cell outputs, but we should always render just one. There is an ordering defined in nbconvert, so we should check each when parsing the data field.

Before implementing this

let's check a few notebooks (from tests, see #33) and how they render before and after.
see if we can implement some of the mime types that are not yet supported (e.g. application/pdf?)
grep as many notebooks as possible and check their mime types - maybe there are some extra mime types out there
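The selection step itself could be sketched as below. The priority list is purely illustrative and is NOT copied from nbconvert; the real ordering would have to be taken from there. pickMimeType is a hypothetical helper operating on the data field of an output.

```javascript
// Illustrative ordering only (assumption), richest format first.
var ILLUSTRATIVE_PRIORITY = [
    'text/html', 'image/svg+xml', 'image/png', 'image/jpeg', 'text/markdown', 'text/plain'
];

// Hypothetical helper: return the single mime type to render, or null if none is supported.
function pickMimeType(data, priority) {
    priority = priority || ILLUSTRATIVE_PRIORITY;
    for (var i = 0; i < priority.length; i++) {
        if (Object.prototype.hasOwnProperty.call(data, priority[i])) {
            return priority[i];
        }
    }
    return null;
}
```

With this in place, an output carrying both text/html and text/plain would render only the HTML, and a null result is a natural place for a console.error about an unsupported mime type.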

Render TOC

Option in settings. Render TOC based on h1-h6. Do not presume that a given Markdown library creates relevant header IDs.

Track nbformat changelog

Track the nbformat changelog to make sure we haven't missed any new features.

Build regression tests and a harness

I'm not sure what the best course of action here is. Whether we should just care about DOM generation or if we should go full "screenshot testing" - using Selenium or Puppeteer.

The main reason is to test things like overflows in pres and imgs, multiple outputs, legacy versions etc. Things that tend to break with changes to the library code.

Does not use neural networks.

Please add this. Priority: high.

IE does not support startsWith method of string object.
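One way around this without a polyfill, as a small sketch: lastIndexOf with a start position of 0 can only match at the very beginning of the string, and it is available in IE. The helper name is just for illustration.

```javascript
// ES5-compatible replacement for String.prototype.startsWith.
function startsWith(str, prefix) {
    // searching backwards from index 0 only ever checks position 0
    return str.lastIndexOf(prefix, 0) === 0;
}
```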

Slideshow support

There is reveal.js-backed support for slides in nbconvert.

I haven't looked into the implementation, not sure how it's built.

GH: base64 encoding

Data we get from the Github API is base64 encoded. Sadly, there are issues with b64 in JavaScript, notably with non-latin characters. There are ways of resolving this, but these involve several dependencies that we would like to avoid.

Another solution would be to submit a second AJAX request to the "true" URL of the target file and download it directly.
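A third option, sketched here on the assumption that the built-in TextDecoder is available (it is not in IE): decode the base64 payload to bytes with atob and then interpret those bytes as UTF-8. decodeBase64Utf8 is a hypothetical helper, not existing code.

```javascript
// Hypothetical helper: decode base64 that encodes UTF-8 text.
function decodeBase64Utf8(b64) {
    // drop any line breaks in the payload before decoding
    var binary = atob(b64.replace(/\s+/g, ''));
    var bytes = new Uint8Array(binary.length);
    for (var i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i); // each char holds one byte (0-255)
    }
    return new TextDecoder('utf-8').decode(bytes);
}
```

Using atob alone yields one character per byte, which is what mangles multi-byte characters; the TextDecoder step is what reassembles them.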

Handle all string/array enums correctly

There are quite a few fields that may be plain strings or arrays of strings - we handle them in various ways - either via typeofs or Array.isArray. We should have a helper that consumes a variable of type string|array[string] and outputs a string. That should solve pretty much all of these cases.

Last time this surfaced was in #44.
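The helper described above could be as small as this sketch. The name joinText and the choice to treat a missing value as an empty string are assumptions.

```javascript
// Hypothetical helper: accepts string | array[string] and always returns a string.
function joinText(value) {
    if (Array.isArray(value)) {
        return value.join('');
    }
    if (typeof value === 'string') {
        return value;
    }
    if (value === undefined || value === null) {
        return ''; // assumption: a missing field renders as nothing
    }
    throw new TypeError('expected string or array of strings, got ' + typeof value);
}
```

Call sites such as dt.data[fmt].join('') would then become joinText(dt.data[fmt]) and work for both shapes.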
