nbviewer.js's People

Contributors

kokes, tuxu

nbviewer.js's Issues

GH: Error handling/reporting

There is currently no error handling for the GitHub repo. There will be two classes of errors:

  1. General network errors
  2. Going over the API limit

We might even show the user how much of the GitHub limit (60 requests/hour) remains; it is reported in the headers of the HTTP response.
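A minimal sketch of reading the remaining quota, assuming the response comes from fetch() and that GitHub sends its x-ratelimit-* headers. The function names and the return shape are hypothetical, not part of nbviewer.js:

```javascript
// Reads GitHub's rate-limit headers from any object with a get(name) method,
// e.g. the `headers` property of a fetch() Response.
function readRateLimit(headers) {
  const toInt = (name) => {
    const raw = headers.get(name);
    return raw === null || raw === undefined ? null : parseInt(raw, 10);
  };
  const resetSeconds = toInt('x-ratelimit-reset'); // UTC epoch seconds
  return {
    limit: toInt('x-ratelimit-limit'),
    remaining: toInt('x-ratelimit-remaining'),
    resetsAt: resetSeconds === null ? null : new Date(resetSeconds * 1000),
  };
}

// Builds a short, user-visible message from the parsed values.
function describeRateLimit(info) {
  if (info.remaining === null) return 'Rate limit unknown';
  if (info.remaining === 0) return 'GitHub API limit exhausted';
  return info.remaining + ' of ' + info.limit + ' GitHub API requests remaining';
}
```

General network errors would still need their own handling, for instance a catch around the request itself.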

Inline CSS

Style all the elements in JavaScript. There are only a few directives, so this way we can avoid a dependency and ship the library as a single file.
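A rough sketch of what this could look like, assuming a small table of rules keyed by element kind. The STYLES table, its values and the applyStyle name are all hypothetical; the real directives would come from the existing stylesheet:

```javascript
// Hypothetical style table; keys are element kinds, values are CSS properties
// in their camelCase DOM form.
const STYLES = {
  cell: { margin: '0 0 1em 0', padding: '0.5em' },
  stderr: { color: 'red', whiteSpace: 'pre-wrap' },
};

// Copies the rules for `kind` onto the element's inline style.
// Unknown kinds leave the element untouched.
function applyStyle(element, kind) {
  Object.assign(element.style, STYLES[kind] || {});
  return element;
}
```

In the browser this would be called right after creating a node, e.g. applyStyle(document.createElement('pre'), 'stderr').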

I presume the performance penalty for styling each element instead of document-wide application is negligible.

GH: validate inputs

The GitHub viewer should validate inputs, but the rules need to be identical to what GitHub allows for username/company and repo names.
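A possible starting point, with the caveat that the rules below are assumptions based on commonly observed behaviour rather than an official specification, so they would need checking against what GitHub actually accepts:

```javascript
// ASSUMED rules (not from official documentation):
// owner: 1-39 characters, letters, digits or single hyphens,
//        no leading or trailing hyphen
// repo:  1-100 characters of letters, digits, '.', '_' or '-',
//        excluding the names '.' and '..'
const OWNER_RE = /^[a-z\d](?:[a-z\d]|-(?=[a-z\d])){0,38}$/i;
const REPO_RE = /^[\w.-]{1,100}$/;

function isValidOwner(name) {
  return OWNER_RE.test(name);
}

function isValidRepo(name) {
  return REPO_RE.test(name) && name !== '.' && name !== '..';
}
```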

output type: error

Try 1/0 in a cell and then render the notebook here; it won't work.

We need to handle output type: error.

  {
   "cell_type": "code",
   "execution_count": 180,
   "metadata": {},
   "outputs": [
    {
     "ename": "ZeroDivisionError",
     "evalue": "division by zero",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mZeroDivisionError\u001b[0m                         Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-180-05c9758a9c21>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mZeroDivisionError\u001b[0m: division by zero"
     ]
    }
   ],
   "source": [
    "1/0"
   ]
  },
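As a first step, the traceback could be shown as plain text. The sketch below assumes the v4 shape shown above (ename, evalue, traceback) and simply drops the colour codes; both function names are hypothetical:

```javascript
// Removes ANSI colour sequences of the form ESC [ ... m.
function stripAnsi(text) {
  return text.replace(/\u001b\[[0-9;]*m/g, '');
}

// Turns an error output into plain text suitable for a <pre> element.
// Returns null for any other output type.
function errorOutputToText(output) {
  if (output.output_type !== 'error') return null;
  const traceback = Array.isArray(output.traceback) ? output.traceback : [];
  if (traceback.length === 0) return output.ename + ': ' + output.evalue;
  return traceback.map(stripAnsi).join('\n');
}
```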

Nicer rendering of tables

We could perhaps improve how tables are rendered, especially if they are DataFrames (alternating colours, white and gray, bold indices).

ANSI colours not supported

Originally reported here, but we also don't support this.

I created a reproducer (see branch ansi), but haven't looked into a fix yet.

This is what it should look like eventually:
Screenshot 2021-02-17 at 12 46 16
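One possible direction for a fix, sketched under the assumption that only the eight basic foreground colours (codes 30-37) and resets matter. The palette and function names are hypothetical; bold, backgrounds and extended 256-colour sequences are not parsed properly here:

```javascript
// Hypothetical palette for the basic foreground colours.
const ANSI_FG = {
  30: 'black', 31: 'red', 32: 'green', 33: 'olive',
  34: 'blue', 35: 'purple', 36: 'teal', 37: 'gray',
};

function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Converts ESC [ ... m sequences into <span style="color:..."> wrappers.
function ansiToHtml(text) {
  const re = /\u001b\[([0-9;]*)m/g;
  let html = '';
  let open = false; // is a <span> currently open?
  let last = 0;     // end of the previous escape sequence
  let match;
  while ((match = re.exec(text)) !== null) {
    html += escapeHtml(text.slice(last, match.index));
    last = re.lastIndex;
    const codes = match[1] === '' ? [0] : match[1].split(';').map(Number);
    for (const code of codes) {
      if (code === 0 || code === 39) {
        // reset: close the current span, if any
        if (open) { html += '</span>'; open = false; }
      } else if (ANSI_FG[code]) {
        if (open) html += '</span>';
        html += '<span style="color:' + ANSI_FG[code] + '">';
        open = true;
      }
      // every other code is ignored in this sketch
    }
  }
  html += escapeHtml(text.slice(last));
  if (open) html += '</span>';
  return html;
}
```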

Incorporate changes from Microsoft's fork

Microsoft forked this for their Azure offering. Go through their changelog and see if there are things we could merge in.

They kept the file structure mostly intact, so we could just diff and see it right there.

stderr handling

Instead of stdout, we sometimes get stderr, which should be handled appropriately.

{
   "cell_type": "code",
   "execution_count": 946,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [

Currently, we emit an error here (but continue on):

        // name in v4, stream in v3
        if ( (dt.name || dt.stream) != 'stdout' || dt.output_type != 'stream')
            console.error('unexpected stream spec');

We should

  1. Make sure we don't err here (instead of != 'stdout', we should have ~'not in (stdout, stderr)')
  2. Colour the output in red

To reproduce, trigger a non-fatal error in Python by trying something deprecated (e.g. writing to a view in pandas).

Example: warning.ipynb.txt
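A sketch of the relaxed check from point 1, keeping the existing v3/v4 field lookup. The streamName helper and the KNOWN_STREAMS list are hypothetical names; the returned value could then drive the red colouring from point 2:

```javascript
const KNOWN_STREAMS = ['stdout', 'stderr'];

// Returns 'stdout' or 'stderr' for a stream output, or null (after logging)
// when the output is not a recognised stream.
function streamName(dt) {
  const name = dt.name || dt.stream; // name in v4, stream in v3
  if (dt.output_type !== 'stream' || !KNOWN_STREAMS.includes(name)) {
    console.error('unexpected stream spec');
    return null;
  }
  return name;
}
```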

GH: UI

There are a number of elements in the header: instructions, examples, inputs, disclaimers, ... We should separate these into groups and make the interface clearer.

Download more samples and test them

I downloaded a sample from this piece of research by JetBrains (link); it will be useful for testing.

It's a public S3 bucket, so it was as easy as aws s3 sync s3://github-notebooks-update1/ data/ (need to Control-C it, there's a lot of data)

Originally posted by @kokes in #48 (comment)

Supply a minified version

(Updating to Sierra, I managed to destroy my local brew/node toolkits, so I can't set it up now.)

Duplicate rendering

When both text/plain and text/html are supplied, both are rendered. But it seems that the former is just a fallback format in case the latter can't be rendered.

Check the spec to see whether there are any other substitutes like this.

Chrome extension?

This is a great tool! I came here when my notebooks weren't rendering on github,
but this is so much quicker! I wonder if this could be turned into a chrome extension?
Is anyone doing this?

custom domain?

The tool is getting some traction, so it might deserve its own domain.

use flexbox/grid for cells and execution counts

I think the current layout is quite awkward, and my CSS knowledge at the time predated all the cool and usable stuff introduced in the past five years.

note: check browser usage in GA, maybe we have to support IE/Edge

Margins and paddings

Cells with text/html are not properly indented (vertically), and the Out [n] indicator is weirdly placed most of the time. The In [n] one is as well, but less so.

For the cells, make their box settings common across all to avoid this. For the execution count indicators... improve the absolute positioning.

Github rendering - add error reporting

There are many things that could go wrong, add UI-visible error reporting:

  • URL does not match what we need (see the regexp)
  • data is not fetched from the API for some reason
  • API limits are exhausted
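For the first item, the URL check could return a message for the UI instead of failing silently. The regexp below is an assumption about the expected URL shape, not the one currently used, and it does not handle branch names that contain slashes:

```javascript
// ASSUMED shape: https://github.com/<owner>/<repo>/blob/<ref>/<path>.ipynb
const NOTEBOOK_URL_RE =
  /^https?:\/\/github\.com\/([^\/]+)\/([^\/]+)\/blob\/([^\/]+)\/(.+\.ipynb)$/;

// Returns the URL parts on success, or a message to show to the user.
function parseNotebookUrl(url) {
  const m = NOTEBOOK_URL_RE.exec(url.trim());
  if (m === null) {
    return { ok: false, message: 'Expected a GitHub URL pointing to an .ipynb file' };
  }
  return { ok: true, owner: m[1], repo: m[2], ref: m[3], path: m[4] };
}
```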

Find out what languages to implement

This relates to Prism.js (our current highlighter). Shall we hotload language support from a CDN, or should we just bundle the most popular languages (Python, Julia, R, Haskell)?

The former makes the viewing tool leaner, but does not allow for offline use and involves a lag if language spec isn't cached. The latter allows for compact definitions and fewer HTTP calls since you can create bundles on the Prism website. But since these are custom, we'd need to vendor this. And there'd be no support for other languages, unless we hotloaded them.

And that might be a good compromise - to bundle the most popular languages and hotload any unsupported ones.
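The compromise could be sketched as a small decision function. The BUNDLED set is hypothetical, and the metadata lookup assumes that v4 notebooks carry the language under metadata.language_info.name or metadata.kernelspec.language, either of which may be missing:

```javascript
// Hypothetical set of languages shipped with the library.
const BUNDLED = new Set(['python', 'julia', 'r', 'haskell']);

// Reads the notebook language from its metadata, or returns null.
function notebookLanguage(nb) {
  const meta = nb.metadata || {};
  const name = (meta.language_info && meta.language_info.name) ||
               (meta.kernelspec && meta.kernelspec.language) || null;
  return name === null ? null : String(name).toLowerCase();
}

// Decides whether highlighting is available locally or has to be hotloaded.
function languagePlan(nb) {
  const lang = notebookLanguage(nb);
  if (lang === null) return { lang: null, action: 'none' };
  return { lang: lang, action: BUNDLED.has(lang) ? 'bundled' : 'hotload' };
}
```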

(A good way to survey language popularity in Jupyter would be to search BigQuery, which has all the code and reportedly hundreds of thousands of notebooks.)

LaTeX support

Two things to consider.

  1. Do we go with MathJax or KaTeX or something else entirely?
  2. How does it square off with Markdown parsing and rendering? Shall we parse all the maths first and feed it in with the markdown as pure HTML? Or do any markdown parsers allow for custom elements?

Our currently used library, marked.js, has an open pull request to support all this.
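One way to approach the second question, independent of the Markdown library, is to stash the maths before parsing and put it back afterwards. This is only a sketch: the placeholder format and function names are made up, plain dollar amounts such as "$5 and $6" would be mistaken for maths, and the restored text would still need to be handed to whichever renderer is chosen:

```javascript
// Replaces $$...$$ and $...$ spans with placeholders before Markdown parsing.
function protectMath(source) {
  const stash = [];
  const text = source.replace(/\$\$[\s\S]+?\$\$|\$[^$\n]+?\$/g, (m) => {
    stash.push(m);
    return '@@MATH' + (stash.length - 1) + '@@';
  });
  return { text: text, stash: stash };
}

// Puts the original maths back once the Markdown has been rendered to HTML.
function restoreMath(html, stash) {
  return html.replace(/@@MATH(\d+)@@/g, (_, i) => stash[Number(i)]);
}
```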

Problem when rendering ipynb

I found an error when rendering an .ipynb file. It looks like this:

error

It should be

it_should_be

In commit 8098772, I replaced
dm.setAttribute('src', 'data:' + fmt + ';base64,' + dt.data[fmt].join(''));
with
dm.setAttribute('src', 'data:' + fmt + ';base64,' + dt.data[fmt]);

It works fine.
When I print the dt.data[fmt], it shows a base64 url.
I think the origin one is right.
If there is something that I've missed, please notify me

Make it modular wrt the Markdown converter and code highlighter

Currently dependent on Marked and Prism.js, but ideally modular with respect to both. The Markdown converter should be trivial (assuming it doesn't break potential LaTeX support), the code highlighter might be dependent on some HTML (e.g. Prism needs a language name in the <code> block).

Visual testing

Since it's quite hard to build proper tests (see #33), we can start by having non-automated visual tests. What I envisage the workflow to be:

  1. Take screenshots of various notebooks in nbviewer.js
  2. Work on whatever changes you want
  3. Take the same screenshots again
  4. Use something like pixelmatch to diff these

What we need in terms of infrastructure:

  • sample notebooks, versioned
  • a way to inject a notebook into the viewer (maybe we'll have a separate HTML file for testing, without the dropzone stuff)
  • some code to automate the screenshot taking and diffing
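To illustrate the diffing step, here is a naive stand-in for a library like pixelmatch (it is not pixelmatch's algorithm or API). It assumes both screenshots have already been decoded into flat RGBA arrays of the same size:

```javascript
// Counts pixels where any RGBA channel differs by more than `tolerance`.
// `a` and `b` are flat arrays with 4 values (R, G, B, A) per pixel.
function countDifferentPixels(a, b, tolerance) {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error('images must have the same dimensions');
  }
  let different = 0;
  for (let i = 0; i < a.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) {
        different++;
        break; // count each pixel at most once
      }
    }
  }
  return different;
}
```

A test run could then fail when the count exceeds some agreed threshold for a given sample notebook.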

Adhere to specs

There is a clear explanation of the spec on the nbformat page. There's also a JSON spec for both v3 and v4. There are a few things we're not supporting: collapsed, scrolled, errors, some mime types (especially in v3).

Steps now

  1. Skim the specs and find out what is not supported. List it here to check off.
  2. Split out to separate issues if quite big - eg JavaScript support is probably more complicated than eval, or is it?
  3. Make sure we err when not supporting something - console.error at least, possibly in the DOM as well.
  4. Track the changelog to find out what's changed.

More general wrapping

There is basic wrapping implemented in 1aa9045, but I'd like it to be extended to all pre elements - well, all but code inputs, which should overflow: scroll.

Also, make sure that the current CSS implementation is the most compatible one; there seems to be some contention about that.

Render on double click

There have been multiple requests to support rendering the file upon a double click.

An initial implementation is in /cmd/. It's a Go file that embeds all the necessary JS/HTML/CSS, saves it to a temporary folder alongside the clicked file, and then opens the browser.

Rendering on mobile

One of the reasons for this is that GitHub's Jupyter rendering does not work on mobile Safari for some reason. There isn't anything preventing us from supporting it; this is more about explicit CSS directives to make it usable there.

Choose a licence

Preferably

  1. No restrictions on use or packaging (ie the licences of the work that uses this code) in private projects. I'd like to avoid any licensing conflicts.
  2. Some basic attribution
  3. It would be nice if people had to offer their forks to be merged here, to foster contribution and prevent diverging implementations, but I don't want it to be in conflict with point 1

Screencast missing

I had a quick demo in README.md, inlined as a gif. Not sure where it went, but should be reinstated at some point.

Support exactly one cell output

As discovered in #28, we sometimes render multiple cell outputs, but we should always render just one. There is an ordering defined in nbconvert, so we should check each when parsing the data field.

Before implementing this

  1. let's check a few notebooks (from tests, see #33) and how they render before and after.
  2. see if we can implement some of the mime types that are not yet supported (e.g. application/pdf?)
  3. grep as many notebooks as possible and check their mime types - maybe there are some extra mime types out there
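The selection itself could be as small as the sketch below. The priority list is a placeholder and would need to be checked against the ordering nbconvert defines; the function name is hypothetical too:

```javascript
// PLACEHOLDER ordering, richest first; verify against nbconvert before use.
const MIME_PRIORITY = [
  'text/html', 'image/svg+xml', 'image/png', 'image/jpeg',
  'text/markdown', 'text/latex', 'text/plain',
];

// Returns the single mime type to render for an output's `data` field,
// or null when none of the known types is present.
function pickMimeType(data) {
  for (const mime of MIME_PRIORITY) {
    if (Object.prototype.hasOwnProperty.call(data, mime)) return mime;
  }
  return null;
}
```

With this in place, an output carrying both text/plain and text/html would render only the HTML, and text/plain would act as the fallback.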

Render TOC

Option in settings. Render TOC based on h1-h6. Do not presume that a given Markdown library creates relevant header IDs.
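Since the header IDs cannot be relied on, the TOC code could generate its own. A minimal sketch, assuming the headings have already been collected in document order (in the browser, for example, from the rendered h1-h6 elements); the names and the slug rules are illustrative, and headings without any ASCII letters or digits all fall back to 'section':

```javascript
// Builds a URL-friendly slug from heading text.
function slugify(text) {
  return text.toLowerCase().trim()
    .replace(/[^a-z0-9\s-]/g, '')
    .replace(/\s+/g, '-') || 'section';
}

// `headings` is a list of { level, text } objects in document order.
// Returns the same entries with a unique `id` added to each.
function buildToc(headings) {
  const used = new Set();
  return headings.map((h) => {
    const base = slugify(h.text);
    let id = base;
    let n = 1;
    while (used.has(id)) {
      id = base + '-' + n;
      n++;
    }
    used.add(id);
    return { level: h.level, text: h.text, id: id };
  });
}
```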

Build regression tests and a harness

I'm not sure what the best course of action is here: whether we should just care about DOM generation or go for full "screenshot testing" using Selenium or Puppeteer.

The main reason is to test things like overflows in pres and imgs, multiple outputs, legacy versions etc. Things that tend to break with changes to the library code.

GH: base64 encoding

Data we get from the GitHub API is base64 encoded. Sadly, there are issues with base64 in JavaScript, notably with non-Latin characters. There are ways of resolving this, but these involve several dependencies that we would like to avoid.

Another solution would be to submit a second AJAX request to the "true" URL of the target file and download it directly.
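A dependency-free option may also exist where the runtime provides both atob and TextDecoder, which is an assumption about the browsers we need to support. The sketch below decodes the bytes first and interprets them as UTF-8 afterwards; the function name is hypothetical:

```javascript
// Decodes base64 into a UTF-8 string without external libraries.
function decodeBase64Utf8(b64) {
  // Strip whitespace first, since encoded content may contain line breaks.
  const binary = atob(b64.replace(/\s+/g, ''));
  // atob yields one character per byte; recover the raw bytes.
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}
```

For example, decodeBase64Utf8('w6k=') returns 'é', whereas atob('w6k=') alone yields two separate characters.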

Handle all string/array enums correctly

There are quite a few fields that may be plain strings or arrays of strings - we handle them in various ways - either via typeofs or Array.isArray. We should have a helper that consumes a variable of type string|array[string] and outputs a string. That should solve pretty much all of these cases.

Last time this surfaced was in #44.
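Such a helper might look like the sketch below (the name joinText is hypothetical). Treating a missing value as an empty string is a design choice made here, not something the spec prescribes:

```javascript
// Accepts a string or an array of strings and always returns a string.
function joinText(value) {
  if (Array.isArray(value)) return value.join('');
  if (typeof value === 'string') return value;
  if (value === undefined || value === null) return '';
  throw new TypeError('expected a string or an array of strings');
}
```

Call sites that currently do dt.data[fmt].join('') could then use joinText(dt.data[fmt]) and work for both representations.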
