Comments (7)

NickCrews commented on June 26, 2024

OK I just did a quick prototype. Per some googling, I found this article, which recommends brotli for JSON. Brotli is well-defined, mature, and available in most languages.

Here is some Python that compresses the JSON with brotli and then base64-encodes it so that it can be embedded in a URI:

import brotli
import base64
import altair as alt
from vega_datasets import data

source = data.cars()
chart = alt.Chart(source).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    size='Acceleration'
)
# Named spec_json so the name of the stdlib json module is not shadowed.
spec_json = chart.to_json()
json_bytes = spec_json.encode("utf-8")
# Compress, then base64-encode (standard alphabet, so the output can contain '+', '/' and '=').
compressed = brotli.compress(json_bytes)
encoded = base64.b64encode(compressed)
# Round trip to confirm that the encoding is lossless.
decoded = base64.b64decode(encoded)
decompressed = brotli.decompress(decoded)
json_restored = decompressed.decode("utf-8")
assert json_restored == spec_json
print(encoded)
# b'W8zWIYqypFoCrAu4w8rgT3h3YQLVJXA+/lZfywhGSDK7uVQegr....
print(len(encoded))
# 9156
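
One caveat on the base64 step: the standard alphabet includes '+' and '/', and pads with '=', all of which usually need percent-encoding inside a URL. A minimal sketch of a URL-safe variant follows, using only the standard library. The helper names `to_url_token` and `from_url_token` are made up for illustration and are not part of Altair or the editor.

```python
import base64


def to_url_token(raw: bytes) -> str:
    """Encode bytes with the URL-safe base64 alphabet and drop '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def from_url_token(token: str) -> bytes:
    """Restore the padding that to_url_token() removed, then decode."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)


# These two bytes map to '+' and '/' in the standard alphabet.
sample = b"\xfb\xff"
standard = base64.b64encode(sample).decode("ascii")
token = to_url_token(sample)
print(standard, token)
```

The token has the same length as the standard encoding minus the padding, so the size comparison is essentially unchanged.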

This is that chart in the web viewer

If I share that URL, it is 19512 characters long, about twice the 9156 characters of the brotli encoding (if I am understanding this right)!

So not only would we get a more portable API, but we would also get a performance boost! With the current lz-string implementation, if I have very many datapoints (> ~500??) then the URL gets too long, and many apps like Slack and Google Docs stop working with it, defeating the point.

from editor.

domoritz commented on June 26, 2024

Vega-Embed has an "open in the editor" action that uses an API in the editor to load the spec. Maybe that works as well (although you can only trigger it from JS: https://github.com/vega/vega-embed/blob/880d55f7cf57e27716c510ad73715db877cd718c/src/post.ts#L29).

I am open to using brotli, but we do need to stay backwards compatible, so I suppose we need to use a different URL. Would you like to send a pull request?

NickCrews commented on June 26, 2024

I'm a little intimidated by the TypeScript, as I have never worked with it before. I can try to take a stab at it, but I might give up.

First, we should figure out a rough shape of the API. To make it backward-compatible, we could move to parameters, something like:

https://vega.github.io/editor/#/url/vega-lite?
brotli={encoded}&
view={"edit"/"fullscreen"/something else}

but IDK, I am no expert in web APIs so I'm not sure what the conventions are here. I want to figure this out before I actually implement anything.

Like should it be this?

https://vega.github.io/editor/#/url?
format={"vega-lite"/"vega"/something else}&
encoding={"brotli"/"lzstring"/something else}&
data={encoded}&
view={"edit"/"fullscreen"/something else}
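
To make the second shape concrete, here is a rough Python sketch of how a client might build and read back such a URL. The parameter names (`format`, `encoding`, `data`, `view`) and the `#/url?` route are taken from the proposal above and are not something the editor supports today; `build_editor_url` and `parse_editor_url` are hypothetical helper names.

```python
from urllib.parse import parse_qs, urlencode

# Route from the proposal above; an assumption, not an existing editor route.
EDITOR_BASE = "https://vega.github.io/editor/#/url"


def build_editor_url(data, fmt="vega-lite", encoding="brotli", view="edit"):
    """Percent-encode each parameter and append them after the proposed route."""
    query = urlencode(
        {"format": fmt, "encoding": encoding, "data": data, "view": view}
    )
    return f"{EDITOR_BASE}?{query}"


def parse_editor_url(url):
    """Split off everything after the first '?' and decode the parameters."""
    _, _, query = url.partition("?")
    return {key: values[0] for key, values in parse_qs(query).items()}


url = build_editor_url("W8zW+/a=")
print(url)
print(parse_editor_url(url))
```

One upside of named parameters is that `urlencode` takes care of escaping characters such as '+', '/' and '=', so the payload encoding and the URL escaping stay separate concerns.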

domoritz commented on June 26, 2024

These are the current URLs.

https://vega.github.io/editor/#/examples/vega-lite/rect_heatmap
https://vega.github.io/editor/#/custom/vega
https://vega.github.io/editor/#/custom/vega-lite
https://vega.github.io/editor/#/gist/455e1c7872c4b38a58b90df0c3d7b1b9/bar.vl.json
https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JACyUAVhDYA7EABoQAEzjQATjRyZ289AEEABBBoIcguIaZJ1h2DcyGA7nRiHETOMtXLDypJhUiiAuyvRoAMwAbAAMSv6BaKDESIIMamgA2qAoBsFMaABMABwAvgo5aCAAQvloAKz15ZXoAMJ1qGIRzSC5IAAiHQCcAIw9fQCiHcVjFb1VAGId9d1zfQDiHSND41UAEtMA7LvoAJLLhaUAuuUgyOoA1lXW6sFwslBsyjSyZEkgAA9-gAzGhwQTKKooJSYACeODgVTY6m+slSIFusJBYIhz2CcIRVQAjgwkLIdIEdKQMTC2GxBDocNjwZD0AUYfDEegSWSKQEaNTSkKgA
https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289AEEABBBoIcguIaZJ1h2DcyGA7nRiHETOMtXLDypJhUiioBKKigxEiCDGpoANqgYSD6wUxoAEwAHAC+ColoIABCqWhiYrn56ADCJagALADMFSBJACK1AJwAjM1JAKK1mT15LQUAYrViTSNJAOK1XR29BQASgwDsy+gAkpPp2QC6uSDI6gDWBdbqwXCyUGzKNLJkaKAAHq8gAGY0cILKBRQSkwAE8cHACrI2AgnlFgkg3jQIJ9BEhPIJ9M8LGgAAzZY4gz4-P4A9BpYFgiHoACODCQsh0gR0pBA+OyQA/view

Maybe the easiest would be to put a brotli- prefix before the encoded string. We could check for that prefix and then call the right decoder. It's not totally failsafe, since an lz-string encoded string might happen to start with the same characters, but that seems super unlikely.

https://vega.github.io/editor/#/url/vega-lite/brotli-XXXX

Alternatively, we could use https://vega.github.io/editor/#/url/vega-lite/brotli/XXXX or https://vega.github.io/editor/#/url-brotli/vega-lite/XXXX.
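
As a rough illustration of the prefix check (in Python rather than the editor's TypeScript), the sketch below only decides which decoder to use and strips the prefix; the actual brotli and lz-string decoding is left out. The function name `detect_encoding` and the labels it returns are made up for this example.

```python
# Prefix from the suggestion above; the editor does not recognize it today.
BROTLI_PREFIX = "brotli-"


def detect_encoding(payload: str) -> tuple[str, str]:
    """Return (encoding name, payload without prefix) for a URL segment."""
    if payload.startswith(BROTLI_PREFIX):
        return "brotli", payload[len(BROTLI_PREFIX):]
    # Anything without the prefix is treated as a legacy lz-string payload,
    # which keeps existing shared URLs working.
    return "lzstring", payload


print(detect_encoding("brotli-XXXX"))
print(detect_encoding("N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgeg"))
```

The existing lz-string URLs above do contain '-' characters, so a legacy payload could in principle begin with the same seven characters; the separate-route alternatives avoid that ambiguity at the cost of a new route.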

domoritz commented on June 26, 2024

Btw, I looked at lz-string, and the replacement for URI encoding is pretty simple: https://github.com/pieroxy/lz-string/blob/35cdd797ae7415211add846e529669643e893904/src/main.ts#L136C16-L136C29. Maybe you can dig a bit more to see whether you can replicate it in Python. Brotli will add some overhead in terms of bundle size, which I want to be careful with.

NickCrews commented on June 26, 2024

I suppose if lz-string is always going to be supported by the editor, then there is no harm in adding it to Altair. It should be a single Python file. Then moving to brotli could be a later discussion.

What do you think about the better compression of brotli? Have you heard complaints about long URLs from any other users?

domoritz commented on June 26, 2024

I thought Brotli was optimized for fast compression, not small output.

I've not heard about many issues with long URLs. If the spec is large, I always recommend a gist or the API I linked to above.
