Comments (5)

manzt commented on June 3, 2024

I like the idea. However, this still assumes that we only want subsets of the data. What if, for example, I was generating batches of points and wanted to add them to an existing plot as the computation runs? In addition, what are the selection semantics when we have filtered the data? Does selection=[0] correspond to the original data or the new filtered subset? We then need to keep track of indices. What if you filter again? Is that a filter on the existing filter or on the original data?

I think a more general API would be to allow the dataframe to be "reactive" like encodings, keeping all the other state and updating the plot on reassignment:

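import time
from jscatter import Scatter

# df is an existing pandas DataFrame with 'x' and 'y' columns
scatter = Scatter(data=df, x='x', y='y')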

display(scatter.show())

for new_data in random_data_generator():
    time.sleep(1)
    scatter.data(new_data) # clears selection (if there was one), keeps all existing encodings

This way you could just filter your dataframe by whatever means make sense. For example, two scatter plots where the second just displays the selection of the first:

import ipywidgets

s1 = Scatter(data=df, x='x', y='y')
s2 = Scatter(data=df, x='x', y='y')

def on_selection_change(change):
    subset = df.iloc[change.new]
    s2.data(subset)

s1.widget.observe(on_selection_change, names='selection')

ipywidgets.HBox([s1.show(), s2.show()])

from jupyter-scatter.

flekschas commented on June 3, 2024

I think we're talking about two different ideas: filtering and updating the data. The difference is that filtering is extremely cheap because it only ever results in rendering a subset of the data that was already uploaded to the GPU. Changing the data is more expensive. Since there is no guarantee that any of the existing data can be reused, one has to flush the existing data and upload the new data to the GPU on every data change.
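To make the cost difference concrete, here is a minimal sketch in plain Python. The class ToyGpuScatter and its counters are hypothetical stand-ins invented for illustration; they are not the jupyter-scatter or regl-scatterplot implementation, and the "upload" is only simulated by counting points.

```python
class ToyGpuScatter:
    """Hypothetical model of a GPU-backed scatter plot (illustration only)."""

    def __init__(self, points):
        self.uploads = 0          # number of full data uploads so far
        self.points_uploaded = 0  # total number of points sent to the "GPU"
        self.data(points)

    def data(self, points):
        # Replacing the data: nothing is assumed reusable, so flush and re-upload.
        self._points = list(points)
        self.uploads += 1
        self.points_uploaded += len(self._points)
        self._visible = list(range(len(self._points)))

    def filter(self, indices):
        # Filtering: only the list of visible indices changes; no upload happens.
        self._visible = list(indices)

    def rendered(self):
        return [self._points[i] for i in self._visible]


points = [(i, i * i) for i in range(1000)]
plot = ToyGpuScatter(points)

# Option 1: show the first three points by filtering.
plot.filter([0, 1, 2])
filtered_view = plot.rendered()
uploads_after_filter = plot.uploads

# Option 2: show the same three points by replacing the data.
plot.data(points[:3])
replaced_view = plot.rendered()
uploads_after_data = plot.uploads
```

Both options draw the same three points, but in this toy model only the second one triggers another upload.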

I think there are valid use cases for both. However, in your example, changing the data is unnecessarily expensive as you could simply render out a subset.

Implementing scatter.data() should be fairly simple. What's more tricky is scatter.filter() as it'll require changes to regl-scatterplot.

> what are selection semantics when we have filtered the data? Does selection=[0] correspond to the original data or the new filtered subset? We then need to keep track of indices. What if you filter again? is that a filter on the existing filter or on the original data?

The semantics are fairly simple: your filter always operates on the bound data and only ever affects the rendering (not the underlying data itself). E.g., scatter.filter([0, 1, 2]) will cause the plot to render only the first three points. If you then call scatter.filter([3, 4, 5]), the plot would render the fourth to sixth points (of your bound dataframe). The selection semantics remain the same. The point with index 10 is always going to reference the same point because the data itself isn't filtered.
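These semantics can be sketched without any plotting library. The render helper and the bound list below are hypothetical names used only for illustration; the sketch assumes, as described above, that every filter call is resolved against the bound data rather than against a previous filter.

```python
# Hypothetical bound data: six points labelled by letter, indexed 0..5.
bound = ["a", "b", "c", "d", "e", "f"]


def render(bound_data, filter_indices=None):
    """Return the (index, point) pairs that would be drawn."""
    if filter_indices is None:
        return list(enumerate(bound_data))
    # Indices always refer to the bound data, never to an earlier filter result.
    return [(i, bound_data[i]) for i in filter_indices]


first = render(bound, [0, 1, 2])   # first three points
second = render(bound, [3, 4, 5])  # fourth to sixth points, not relative to `first`

# A selection index keeps pointing at the same point regardless of the filter.
selected_index = 4
selected_point = bound[selected_index]
```

Because the second filter is not applied on top of the first, there is no chain of index mappings to track.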

manzt commented on June 3, 2024

Ok, that makes a lot of sense to me (and I see the motivation for filter!) I think both are valuable and handle two separate ideas, as you noted. There are certainly performance benefits to filtering.

flekschas commented on June 3, 2024

Here's a demo of visually filtering out points. It's super snappy as the only thing that changes is the buffer that indexes into the texture object :)
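As a rough illustration of why only the index buffer needs to change, the sketch below stores points in a 2D "texture" (a list of rows) and draws by looking up flat indices. The function names, the row-major layout, and the texture width are assumptions made for this example, not the actual regl-scatterplot internals.

```python
def index_to_texel(index, width):
    """Map a flat point index to (column, row) in a texture of the given width."""
    return index % width, index // width


def build_texture(points, width):
    """Store points row by row; done once when the data is bound."""
    rows = -(-len(points) // width)  # ceiling division
    texture = [[None] * width for _ in range(rows)]
    for i, point in enumerate(points):
        col, row = index_to_texel(i, width)
        texture[row][col] = point
    return texture


def draw(texture, index_buffer, width):
    """Look up each index in the texture; the texture itself is never modified."""
    drawn = []
    for i in index_buffer:
        col, row = index_to_texel(i, width)
        drawn.append(texture[row][col])
    return drawn


points = [(float(i), float(i * i)) for i in range(10)]
texture = build_texture(points, width=4)

all_points = draw(texture, range(10), width=4)  # unfiltered
filtered = draw(texture, [2, 5, 9], width=4)    # filtered: only the index buffer differs
```

Switching between the two draws swaps a short list of integers while the stored point data stays untouched.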

[Video attachment: Screen.Recording.2023-03-02.at.8.45.20.PM.720p-00.00.04.236-00.00.33.420.mp4]

I imagine the data() function would be fairly simple. I think all we need is a single argument:

scatter.data(df)

flekschas commented on June 3, 2024

Closing this as the two functions have been added in #63 and released in v0.11.0.
