livebook-dev / kino_explorer

Explorer (series and dataframes) integration for Livebook

License: Apache License 2.0

Elixir 71.69% CSS 9.85% JavaScript 18.45%

data-visualization elixir explorer smart-cells

kino_explorer's People

cristineguadelupe jannikbecher capitalist42 hanspagh hez sasikumar87 mfeckie clayscode lkarthee roger120981 h12 munksgaard

kino_explorer's Issues

Show if a column is in a group

@josevalim How would you like it to be shown? An indication in the header?

Add filter support for conjunctions (and), disjunctions (or), and negations (not)

Hopefully this is welcome! I am really impressed with what you are doing here. I started on a little prototype for a filtering system myself, but I am nowhere near as far along as this.

I want to suggest a feature and a data structure for adding some really cool functionality to filters.

Problem

The current approach for filters uses conjunctions by default, but it is often useful to combine conjunctions and disjunctions together.

Solution

Add support for conjunctions (and) and disjunctions (or).

This can be supported by recursively nesting query filters via "value":

Example conjunction with a disjunction:

filters: [
  %{
    "filter" => "and",
    "value" => [
      %{
        "column" => "year",
        "filter" => "equal",
        "value" => 2010
      },
      %{
        "filter" => "or",
        "value" => [
          %{
            "column" => "country",
            "filter" => "equal",
            "value" => "Angola"
          },
          %{
            "column" => "country",
            "filter" => "equal",
            "value" => "Algeria"
          }
        ]
      }
    ]
  }
]

With this you can combine and nest conjunctions and disjunctions as needed. It also works well for building a UI with nested objects.
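As a minimal sketch of how such a nested structure could be evaluated, the module below walks the filter maps recursively against plain Elixir rows. The `FilterSketch` module and its function names are hypothetical, rows are assumed to be maps keyed by column name, and only the "equal" leaf operator from the example is handled; this is not how kino_explorer implements filters.

```elixir
defmodule FilterSketch do
  # Keeps the rows (maps of column name => value) that satisfy every top-level filter.
  def apply_filters(rows, filters) do
    Enum.filter(rows, fn row -> Enum.all?(filters, &matches?(&1, row)) end)
  end

  # "and": every nested filter must match.
  def matches?(%{"filter" => "and", "value" => nested}, row),
    do: Enum.all?(nested, &matches?(&1, row))

  # "or": at least one nested filter must match.
  def matches?(%{"filter" => "or", "value" => nested}, row),
    do: Enum.any?(nested, &matches?(&1, row))

  # Leaf filter: compares one column against a value.
  def matches?(%{"filter" => "equal", "column" => column, "value" => value}, row),
    do: Map.get(row, column) == value
end
```

Calling `FilterSketch.apply_filters(rows, filters)` with the filters shown above would keep only the 2010 rows for Angola or Algeria. Because "and" and "or" share the same shape, the nesting depth is not limited.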

Negations
I also wanted to suggest supporting negations directly for any given filter operator, e.g.

%{
  "filter" => "not equal",
}

becomes

%{
  "filter" => "equal",
  "negate" => true,
}

Negations then become a really simple toggle for any given filter. In your code you can then define each operation in a single direction, and the inverse is just a negation.
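A rough sketch of the toggle idea, assuming rows are plain maps: each operator is written once, and "negate" simply flips its result. The `NegateSketch` module and the "less" operator are hypothetical additions for illustration; only "equal" and "negate" come from the suggestion above.

```elixir
defmodule NegateSketch do
  # Applies one leaf filter to a row; "negate" defaults to false when absent.
  def matches?(%{"filter" => op, "column" => column, "value" => value} = filter, row) do
    result = compare(op, Map.get(row, column), value)
    if Map.get(filter, "negate", false), do: not result, else: result
  end

  # Base operators are defined in one direction only.
  defp compare("equal", left, right), do: left == right
  defp compare("less", left, right), do: left < right
end
```

With this shape, "not equal" needs no clause of its own: it is "equal" with `"negate" => true`, and "greater than or equal" is "less" with the same flag.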

P.S.
You can see an example of how this might work in this repo. I'm a novice at Elixir, but I think this is illustrative of the idea.

Make columns draggable

A nice-to-have feature would be the ability to rearrange the order of columns, either via the table itself or by rearranging the tags in the "Select" action. It's possible to rearrange the columns using "Select" by just adding the tags in the order you'd like, but if you forget a tag or would like to change the order, you have to delete every tag up to the place where you wanted to change the order.

For instance:

Using the select action makes it possible to set the order of columns. However, if I'd like to change the order of Embarked and PassengerID, I'd have to delete all the tags and then re-add them in the order I'd like. This is especially cumbersome when you have lots of columns.

In my mind, it seems simpler to make the multiselect field draggable using the draggable component (trying to figure out how, but haven't been successful so far), but if it would be easier to add the draggable component to the columns themselves, that would work as well.
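Whatever the drag UI ends up being, the state change behind a drop is small. As a sketch, assuming the selected columns are kept as an ordered list and the UI reports a source and target index, a move could look like this (`ColumnOrderSketch` is a hypothetical name, and "Survived" and "Name" are illustrative columns):

```elixir
defmodule ColumnOrderSketch do
  # Moves the column at index `from` to index `to`, keeping the others in order.
  # Assumes `from` is a valid index into `columns`.
  def move(columns, from, to) do
    {column, rest} = List.pop_at(columns, from)
    List.insert_at(rest, to, column)
  end
end

ColumnOrderSketch.move(["Survived", "Embarked", "Name", "PassengerID"], 3, 1)
# => ["Survived", "PassengerID", "Embarked", "Name"]
```

No tag has to be deleted and re-added; only the position of the dragged column changes.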

Emit lazy queries by default

We should call to_lazy() on the dataframe and then collect() at the end. Although I think we need to collect before pivot wider. We should have a toggle at the top to choose if we want it lazy or not, it should be on by default.

Add discard operation

Some datasets contain unnecessary data that is not useful in further processing. Why not add an option to remove those columns from the dataset?

Free order operations (except pivot)

Guide

Sort as unordered operation

#32

Export does not work for lazy data frames

We need to call collect() on them. Since collect() is a no-op for regular dataframes, we can always just call it. :)

Solve issue with drag and drop blinking

Allow filtering by mean, median, and quantile

We should allow filtering by the mean, median, and quantile.

For example, if I want to filter a dataset for values that are above its mean, we could do:

Filter by      operator       value
septal_length  greater than   mean

UI-wise, this has a couple of issues:

value is a number, so how could we input the mean? Perhaps a select?
for quantiles, we would need a fourth input specifying the quantile, but that's likely too complex; we could instead provide predefined quantiles for 10%, 20%, ..., 90%
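One way to think about the "value" input is that it holds either a number or the name of a statistic that gets resolved from the column before comparing. The sketch below shows that idea on plain lists of maps; `StatFilterSketch` and its functions are hypothetical, only "mean" and "median" are covered, and the column is assumed to be non-empty with no missing values.

```elixir
defmodule StatFilterSketch do
  # Resolves a filter value: a plain number is used as-is, while "mean" and
  # "median" are computed from the column's own values.
  def resolve(value, _values) when is_number(value), do: value
  def resolve("mean", values), do: Enum.sum(values) / length(values)

  def resolve("median", values) do
    sorted = Enum.sort(values)
    count = length(sorted)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Keeps rows whose `column` is strictly greater than the resolved value.
  def greater_than(rows, column, value) do
    threshold = resolve(value, Enum.map(rows, & &1[column]))
    Enum.filter(rows, &(&1[column] > threshold))
  end
end
```

For instance, `StatFilterSketch.greater_than(rows, "septal_length", "mean")` mirrors the filter row above, and predefined quantiles could be added as further `resolve/2` clauses.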

Preparing for the next `Explorer` release

The next Explorer release will bring several changes that directly affect KinoExplorer.
This issue is to track all these changes and make sure KinoExplorer is in sync and ready for the next release.

Update DataFrame.arrange as DataFrame.sort elixir-explorer/explorer#777
Add lists to the operations that support it
Ensure that all new types and lists are properly supported

Remove the restriction on pivot_wider dtypes

elixir-explorer/explorer#733

Update types - Explorer 0.7.2

Random redirect to empty page from KinoExplorer Data Transform Smart Cell

The "assign to" input box redirects me to a junk link if I hit enter.
Specifically I get sent to:
http://localhost:63081/iframe/v5.html?data_frame=processes&assign_to=

I am on Chromium on an M1 Mac.

Minimum reproducible example:

Create a cell with

keys = [:registered_name, :initial_call, :reductions, :stack_size]

processes =
  for pid <- Process.list(),
      info = Process.info(pid, keys),
      do: info

Kino.DataTable.new(processes)

Add a data transform smart cell
Focus the "Assign To" box, and hit enter

Fan note:

I gotta say this is so freaking fantastic!!!!!

I actually started my own little project that tried to do some macro cells/smart cells before realizing how far livebooks have come. I definitely want to start contributing, I am wildly impressed.

Implement fill missing

Filters

Export the inspected representation

From Kino v0.11, Kino.JS.Live can provide custom exports and we should export the inspected DataFrame representation.

Show Series as a single column table and title Series

Update `data_options` on delete

Improve tests

Our tests are more complex and verbose than they should be, and the @base_operations approach is no longer needed.

Multiple value columns in pivot wider

elixir-explorer/explorer#538

Cumulative filtering

Convert table filtering into a smart cell operation

Today we can perform filters on the cell. We will have a button that converts those operations into a smart cell. The smart cell will be developed in #3.

Top statistics not clear when majority is null

First of all, I'm very excited to see this coming to Livebook.

It took me a bit of time to realize why the top statistics were empty for the last column in the screenshot; maybe it is just me.
I also think we could have min/max statistics for datetime and date instead of top.
I would be happy to help fix this, if needed.

Smart Cell

We should support:

Filters
Sorting
Pivot
Group by

Allow any data structure that implements Table.Reader as argument

And then, instead of the first line of the pipeline simply being "var", it will be Explorer.DataFrame.new(var), which is also the code we will emit.

We should also change the warning message shown when no data structure is found to something like: "The Data Transform smart cell works with Explorer DataFrames and table-like data structures, but none was found."

Should we also change the name of the first label? Currently it says "DATA FRAMES" but I can't think of anything better. Perhaps just "DATA"?

Support Infinities in DataFrames/Series

Currently, when a series or data frame contains infinities there is an error.

For the input

DF.new(a: Nx.tensor([:infinity], type: {:f, 64}))

and

S.from_tensor(Nx.tensor([:infinity], type: {:f, 64}))

15:35:11.764 [error] ** (MatchError) no match of right hand side value: {:error, {:badarg, [{Explorer.PolarsBackend.Native, :s_mean, [#Explorer.PolarsBackend.Series<
shape: (1,)
Series: 'series' [f64]
[
inf
]
>], []}, {Explorer.PolarsBackend.Shared, :apply_series, 3, [file: 'lib/explorer/polars_backend/shared.ex', line: 23]}, {Kino.Explorer, :"-summaries/1-fun-0-", 2, [file: 'lib/kino/explorer.ex', line: 94]}, {:maps, :fold_1, 3, [file: 'maps.erl', line: 411]}, {Kino.Explorer, :summaries, 1, [file: 'lib/kino/explorer.ex', line: 89]}, {Kino.Explorer, :init, 1, [file: 'lib/kino/explorer.ex', line: 42]}, {Kino.Table, :init, 2, [file: 'lib/kino/table.ex', line: 63]}, {Kino.JS.Live.Server, :call_init, 3, [file: 'lib/kino/js/live/server.ex', line: 76]}]}}
(kino 0.9.0) lib/kino/js/live.ex:326: Kino.JS.Live.new/2
(kino_explorer 0.1.2) lib/kino_explorer.ex:9: Kino.Render.Explorer.Series.to_livebook/1
lib/livebook/runtime/evaluator/default_formatter.ex:41: Livebook.Runtime.Evaluator.DefaultFormatter.to_output/1
lib/livebook/runtime/evaluator.ex:464: Livebook.Runtime.Evaluator.continue_do_evaluate_code/5
lib/livebook/runtime/evaluator.ex:326: Livebook.Runtime.Evaluator.loop/1
(stdlib 4.2) proc_lib.erl:240: :proc_lib.init_p_do_apply/3

Reordering (except pivot_wider)

Support lazy data frames

Crash when previous cell has unsupported data

I'm unable to use the smart cell because my notebook has MongoDB.ObjectId values.

    {:kino_explorer, "~> 0.1.9"},
    {:explorer, "~> 0.7.0"}

19:11:33.878 [error] GenServer #PID<0.2447.0> terminating
** (ArgumentError) cannot create series "_id": unsupported datatype: #BSON.ObjectId<633dde08069d480008ba9b9f>
    (explorer 0.7.0) lib/explorer/polars_backend/data_frame.ex:572: Explorer.PolarsBackend.DataFrame.series_from_list!/3
    (explorer 0.7.0) lib/explorer/polars_backend/data_frame.ex:522: anonymous fn/3 in Explorer.PolarsBackend.DataFrame.from_tabular/2
    (elixir 1.15.2) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
    (explorer 0.7.0) lib/explorer/polars_backend/data_frame.ex:516: Explorer.PolarsBackend.DataFrame.from_tabular/2
    (kino_explorer 0.1.10) lib/kino_explorer/data_transform_cell.ex:846: KinoExplorer.DataTransformCell.update_data_options/3
    (kino_explorer 0.1.10) lib/kino_explorer/data_transform_cell.ex:337: KinoExplorer.DataTransformCell.updates_for_data_frame/2
    (kino_explorer 0.1.10) lib/kino_explorer/data_transform_cell.ex:189: KinoExplorer.DataTransformCell.handle_info/2
    (kino 0.10.0) lib/kino/js/live/server.ex:134: Kino.JS.Live.Server.call_handle_info_fallback/3
Last message: {:scan_binding_result, [videos: [%{"_id" => #BSON.ObjectId<633de7ac069d480008ba9bb3>, ... (truncated)

The Smart cell crashed unexpectedly, this is most likely a bug.
Restart Smart cell

The previous cell also has a valid Explorer.DataFrame towards the end, but, since the Smart Cell crashed, I cannot use it.
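One possible way to avoid this kind of crash is to treat a failed conversion as "not a usable data source" rather than letting the error propagate. The sketch below assumes a hypothetical `convert` function standing in for whatever builds the data frame; `BindingScanSketch` is not part of kino_explorer, and rescuing only `ArgumentError` is an assumption based on the log above.

```elixir
defmodule BindingScanSketch do
  # Returns only the bindings whose values can be converted. `convert` is a
  # hypothetical function that raises ArgumentError for unsupported data.
  def convertible(binding, convert) do
    Enum.flat_map(binding, fn {name, value} ->
      try do
        [{name, convert.(value)}]
      rescue
        # Skip this variable instead of crashing the whole cell.
        ArgumentError -> []
      end
    end)
  end
end
```

With this shape, a variable such as `videos` would be dropped from the list of options, while a valid data frame defined later in the same cell would still be offered.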

Add data exports/downloads

We can start with CSV but the underlying mechanism should be customizable, because we can also provide Parquet format for Explorer, but not for regular Kino.DataTable.
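To illustrate what "customizable" could mean here, the sketch below keeps a map from format name to encoder function, so a data source only lists the formats it supports. Everything in it is hypothetical (`ExportSketch`, the row shape as a list of maps, the hand-rolled CSV escaping); it is a rough outline, not the mechanism Kino uses.

```elixir
defmodule ExportSketch do
  # Each supported format maps to a function from rows (a list of maps) to a binary.
  # Only CSV is sketched; an Explorer-backed source could add a Parquet entry.
  def exporters, do: %{"csv" => &to_csv/1}

  def export(rows, format) do
    case Map.fetch(exporters(), format) do
      {:ok, encode} -> {:ok, encode.(rows)}
      :error -> {:error, :unsupported_format}
    end
  end

  defp to_csv([]), do: ""

  defp to_csv([first | _] = rows) do
    # Column order is taken from the first row, sorted by name.
    columns = first |> Map.keys() |> Enum.sort()
    lines = [columns | Enum.map(rows, fn row -> Enum.map(columns, &Map.get(row, &1)) end)]
    Enum.map_join(lines, "\n", fn fields -> Enum.map_join(fields, ",", &escape/1) end)
  end

  # Quotes a field when it contains a comma, a double quote, or a newline.
  defp escape(value) do
    text = to_string(value)

    if String.contains?(text, [",", "\"", "\n"]) do
      "\"" <> String.replace(text, "\"", "\"\"") <> "\""
    else
      text
    end
  end
end
```

A regular Kino.DataTable would then expose only "csv", and asking it for another format returns `{:error, :unsupported_format}` instead of failing.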

{:list, numeric} data not displayed - float errors and integer renders as charlists

Floats

alias Explorer.Series

s = Series.from_list([[1], [2]], dtype: {:list, :float})

The error below occurs.

10:11:34.804 [error] GenServer #PID<0.290.0> terminating
** (ArgumentError) cannot convert the given list to a string.

To be converted to a string, a list must either be empty or only
contain the following elements:

  * strings
  * integers representing Unicode code points
  * a list containing one of these three elements

Please check the given list or call inspect/1 to get the list representation, got:

[1.0]

Integers

alias Explorer.Series

s = Series.from_list([[1], [2]], dtype: {:list, :integer})

undefined function sigil_U/2 (there is no such import)
    (elixir 1.14.2) src/elixir_expand.erl:587: :elixir_expand.expand_arg/3
    (elixir 1.14.2) src/elixir_expand.erl:603: :elixir_expand.mapfold/5
    (elixir 1.14.2) src/elixir_expand.erl:867: :elixir_expand.expand_remote/8
    (elixir 1.14.2) src/elixir_expand.erl:527: :elixir_expand.expand_block/5
    (elixir 1.14.2) src/elixir_expand.erl:40: :elixir_expand.expand/3
    (elixir 1.14.2) src/elixir_clauses.erl:45: :elixir_clauses.clause/6
    (elixir 1.14.2) src/elixir_fn.erl:17: anonymous fn/4 in :elixir_fn.expand/4
    (explorer 0.5.7) expanding macro: Explorer.Query.query/1
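Both symptoms are consistent with list values being treated like strings: `to_string/1` raises on `[1.0]`, and lists of small integers are printed as charlists by default. As a sketch of one way to render a list cell safely, using only the Elixir standard library (`CellTextSketch` is a hypothetical name, and this is not necessarily the fix adopted in kino_explorer):

```elixir
defmodule CellTextSketch do
  # Renders one table cell as text. Lists go through inspect/2 with
  # `charlists: :as_lists`, so integer lists are not shown as charlists.
  def to_text(value) when is_list(value), do: inspect(value, charlists: :as_lists)
  def to_text(value), do: to_string(value)
end

CellTextSketch.to_text([1.0])
# => "[1.0]"
CellTextSketch.to_text([49, 50])
# => "[49, 50]"
```

The `charlists: :as_lists` option is part of `Inspect.Opts`; without it, `inspect([49, 50])` prints the list as a charlist.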

Allow summarize without groups

elixir-explorer/explorer#601