livebook-dev / kino_explorer
Explorer (series and dataframes) integration for Livebook
License: Apache License 2.0
@josevalim How would you like it to be shown? An indication in the header?
Hopefully this is welcome! I am really impressed with what you are doing here, I started on a little prototype for a filtering system myself, but am nowhere as far along as this.
I want to suggest a feature and a data structure for adding some really cool functionality to filters.
The current approach for filters uses conjunctions by default, but it is often useful to combine conjunctions and disjunctions together.
Add support for conjunctions (and), and disjunctions (or).
This can be supported by recursively nesting query filters via "value":
Example conjunction with a disjunction:
filters: [
%{
"filter" => "and",
"value" => [
%{
"column" => "year",
"filter" => "equal",
"value" => 2010
},
%{
"filter" => "or",
"value" => [
%{
"column" => "country",
"filter" => "equal",
"value" => "Angola"
},
%{
"column" => "country",
"filter" => "equal",
"value" => "Algeria"
}
]
}
]
}
]
With this you can combine and nest conjunctions, and disjunctions as needed. It also works well for building a UI with nested objects.
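To make the nesting concrete, here is a minimal sketch of how such a structure could be evaluated against plain row maps. The module name `FilterSketch` and the functions `matches?/2` and `filter_rows/2` are hypothetical names chosen for illustration; they are not part of kino_explorer or Explorer, and only the "equal" leaf operator is covered.

```elixir
defmodule FilterSketch do
  # Hypothetical evaluator for the nested filter maps proposed above.
  # A row is a plain map of column name => value.

  # "and": every nested filter must match.
  def matches?(row, %{"filter" => "and", "value" => filters}) when is_list(filters) do
    Enum.all?(filters, &matches?(row, &1))
  end

  # "or": at least one nested filter must match.
  def matches?(row, %{"filter" => "or", "value" => filters}) when is_list(filters) do
    Enum.any?(filters, &matches?(row, &1))
  end

  # Leaf filter: compare a single column with a value.
  def matches?(row, %{"column" => column, "filter" => "equal", "value" => value}) do
    Map.get(row, column) == value
  end

  # The top-level list is treated as an implicit conjunction,
  # matching the current default behaviour described above.
  def filter_rows(rows, filters) do
    Enum.filter(rows, fn row -> Enum.all?(filters, &matches?(row, &1)) end)
  end
end
```

Because "and" and "or" recurse through the same function, the nesting depth is unbounded, which is what makes the shape convenient for a UI built from nested objects.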
Negations
I also wanted to suggest supporting negations directly for any given filter operator, e.g.
%{
"filter" => "not equal",
}
becomes
%{
"filter" => "equal",
"negate" => true,
}
Negations then become a really simple toggle for any given filter. In your code you then can define a single operation direction, and the inverse is just a negation.
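A minimal sketch of the toggle, assuming rows are plain maps. `NegateSketch` and the "less" operator name are hypothetical and used only for illustration; each operator is written once and the "negate" key inverts its result.

```elixir
defmodule NegateSketch do
  # Each operator is defined once, in a single direction (hypothetical names).
  defp apply_op("equal", left, right), do: left == right
  defp apply_op("less", left, right), do: left < right

  # "negate" defaults to false; when true, the operator's result is inverted.
  def matches?(row, %{"column" => column, "filter" => op, "value" => value} = filter) do
    result = apply_op(op, Map.get(row, column), value)
    if Map.get(filter, "negate", false), do: not result, else: result
  end
end
```

With this shape, "not equal" is `"equal"` plus `"negate" => true`, and "greater than or equal" is `"less"` plus `"negate" => true`, so no inverse operator needs its own clause.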
P.S.
You can see an example of how this might work in this repo. I'm a novice at Elixir, but I think this is illustrative of the idea.
A nice to have feature would be the ability to rearrange the order of columns. Either via the table itself, or the ability to rearrange the tags in the "Select" action. It's possible to re-arrange the columns using "Select" by just adding the tags in the order you'd like, but if you forget a tag or would like to change the order, you have to delete every tag up until the place you wanted to change the order.
For instance:
Using the select action makes it possible to set the order of columns. However if I'd like to change the order of Embarked and PassengerID, I'd have to delete all the tags and then re-add them in the order I'd like. This is especially cumbersome when you have lots of columns.
In my mind, it seems simpler to make the multiselect field draggable using the draggable component (trying to figure out how, but haven't been successful so far), but if it would be easier to add the draggable component to the columns themselves, that would work as well.
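Whichever element ends up draggable, the server side of a drag only needs to move one entry within the list of selected columns. A minimal sketch, assuming the client reports a source and target index; `ReorderSketch.move/3` is a hypothetical helper and the column names other than Embarked and PassengerID are made up for the example.

```elixir
defmodule ReorderSketch do
  # Moves the element at index `from` to index `to`, keeping the
  # relative order of all other elements. Both indexes must be in range.
  def move(columns, from, to)
      when from >= 0 and from < length(columns) and to >= 0 and to < length(columns) do
    {column, rest} = List.pop_at(columns, from)
    List.insert_at(rest, to, column)
  end
end

columns = ["PassengerID", "Survived", "Embarked"]
ReorderSketch.move(columns, 2, 0)
```

Here "Embarked" moves to the front without any tag being deleted and re-added.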
We should call to_lazy() on the dataframe and then collect() at the end, although I think we need to collect before pivot wider. We should have a toggle at the top to choose whether we want it lazy or not; it should be on by default.
Some datasets contain unnecessary data that is not useful in further processing. Why not add an option to remove those columns from the dataset?
We should allow filtering by the mean, median, and quantile.
For example, if I want to filter a dataset for values that are above its mean, we could do:
Filter by | operator | value
septal_length | greater than | mean
UI-wise, this has a couple of issues:
value is a number, how could we input the mean? Perhaps a select?
for quantiles, we would need a fourth input specifying the quantile but that's likely too complex, we could instead provide predefined quantiles for 10%, 20%, ..., 90%
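One way to think about the select idea is that the "value" input holds either a number or a symbolic statistic that is resolved against the column before filtering. The sketch below works on plain row maps and does not use Explorer; `ThresholdSketch`, `resolve/2` and `greater_than/3` are hypothetical names, quantiles are given as whole percentages to mirror the predefined 10%, 20%, ..., 90% options, and the nearest-rank quantile used here is an assumption that may differ from how Explorer computes quantiles.

```elixir
defmodule ThresholdSketch do
  # Resolves the "value" input into a number. Assumes a non-empty list.
  def resolve(_values, number) when is_number(number), do: number

  def resolve(values, :mean), do: Enum.sum(values) / length(values)

  def resolve(values, :median) do
    sorted = Enum.sort(values)
    n = length(sorted)
    mid = div(n, 2)

    if rem(n, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Nearest-rank quantile with an integer percentage (assumption, see above).
  def resolve(values, {:quantile, percent}) when percent in 1..100 do
    sorted = Enum.sort(values)
    # Integer ceiling of percent * n / 100, avoiding float rounding.
    rank = div(percent * length(sorted) + 99, 100)
    Enum.at(sorted, rank - 1)
  end

  # Keeps rows whose column value is strictly greater than the resolved value.
  def greater_than(rows, column, value) do
    values = Enum.map(rows, &Map.fetch!(&1, column))
    threshold = resolve(values, value)
    Enum.filter(rows, &(Map.fetch!(&1, column) > threshold))
  end
end
```

A call such as `ThresholdSketch.greater_than(rows, "septal_length", :mean)` then corresponds to the row shown in the table above, and `{:quantile, 90}` covers the predefined-quantile case without a fourth input.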
The next Explorer release will bring several changes that directly affect KinoExplorer.
This issue is to track all these changes and make sure KinoExplorer is in sync and ready for the next release.
DataFrame.arrange as DataFrame.sort elixir-explorer/explorer#777
The "assign to" input box redirects me to a junk link if I hit enter.
Specifically I get sent to:
http://localhost:63081/iframe/v5.html?data_frame=processes&assign_to=
I am on Chromium on an M1 Mac.
keys = [:registered_name, :initial_call, :reductions, :stack_size]
processes =
for pid <- Process.list(),
info = Process.info(pid, keys),
do: info
Kino.DataTable.new(processes)
I gotta say this is so freaking fantastic!!!!!
I actually started my own little project that tried to do some macro cells/smart cells before realizing how far livebooks have come. I definitely want to start contributing, I am wildly impressed.
From Kino v0.11, Kino.JS.Live can provide custom exports and we should export the inspected DataFrame representation.
Our tests are more complex than they should be: they are verbose, and the @base_operations approach is no longer needed.
Today we can perform filters on the cell. We will have a button that converts those operations into a smart cell. The smart cell will be developed in #3.
First of all, very excited to see this coming to Livebook.
It took me a bit of time to realize why the top statistics were empty for the last column in the screenshot; maybe it is just me.
I also think we could have min/max statistics for datetime and date instead of top.
I would be happy to help fix this, if needed.
We should support:
And then, instead of the first line of the pipeline simply being "var", it will be Explorer.DataFrame.new(var), which is also the code we will emit.
We should also change the warning message shown when no data structure is found to something like: "The Data Transform smart cell works with Explorer DataFrames and table-like data structures but none was found."
Should we also change the name of the first label? Currently it says "DATA FRAMES" but I can't think of anything better. Perhaps just "DATA"?
Currently, when a series or data frame contains infinities there is an error.
for input
DF.new(a: Nx.tensor([:infinity], type: {:f, 64}))
and
S.from_tensor(Nx.tensor([:infinity], type: {:f, 64}))
15:35:11.764 [error] ** (MatchError) no match of right hand side value: {:error, {:badarg, [{Explorer.PolarsBackend.Native, :s_mean, [#Explorer.PolarsBackend.Series<
shape: (1,)
Series: 'series' [f64]
[
inf
]
>], []}, {Explorer.PolarsBackend.Shared, :apply_series, 3, [file: 'lib/explorer/polars_backend/shared.ex', line: 23]}, {Kino.Explorer, :"-summaries/1-fun-0-", 2, [file: 'lib/kino/explorer.ex', line: 94]}, {:maps, :fold_1, 3, [file: 'maps.erl', line: 411]}, {Kino.Explorer, :summaries, 1, [file: 'lib/kino/explorer.ex', line: 89]}, {Kino.Explorer, :init, 1, [file: 'lib/kino/explorer.ex', line: 42]}, {Kino.Table, :init, 2, [file: 'lib/kino/table.ex', line: 63]}, {Kino.JS.Live.Server, :call_init, 3, [file: 'lib/kino/js/live/server.ex', line: 76]}]}}
(kino 0.9.0) lib/kino/js/live.ex:326: Kino.JS.Live.new/2
(kino_explorer 0.1.2) lib/kino_explorer.ex:9: Kino.Render.Explorer.Series.to_livebook/1
lib/livebook/runtime/evaluator/default_formatter.ex:41: Livebook.Runtime.Evaluator.DefaultFormatter.to_output/1
lib/livebook/runtime/evaluator.ex:464: Livebook.Runtime.Evaluator.continue_do_evaluate_code/5
lib/livebook/runtime/evaluator.ex:326: Livebook.Runtime.Evaluator.loop/1
(stdlib 4.2) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
I'm unable to use the smart cell because my notebook has MongoDB.ObjectId values.
{:kino_explorer, "~> 0.1.9"},
{:explorer, "~> 0.7.0"}
19:11:33.878 [error] GenServer #PID<0.2447.0> terminating
** (ArgumentError) cannot create series "_id": unsupported datatype: #BSON.ObjectId<633dde08069d480008ba9b9f>
(explorer 0.7.0) lib/explorer/polars_backend/data_frame.ex:572: Explorer.PolarsBackend.DataFrame.series_from_list!/3
(explorer 0.7.0) lib/explorer/polars_backend/data_frame.ex:522: anonymous fn/3 in Explorer.PolarsBackend.DataFrame.from_tabular/2
(elixir 1.15.2) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
(explorer 0.7.0) lib/explorer/polars_backend/data_frame.ex:516: Explorer.PolarsBackend.DataFrame.from_tabular/2
(kino_explorer 0.1.10) lib/kino_explorer/data_transform_cell.ex:846: KinoExplorer.DataTransformCell.update_data_options/3
(kino_explorer 0.1.10) lib/kino_explorer/data_transform_cell.ex:337: KinoExplorer.DataTransformCell.updates_for_data_frame/2
(kino_explorer 0.1.10) lib/kino_explorer/data_transform_cell.ex:189: KinoExplorer.DataTransformCell.handle_info/2
(kino 0.10.0) lib/kino/js/live/server.ex:134: Kino.JS.Live.Server.call_handle_info_fallback/3
Last message: {:scan_binding_result, [videos: [%{"_id" => #BSON.ObjectId<633de7ac069d480008ba9bb3>, ... (truncated)
The Smart cell crashed unexpectedly, this is most likely a bug.
Restart Smart cell
The previous cell also has a valid Explorer.DataFrame towards the end, but, since the Smart Cell crashed, I cannot use it.
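Until such values are handled by the smart cell itself, one possible workaround is to convert unsupported structs to strings before the data reaches the binding scan. This is only a sketch under assumptions: `SanitizeSketch` is a hypothetical helper, `FakeId` is a stand-in struct used so the example runs without the MongoDB driver, and the allow-list of structs left untouched is a guess rather than Explorer's actual list of supported types.

```elixir
defmodule FakeId do
  # Stand-in for a driver-specific struct such as an object id (hypothetical).
  defstruct [:hex]
end

defmodule SanitizeSketch do
  # Structs assumed to be representable as-is (assumption, not Explorer's list).
  @supported [Date, Time, NaiveDateTime, DateTime]

  # Rewrites every row so that unsupported struct values become strings.
  def sanitize(rows) do
    Enum.map(rows, fn row ->
      Map.new(row, fn {key, value} -> {key, sanitize_value(value)} end)
    end)
  end

  # Any other struct is replaced by its inspected representation.
  defp sanitize_value(%module{} = value) when module not in @supported, do: inspect(value)
  defp sanitize_value(value), do: value
end

videos = [%{"_id" => %FakeId{hex: "633dde08"}, "title" => "Intro"}]
SanitizeSketch.sanitize(videos)
```

Binding the sanitized list to a new variable and pointing the smart cell at that variable would keep the original data intact for other cells.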
We can start with CSV but the underlying mechanism should be customizable, because we can also provide Parquet format for Explorer, but not for regular Kino.DataTable.
alias Explorer.Series
s = Series.from_list([[1], [2]], dtype: {:list, :float})
The following error occurs:
10:11:34.804 [error] GenServer #PID<0.290.0> terminating
** (ArgumentError) cannot convert the given list to a string.
To be converted to a string, a list must either be empty or only
contain the following elements:
* strings
* integers representing Unicode code points
* a list containing one of these three elements
Please check the given list or call inspect/1 to get the list representation, got:
[1.0]
alias Explorer.Series
s = Series.from_list([[1], [2]], dtype: {:list, :integer})
Once the button is clicked, it will add the Explorer smart cell. So this will need integration into Livebook/Kino.
Mockup incoming.
When trying to have a filter compare to a datetime, e.g. 1970-01-02 00:00:00Z, the following error is raised:
undefined function sigil_U/2 (there is no such import)
(elixir 1.14.2) src/elixir_expand.erl:587: :elixir_expand.expand_arg/3
(elixir 1.14.2) src/elixir_expand.erl:603: :elixir_expand.mapfold/5
(elixir 1.14.2) src/elixir_expand.erl:867: :elixir_expand.expand_remote/8
(elixir 1.14.2) src/elixir_expand.erl:527: :elixir_expand.expand_block/5
(elixir 1.14.2) src/elixir_expand.erl:40: :elixir_expand.expand/3
(elixir 1.14.2) src/elixir_clauses.erl:45: :elixir_clauses.clause/6
(elixir 1.14.2) src/elixir_fn.erl:17: anonymous fn/4 in :elixir_fn.expand/4
(explorer 0.5.7) expanding macro: Explorer.Query.query/1