Comments (25)
Curious to know the current status of this, especially now that jupyterlab/jupyter-collaboration#279 is merged? Thanks!
from jupyter_server.
Hey folks,
Since there appears to be interest from multiple people representing multiple organizations, I propose we try to meet regularly over the next couple of months to discuss this topic to 1) collect implementation ideas, 2) develop a plan towards an open-source solution, and 3) coordinate who might be able to work on this.
At last week's server meeting, I proposed we reserve the last 20 minutes of the Jupyter Server/Kernels Meeting (8am Pacific) to discuss this topic. Would folks on this thread be able to stop by regularly for those 20 minutes? Give this comment a 👍 to signal that you'd like to join.
We'll post notes from those discussions here. Thanks!
In today's Jupyter Server call, we discussed this issue extensively. I'll do my best to summarize the discussion below based on some notes I took.
User story: As a user, I'd like to run a cell (or series of cells) in a notebook, then close JupyterLab before the cells finish. When I come back, I'd expect to see:
- the output of any completed cells,
- the execution count on those cells,
- which cells are still running (marked with "*" in the execution count bracket),
- and the notebook's execution state accurately showing whether it is "idle" or "busy".
Today's UX: When the user closes/refreshes JupyterLab, they see:
- no completed outputs
- no execution counts
- empty brackets where the execution count or "*" should be
- "idle" is shown even if it's busy (due to this issue)
Multiple solutions have been proposed over the past few weeks. Here's my attempt at summarizing below.
There are three layers to the problem:
- Message replay from the kernel: keep a history/log of all reply messages from the kernel that might have been missed.
- Kernel state reconstruction from the replayed kernel messages. We need a way to take a replay of messages and resolve:
  - execution_state ("busy"/"idle")
  - execution_count
  - all outputs
- Notebook model reconstruction from the replayed kernel messages. We need a way to take a replay of messages and resolve:
  - which cells are finished
  - which cells still need executing
  - outputs for finished cells
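The "reconstruction" layers above can be sketched as a fold over a replayed message stream. This is a hypothetical illustration only (the replay store itself and the exact state shape are assumptions, not an existing API); the message fields follow the Jupyter messaging protocol:

```python
# Hypothetical sketch: fold a replayed stream of kernel messages into
# per-cell state. Cells are keyed by the msg_id of the originating
# execute_request, carried in each reply's parent_header.

def reconstruct(messages):
    """Reduce replayed messages into ({msg_id: cell_state}, kernel_state)."""
    cells = {}
    kernel_state = "idle"
    for msg in messages:
        parent = msg["parent_header"].get("msg_id")
        cell = cells.setdefault(
            parent, {"outputs": [], "execution_count": None, "done": False}
        )
        mtype = msg["header"]["msg_type"]
        if mtype == "status":
            # IOPub status messages carry "busy"/"idle"
            kernel_state = msg["content"]["execution_state"]
        elif mtype in ("stream", "display_data", "error"):
            cell["outputs"].append(msg["content"])
        elif mtype == "execute_result":
            cell["outputs"].append(msg["content"])
            cell["execution_count"] = msg["content"]["execution_count"]
        elif mtype == "execute_reply":
            # shell-channel reply: the cell is finished
            cell["execution_count"] = msg["content"]["execution_count"]
            cell["done"] = True
    return cells, kernel_state
```

Anything not in `cells` with `done == True` is still pending, which gives layer (3) its "which cells still need executing" answer.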
To begin building a solution, we proposed using a new, separate repo in jupyter-server org that
- Will provide a kernel message replay mechanism.
- Will provide an API to reconstruct kernel state from a replayed message stream.
- Will provide an API to reconstruct notebook model from a replayed message stream.
- Will provide a server extension that automatically configures Jupyter Server's kernel REST API to replay.
Question: Why can't we solve this problem with just the kernel replay, replaying all messages to JupyterLab when it reconnects?
Answer: I (Zach) believe you can. If the messages are timestamped, you should be able to just replay all messages from the last time the user was connected and the client should collapse this to the current state of the notebook. There are advantages to making the server more notebook document aware though. We can elaborate on this more in a separate thread.
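The timestamp-based replay Zach describes amounts to a simple filter over the message log. A minimal sketch, assuming a hypothetical replay store that keeps `(timestamp, message)` pairs:

```python
def missed_since(log, last_seen):
    """Return messages newer than the client's last connection time.

    `log` is an iterable of (timestamp, message) pairs kept by a
    hypothetical replay store; timestamps are floats (epoch seconds).
    The reconnecting client replays these and collapses them into the
    current notebook state.
    """
    return [msg for ts, msg in log if ts > last_seen]
```

The subtlety is entirely on the client side: replayed messages must be idempotent to apply, so that a client that already saw some of them converges to the same state.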
Aside comment to keep in mind: kernel gateway / enterprise gateway add an additional passthrough layer where kernel messages can be missed, so we need replay options there too.
How do Jupyter's RTC efforts play into this?
Y-docs offer a solution for rebuilding a notebook model server-side from a log of patches/diffs/messages. Maybe we can leverage this machinery to store and resolve (2) and (3) once we have a message replay system in place?
Very glad to hear that!
State retention between the JupyterLab front-end and the ipynb file itself is a major issue due to messaging delays and network effects, and it significantly impacts the user experience when Jupyter Server is running in the cloud. (This is the main problem I had before: JupyterLab always prompts whether to overwrite the file or not.)
Based on this, I'm in favor of @davidbrochart's comment and jupyverse's solution: use Y-CRDT to establish consistency, which only requires code execution and result writes to be handled in the backend.
I've put jupyter_kernel_executor on hold for now due to a change in focus at work. In this plugin, the user can execute code through an HTTP interface, and the Jupyter Server handler parses the zmq results and writes them to the file, by converting the HTTP request into a localhost websocket connection (we can even connect to another Jupyter server, i.e., a remote kernel).
I regard this feature as porting JupyterLab's ability to execute code and write results to a file into Jupyter Server, triggered by a button. The input is file + kernel + cell id, and the process is that Jupyter Server establishes a websocket connection and continuously writes the result of the code run to the file. JupyterLab will get the updates through Jupyter's RTC feature.
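The file + kernel + cell id input described above could be sketched as a small HTTP call from any client. Everything here is illustrative: the `/api/kernels/{id}/execute` route and the request body fields are assumptions for this sketch, not jupyter_kernel_executor's actual API:

```python
import json

def build_execute_request(base_url, kernel_id, path, cell_id):
    """Build the (url, json_body) pair for a hypothetical
    'execute this cell of this file on this kernel' endpoint.

    The route and field names are illustrative; the real plugin's
    API may differ.
    """
    url = f"{base_url}/api/kernels/{kernel_id}/execute"
    body = json.dumps({"path": path, "cell_id": cell_id})
    return url, body
```

The client would POST this (e.g. with `urllib.request` and a token header) and disconnect immediately; the server keeps the cell running and writes outputs into the file, so a fire-and-forget client still gets durable results.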
EDIT:
This picture simply illustrates my thoughts. Hope this helps. Thank you all.
I made some progress towards restoring notebook state, using jupyverse/jpterm:
Peek.2023-11-09.11-29.mp4
And here is a demo showing full notebook state recovery, including widgets:
Peek.2023-11-10.11-29.mp4
Having a ydoc in the kernel is great, because this will work with kernel gateway and enterprise gateway too.
I realized that the proposal needs to be adapted to the latest Jupyter Server changes (authorization, updated kernel web socket handler, etc.)
Doesn't the general principle remain the same? Authentication and authorization shouldn't come into the picture, should they? The new kernel websocket handler is meant to be more easily extensible, so I guess it should not impact the validity of your proposal.
Will the May 25th Jupyter Server call be appropriate to discuss the proposal?
Sounds good. I will join and we will chat with the people online.
Hey folks, I'm going to be out-of-the-office next week and will miss the Jupyter Server meeting (6/29).
If folks want to still meet and discuss, please feel free to do so!
Otherwise, I'll be back for the next meeting on 7/6. Until then, I'll work on setting up a new repo and drafting a loose roadmap of the work ahead of us. Cheers everyone!
Thank you @skukhtichev for the great demo during JupyterCon and for opening the discussion to upstream the work you have been doing.
It would be great to have a chat during one of our dev calls. When would you be able to join?
cc/ @Zsailer @kevin-bates
@echarles I will be happy to discuss the proposal during the dev talk. After @davidbrochart presentation about Jupyter Server at JupyterCon, I realized that the proposal needs to be adapted to the latest Jupyter Server changes (authorization, updated kernel web socket handler, etc.). I want to dive deeper into the server's code and update the proposal. Will the May 25th Jupyter Server call be appropriate to discuss the proposal?
@skukhtichev, @echarles - I've gone ahead and added this to Thursday's agenda. See you there!
@kevin-bates I think @skukhtichev was mentioning 25th (next week), not 18th (this week)
@echarles yes. @kevin-bates Could we reschedule the discussion to next week (May 25th)?
Doesn't the general principle remain the same? Authentication and authorization shouldn't come into the picture, should they? The new kernel websocket handler is meant to be more easily extensible, so I guess it should not impact the validity of your proposal.
Yes, the principle remains the same. There is a limitation with establishing websocket sessions between the browser and Jupyter Server: only one session can be established per user. If the user opens the same notebook in a new browser tab, the previous websocket session is closed. Currently it is similar for other users connecting to the same kernel. The authorization logic sets user-specific cookies, so it should be possible to distinguish between users connecting to the same kernel. Implementing this approach will allow support for multiple users and will not interfere with the collaboration feature.
@kevin-bates I think @skukhtichev was mentioning 25th (next week), not 18th (this week)
I'm sorry I missed that. I've updated the agenda such that this discussion is slated for next week (May 25).
I would also like to join this conversation (I will attend the discussion next week). I have worked on a similar concept and would like to collaborate on this effort.
Hi @parul100495 - thank you for sharing your interest in helping! See you next Thursday.
(For the sake of others, the Server/Kernels team meeting is open to anyone - all are welcome - no participation required.)
PS: unfortunately I won't be able to join this week call. Excited to see any progress on this feature.
Hi all, I recently discovered this proposal and am interested in learning more about how I could help.
I work in a plant breeding lab that collaborates frequently with international researchers. These individuals don't always have the most reliable internet connection, and so oftentimes face the frustration of losing their work when in-progress cells are "canceled" due to a network interruption. The way I read @skukhtichev's proposal, it ought to also cover situations like these (i.e. a user reconnects to their notebook after being disconnected for some time, and is able to see their code cells continuing to run and not lose data).
I've talked briefly with @davidbrochart on this issue related to the new kernels REST API, as I believe it would also be a potential solution to this problem.
What would be the best avenue for a volunteer to dedicate time to assisting with this? I saw that this proposal was discussed at last week's server meeting.
@Zsailer Just to double-check, will those 20 minute blocks begin at tomorrow's meeting, or next week's?
I would like to relay here a comment from the meeting notes about an important point that is missing in the above summary (thanks for it, Zach): the state of the kernel waiting for user input (e.g. Python code using the `input` built-in function).
I think that the issue with `input` is not only that we don't handle it in the YNotebook, but also that the frontend doesn't treat it as collaborative text. It is, in effect, a small text editor, so it should be treated just like a cell if we want it to be collaborative, i.e. seen and/or editable by other users.
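Restoring the "waiting for input" state also fits the message-replay picture: on the stdin channel, an `input_request` with no matching `input_reply` means the kernel is still blocked on the prompt. A hypothetical sketch (the replay source is assumed; the message types are from the Jupyter messaging protocol):

```python
# Hypothetical sketch: scan a replayed message stream and report the
# prompt text if the kernel is still blocked waiting on input().

def pending_input_prompt(messages):
    """Return the prompt string of an unanswered input_request, else None."""
    prompt = None
    for msg in messages:
        mtype = msg["header"]["msg_type"]
        if mtype == "input_request":
            # kernel is waiting on stdin; remember what it asked for
            prompt = msg["content"]["prompt"]
        elif mtype == "input_reply":
            # some client already answered; the kernel is unblocked
            prompt = None
    return prompt
```

A reconnecting client could use this to re-render the prompt, and a collaborative frontend could expose that same prompt text to all users, per the comment above.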
I made some progress towards restoring notebook state, using jupyverse/jpterm:
Peek.2023-11-09.11-29.mp4
Forgive me for not following the development of the project for too long. This is exciting. 👏
Is it because you open another client for writing (as a collaborator)? Or have we implemented Jupyter Server writing directly to files, or replaying messages? (Replaying messages doesn't solve the problem of the output not being saved; it just offers the possibility of a delayed save.)
UPDATE:
There is a YDoc representing the notebook but not in the kernel. This YDoc lives in the (jupyverse) server and in the (jpterm) client.
I'm currently working on doing the same for widgets, and in this case yes, there will be a YDoc representing the widget in the kernel, in the server and in the client. The widget will synchronize between the kernel and the server using the Comm protocol, and between the server and the client using a WebSocket. It will allow widget state restore as well.
jupyterlab/jupyterlab#2833 (comment)
It seems to go further than my comment above: using Y-CRDT (`YDoc`) to establish consistency, we can develop a wide variety of applications 👍