Future of read-only notebook deployments #458

Open
maximlt opened this issue Nov 27, 2024 · 17 comments

Comments

@maximlt
Contributor

maximlt commented Nov 27, 2024

Many examples have what we call a read-only notebook deployment. This is a pretty unique feature of the deployment platform we use at the moment, Anaconda AE5, that lets us configure a deployment (via anaconda-project, with a lock file, data to download, env vars, etc.) and allows users to run it as a normal Jupyter Notebook (one they can't modify, in case they have bad intentions). This is useful as most of the examples require a live kernel to experience their full interactivity (HoloViews DynamicMaps, Panel callbacks, etc.).

Image

Since this feature isn't really supported anymore in AE5, I'm opening this issue to discuss the options available. I'd like to add that I like that feature of AE5 a lot and wish we could provide a similar (if not better) experience to visitors of the site, as they can get an authentic feeling of interacting with the tools in a notebook.

  1. Don't do anything: When the feature stops working we just put the deployments down and move on (examples are not required to have a deployment)
  2. Replace with a standard Panel app: As we already support that, we could replace a notebook deployment with a Panel app deployment (with at least one .servable() call somewhere in the notebook). While this is suited for notebooks that gradually build an app, it is less appropriate for those that have a data analysis narrative.
  3. Replace with a notebook Panel app: Since version 1.4, Panel allows deploying a notebook that doesn't contain any .servable() call; instead, the whole notebook content is executed and served. The content can also be re-arranged manually (using the new EditableTemplate), and the layout is automatically recorded as metadata in the notebook (https://panel.holoviz.org/how_to/notebook/layout_builder.html). When served, this should provide an experience quite close to viewing a fully executed notebook in a live kernel. I wonder if there's a way to also include code cells as text in the dashboard?
  4. Replace with JupyterLite (or similar Python in the browser solution): We have (too) many examples that rely on numba (~25 at the moment that have it in their lock file), mostly because of datashader, and we know that numba isn't yet available on WebAssembly. There are likely other cases that do not work so well, or at all, in the browser (Dask?). There are also questions around how to build such a deployment for the browser that respects the lock file of an example (look into xeus-python + emscripten-forge https://jupyterlite.readthedocs.io/en/stable/quickstart/deploy.html#using-the-xeus-python-kernel-and-emscripten-forge), and examples that rely on large data files.
  5. Replace with a cloud notebook service (Binder, Anaconda Cloud, Google Colab): We'd need to define our list of requirements, like ability for the service to respect our config/lock file (or us to write a compatibility layer for the service), enough storage for the data the projects need to download, enough free compute for the users to run a few examples, ability to automate deploying the examples, etc.
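
To get a feel for how many examples option 4 would exclude, the numba count above could be reproduced with a small script. This is only a sketch under stated assumptions: the lock file name `anaconda-project-lock.yml`, the one-directory-per-example layout, and the `- name=version=build` pin format are assumed here rather than checked against the repo, and `examples_pinning` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Assumption: every example directory holds a lock file with this name, and
# pinned packages appear as YAML list items such as "- numba=0.59.1=py311h...".
LOCK_NAME = "anaconda-project-lock.yml"


def examples_pinning(package: str, root: Path) -> list[str]:
    """Return the names of example directories whose lock file pins `package`."""
    # The trailing "=" keeps "numba" from matching e.g. "numbagg".
    pin = re.compile(rf"^\s*-\s*{re.escape(package)}=", re.MULTILINE)
    hits = []
    for lock in sorted(root.glob(f"*/{LOCK_NAME}")):
        if pin.search(lock.read_text(encoding="utf-8")):
            hits.append(lock.parent.name)
    return hits


if __name__ == "__main__":
    blocked = examples_pinning("numba", Path("."))
    print(f"{len(blocked)} examples pin numba: {', '.join(blocked)}")
```

Run from the repository root, this would list the examples that cannot move to a browser-only deployment until numba is available there. The same helper could be pointed at other packages suspected to be problematic (e.g. dask).
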
@jbednar
Contributor

jbednar commented Nov 27, 2024

I vote against 1; I think we can do something! 2 is ok as a fallback.

I'd like to see a version of 3 as a mode that's quite notebook-like but designed for reading rather than running, and which automatically executes all the cells in the notebook. And yes, shows the code cells, at least optionally. The result should be a nice-looking "live document".
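
As a rough illustration of the "shows the code cells, at least optionally" part, a notebook file can be flattened into an ordered list of text and code blocks using only the standard library. This is a sketch, not how Panel serves notebooks: `notebook_to_blocks` and the `show_code` flag are hypothetical names, and a real live document would also interleave the outputs produced by executing each cell.

```python
import json


def notebook_to_blocks(nb_json: str, show_code: bool = True) -> list[tuple[str, str]]:
    """Flatten an nbformat-4 notebook into ordered (kind, text) blocks."""
    nb = json.loads(nb_json)
    blocks = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows `source` to be a string or a list of line strings.
        text = source if isinstance(source, str) else "".join(source)
        if cell.get("cell_type") == "markdown":
            blocks.append(("text", text))
        elif cell.get("cell_type") == "code" and show_code:
            # Code is kept as plain text for display; nothing is executed here.
            blocks.append(("code", text))
    return blocks
```

With `show_code=False` the same notebook reads as a plain narrative; with the default, each code cell appears as text in its original position.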

I'd like to see 4 for any example that supports it, though it may be some time before we get all the annoying bits ironed out.

I don't mind if we do 5, but it seems like we'll always be exposed to a risk of that service disappearing, so it wouldn't be my preferred option.

Well, I guess I could vote for 5, but only if the same service could handle our deployed apps as well. I.e. I'd prefer we have a single system handling both our notebook and our app deployments, so that we have only a single main set of issues to debug and monitor. 2-4 all achieve letting the same system handle all notebooks and apps, and I wouldn't want 5 unless it too could handle all of them.

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Nov 28, 2024

To put things on the edge, I would say I hope there is no future of read-only notebook deployments 😄

I hope the future for technical, code-oriented writing is the Markdown format. You are writing text with some code and code output. And the best format for that is Markdown. Markdown is also easier to work with in git and via LLMs.

  1. For single or few page documentation Panel should be able to convert the markdown to a beautiful document and serve it as a live, interactive document via the Panel server or as a static page with live interactivity via pyodide. And it should be able to enable code cells and highlight specific lines of code just as in MkDocs. For the pyodide cells it should be possible to edit them.

  2. For larger sites Panel should be able to make documentation generated via MkDocs (or a similar documentation framework) come alive. One option should be via a live Panel server. Another option is via Panel pyodide code cell plugins. Again, it should be possible to hide/show code cells. And for the pyodide version it should be possible to edit the cells.

There is a lot of inspiration to be found in what is possible in the Javascript ecosystem, the Quarto/ Shiny ecosystems and Py.Cafe.

@maximlt
Contributor Author

maximlt commented Nov 28, 2024

You are writing text with some code and code output. And the best format for that is markdown.

Unfortunately, Marc, I think your message is off-topic. This repository contains examples users should be able to download and run locally. This is similar to the cookbooks from Project Pythia (newer and organized differently, with one repo per project), which also rely heavily on notebooks because a notebook is the best format to author and share a document that contains both text and executable code.

We've talked recently about MkDocs (and you brought up Quarto and Py.Cafe) in another issue (panel extension). Please, please, please, open a separate issue if you want to discuss more how we can evolve our documentation. We are here strictly discussing how to deploy a notebook.

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Nov 28, 2024

I believe I'm trying to view things from a helicopter perspective before finding a solution to the specific problem. Should you even solve the read-only notebook problem in 2025?

I tried to respond to the title, "Future of read-only notebook deployments": to put things on the edge, I think it should have no future.

I believe the first questions to ask are:

  • Are notebook examples still relevant as technical documentation including tutorials in 2025?
    • Personally I believe Markdown is a better format for authoring and sharing a document that contains both text and executable code.
    • Personally I think the beautiful documentation you get for example via MkDocs is more valuable if you want to communicate your frameworks.
  • How would you like to enable users to write technical documentation including tutorials using HoloViz frameworks in 2025?
    • I see HoloViz Topics as a showcase for that approach.

@jbednar
Contributor

jbednar commented Nov 28, 2024

@maximlt , I thought about making the same comment as you, but then I recalled that some sites are maintained as markdown yet still do allow downloading as notebooks either on the fly or generated from the markdown, e.g. https://scikit-image.org/docs/stable/auto_examples/data/plot_3d.html .

Here we're all suffering from the original sin of Jupyter Notebooks, i.e. that they contain editable text but are not stored in an editable (and diffable, version-controllable, etc.) format. We could make the choice that we as a group focus on .md while allowing visitors to download notebooks (by converting from .md) and contributors to submit notebooks (which we would then convert to .md). So that part of the comments here is relevant to examples.holoviz.org.

But still, Maxime's point that the storage format for our examples is a different topic still holds, since no matter how we store them, we have to decide whether and how to continue providing a way to run them "live".

@maximlt
Contributor Author

maximlt commented Nov 28, 2024

But still, Maxime's point that the storage format for our examples is a different topic still holds, since no matter how we store them, we have to decide whether and how to continue providing a way to run them "live".

Yes exactly. We could have our docs written in any kind of format and generated by any static site generator. That's mostly a separate topic from enabling users to run the examples online. I say "mostly" as I am aware there are ways to generate more interactive docs (pyodide, thebe, etc.; Quarto must also have solutions for that), but I doubt any of them would be an appropriate fit for examples.holoviz.org (examples requiring a complex set of dependencies, large datasets, etc.). Happy to be proven wrong if someone comes up with a solid proposal to adopt one of these solutions.

@maximlt , I thought about making the same comment as you, but then I recalled that some sites are maintained as markdown yet still do allow downloading as notebooks either on the fly or generated from the markdown, e.g. https://scikit-image.org/docs/stable/auto_examples/data/plot_3d.html .

To be accurate, the original file is a Python file (https://github.com/scikit-image/scikit-image/blob/main/doc/examples/data/plot_3d.py) that is then converted to rst to build the site with an extra conversion to a notebook (I assume the gallery extension does all of that).

Here we're all suffering from the original sin of Jupyter Notebooks

I'm not suffering much, in fact I'm very happy to use Jupyter Notebooks on this repo and I think it has made it very easy for our recent contributors Jason and Isaiah to get up to speed quickly. When I review a PR, I usually spin up Jupyter and run the example, I can easily experiment with the changes made in other scratch cells, it's a very powerful workflow. Of course, reviewing a notebook diff isn't fun but I've got used to it. We also must store a few evaluated notebooks on the repo (those that can't run on the CI), I don't know if there's a format out there that can replace that.

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Nov 29, 2024

Do we have any data on how many users actually download and run the notebooks at HoloViz Topics or the Panel web site?

I've just not seen myself or my colleagues do that. The people I see read it. Maybe they copy-paste some code. I know for a fact that I've been contributing lots of fixes to the Panel notebook experience for issues that are obvious to me but were never reported by anyone.

I know for a fact that every time I've tried downloading and using the HoloViz tutorial there have been technical issues and I've given up. It's been a long time since I did, so the experience is probably better today. The Panel template shown on the front page is the outdated green color we left years ago though.

So my assumption is that not a lot of people actually use the notebooks, and a better-looking and more easily editable format would be better.

@maximlt
Contributor Author

maximlt commented Nov 29, 2024

Do we have any data on how many users actually download and run the notebooks at HoloViz Topics or the Panel web site?

No. It might be possible to track the number of clicks with GoatCounter Events (https://www.goatcounter.com/help/events).

I've just not seen myself or my colleagues do that. The people I see read it. Maybe they copy-paste some code.

I learnt how to use Panel by running the panel examples command, which has since been removed; it downloaded all the example notebooks locally. I thought it was very useful. Your mileage may vary!

I know for a fact that I've been contributing lots of fixes to the Panel notebook experience for issues that are obvious to me but were never reported by anyone.

Could you tell us more about these bugs you fixed? That could help improve the experience on this site.

I know for a fact that every time I've tried downloading and using the HoloViz tutorial there have been technical issues and I've given up. It's been a long time since I did, so the experience is probably better today.

We usually work on the tutorial around SciPy. The vast majority of the ~50 people in the room manage to run it locally without any problem (I remember helping someone who had a broken conda installation on her Windows laptop).

The Panel template shown on the front page is the outdated green color we left years ago though.

Since it wasn't very clear to me, @MarcSkovMadsen is referring to this page. Indeed, that's an old screenshot.

Image

So my assumption is that not a lot of people actually use the notebooks, and a better-looking and more easily editable format would be better.

I have a different view. I don't think notebooks are going anywhere in the Python space (I can back this up with data if you want), they're here to stay for a while. We should make it easier to run our stuff in notebooks, since we have first-class support for them. This is especially important for users who are doing data analysis, a group that doesn't fully intersect with users building Panel apps. Indeed, my focus is on HoloViz as a whole.

I'm not sure what you mean by "better looking", since that isn't related to the notebook format but to the static site generator. I still have a hard time understanding why there's so much focus on "better looking" in our discussions. It's not like we're using an older theme like Alabaster. Presumably, the main sites HoloViz users visit the most when writing Python code all have a similar looking theme (e.g. Pandas, Numpy, Matplotlib, Xarray, Bokeh, Altair, Seaborn). So what is suddenly so wrong with us?
Again, this is off-topic and I encourage you to open an issue. To convince others, you could run a survey asking our users how we could improve the documentation.

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Nov 29, 2024

I think our contexts and visions are just a bit different.

I've never been able to successfully download and run HoloViz examples in Panel or elsewhere, so I don't think much about them. But it's also been years since I tried, so the story today is probably much better.

Regarding bugs. I've contributed bug fixes to Panel for years. Especially for the pyodide powered notebooks I've often felt like I was the only one really trying to make them work. I seldom see or hear about users using them though.

Regarding "better looking". I just think we are comparing to different ecosystems. I'm not comparing to Pandas, Numpy, Matplotlib, Xarray or Bokeh. I'm comparing to Plotly/Dash, the Shiny/Quarto/Posit ecosystem, Streamlit, Gradio, Marimo, Solara, AnyWidget, Altair/Vega, React, Superset, Power BI, Tableau, Grafana, etc., many of which are also part of larger ecosystems that provide visualization frameworks and different frameworks for exploring data, creating interactive documentation, etc. I'm comparing to them because those are the alternatives I see colleagues or users in the community transition to if they don't find what they need in our ecosystem. And I do see people switching very quickly if there is just a tiny bit of friction or the look and feel is better elsewhere.

I don't think there is suddenly something wrong or just something wrong. There are many things I would like to help improve though.

@jbednar
Contributor

jbednar commented Nov 29, 2024

I learnt how to use Panel by running the panel examples command, which has since been removed; it downloaded all the example notebooks locally. I thought it was very useful.

I always thought that was a good approach to get people started. It mainly applies to people genuinely diving in, rather than getting some tip or pointer on a specific problem, but for that case, it was a solid way to get everything ready to go.

I have a different view. I don't think notebooks are going anywhere in the Python space (I can back this up with data if you want),

There are at least ten million notebooks on Github, and 40 million downloads of the Jupyter extension for VSCode, so I'd think anyone who ignores Jupyter does so at their peril. :-)

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Nov 30, 2024

Regarding notebooks, they're for sure one of this ecosystem's strong selling points, and the flexibility is one of the things that interests me. I'm an owner of a JupyterHub and trying to make it as great as possible.

But I'm very often not able to run Panel or HoloViz in VS Code notebooks, and neither are my colleagues. And there are lots of users having issues using the ecosystem in notebooks too. There are just fewer technical issues when running on a server.

I just don't see people sitting down for hours running notebooks to learn - at least not in a business setting. I actually hoped they would - but that is not what I see.

What I see is that they expect to be able to go into a high quality web site, quickly find the code they are looking for, copy-paste it, create something with a modern look and feel and move on.

For us dataviz is a passion and we love solving complex problems with it. For most other people it's just one of many tools in their busy lives.

@maximlt
Contributor Author

maximlt commented Nov 30, 2024

Regarding bugs. I've contributed bug fixes to Panel for years. Especially for the pyodide powered notebooks I've often felt like I was the only one really trying to make them work. I seldom see or hear about users using them though.

Oh of course I know you contributed a ton of bug fixes! Ok so with "pyodide powered notebooks" I assume you mean those made available to run with JupyterLite. It is true that I don't use them much (easier for me to run things locally) but without any real data we can't know for sure. And my guess is that the vast majority of users who encounter an error don't report it to us.

Regarding "better looking". I just think we are comparing to different ecosystems. I'm not comparing to Pandas, Numpy, Matplotlib, Xarray or Bokeh. I'm comparing to Plotly/Dash, the Shiny/Quarto/Posit ecosystem, Streamlit, Gradio, Marimo, Solara, AnyWidget, Altair/Vega, React, Superset, Power BI, Tableau, Grafana, etc., many of which are also part of larger ecosystems that provide visualization frameworks and different frameworks for exploring data, creating interactive documentation, etc. I'm comparing to them because those are the alternatives I see colleagues or users in the community transition to if they don't find what they need in our ecosystem. And I do see people switching very quickly if there is just a tiny bit of friction or the look and feel is better elsewhere.

One note: most of the tools you mentioned are backed by a startup or larger company; you know it's different for HoloViz and the tools I mentioned (e.g. pandas, xarray, bokeh). So I'm not too surprised the websites look different! If we had some millions to spend, I'm pretty sure we'd finally get some nice landing pages.

I don't think there is suddenly something wrong or just something wrong. There are many things I would like to help improve though.

Thanks for clarifying that, as my impression from this issue and the other one on copier-template-panel-extension was not that (you wrote there "run away from sphinx" in capital letters).

But I'm very often not able to run Panel or HoloViz in VS Code notebooks, and neither are my colleagues. And there are lots of users having issues using the ecosystem in notebooks too. There are just fewer technical issues when running on a server.

I think the Jupyter Notebook experience has improved over the years (still some issues of course, as always). VSCode issues are very unfortunate since there are more and more people using it. This doesn't mean we need to abandon supporting notebooks.

I just don't see people sitting down for hours running notebooks to learn - at least not in a business setting. I actually hoped they would - but that is not what I see.

Maybe not all, but some for sure; we see that when we give the tutorial at SciPy or other places (a share of the audience comes sponsored by their companies).

What I see is that they expect to be able to go into a high quality web site, quickly find the code they are looking for, copy-paste it, create something with a modern look and feel and move on.

Diataxis has helped us articulate this better, separating the learning phase (tutorial, explanation) from the working phase (how-to, reference). examples.holoviz.org is a bit special: I'm not sure it falls into any of these categories (maybe tutorial, but that sounds like a stretch). Instead, it's just a collection of examples showing how the HoloViz tools can be used to support solving analytical problems in various domains. I don't expect users to randomly copy/paste code from there; in fact, I strongly recommend against it! I expect them to get inspired and to gain more confidence that they can adopt these tools in their workflow. For instance, if you work at a geospatial company and see all the examples we have in this domain, that hopefully shows you there's a good chance these tools will be useful to you! This site is always useful for getting funding, making it easy for us (it could be anyone) to demonstrate certain features in certain domains, and for grant organizations to assess the ecosystem. The site also allows us to publish examples showing new features after completing a project; Demetris is currently adding neuroscience examples with features developed thanks to the CZI grant.


Marc, I still encourage you to open another issue somewhere else since I think we diverged quite a bit from the original intent of this one, and that this general topic of docs/notebooks seems very important to you. One approach you could suggest might be to separate HoloViz and Panel, if Panel has constraints (start-up-backed alternatives) that force it to act differently compared to other HoloViz tools.

@maximlt
Contributor Author

maximlt commented Nov 30, 2024

Coming back to the core issue here after this long side discussion :)

I'd like to see a version of 3 as a mode that's quite notebook-like but designed for reading rather than running, and which automatically executes all the cells in the notebook. And yes, shows the code cells, at least optionally. The result should be a nice-looking "live document".

I'm a bit confused by "designed for reading rather than running, and which automatically executes all the cells in the notebook". Isn't that how it already works? As a visitor of a Panel notebook dashboard, you can't decide which cells to run; it gets fully executed when you visit it (at least that's how I think it works).

I don't mind if we do 5, but it seems like we'll always be exposed to a risk of that service disappearing, so it wouldn't be my preferred option.

Well, I guess I could vote for 5, but only if the same service could handle our deployed apps as well. I.e. I'd prefer we have a single system handling both our notebook and our app deployments, so that we have only a single main set of issues to debug and monitor. 2-4 all achieve letting the same system handle all notebooks and apps, and I wouldn't want 5 unless it too could handle all of them.

Project Pythia lets its visitors run the cookbooks on Binder. Ignoring any potential technical hurdle, that's an option I'd consider too, as it's free for us, well-known, and doesn't require any sign-in. Finding a single deployment solution that is a good fit for our two use cases (panel apps and notebooks) might be too constraining, even if I agree with you that it'd be nice to have!

@droumis
Contributor

droumis commented Dec 5, 2024

Another suggestion:

From any of our notebook links like https://github.com/holoviz-topics/examples/blob/main/walker_lake/walker_lake.ipynb, take the part starting with holoviz-topics and add it to https://colab.research.google.com/github/ to get

https://colab.research.google.com/github/holoviz-topics/examples/blob/main/walker_lake/walker_lake.ipynb

This results in a direct link to open a runnable notebook in colab.

see this for info.
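
The URL rewrite described above is mechanical enough to script, for instance to generate an "open in Colab" link for every example. A minimal sketch, where `colab_url` is a hypothetical helper and the only assumption beyond the text above is that the input is a github.com link ending in `.ipynb`:

```python
from urllib.parse import urlparse

COLAB_PREFIX = "https://colab.research.google.com/github"


def colab_url(github_url: str) -> str:
    """Turn a github.com notebook URL into the matching 'open in Colab' URL."""
    parsed = urlparse(github_url)
    if parsed.netloc != "github.com" or not parsed.path.endswith(".ipynb"):
        raise ValueError(f"not a GitHub notebook URL: {github_url}")
    # parsed.path already starts with "/<org>/<repo>/blob/...", so append it as-is.
    return COLAB_PREFIX + parsed.path


print(colab_url(
    "https://github.com/holoviz-topics/examples/blob/main/walker_lake/walker_lake.ipynb"
))
```

For the walker_lake notebook this prints the same Colab link as the one given above. Note that it only produces the link; it says nothing about whether the notebook's dependencies or data are available once opened.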

@maximlt
Contributor Author

maximlt commented Dec 5, 2024

@droumis Can you build an environment from anaconda-project's lock file in Colab? If yes, great! If not, strong -1 for me!

@droumis
Contributor

droumis commented Dec 5, 2024

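Not sure, but this seemed to work in Colab:

```python
# Run in a Colab cell. condacolab.install() restarts the kernel, so run the
# conda command in a new cell once the restart has finished.
!pip install -q condacolab
import condacolab
condacolab.install()
!conda install anaconda-project
```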

I guess it's also a question of replacing anaconda-project with pixi and whether pixi would work on colab

@maximlt
Contributor Author

maximlt commented Dec 5, 2024

I see you made changes to link to Colab in #454. I'm not thrilled about it tbh, as I've spent too much time trying to rationalize and fix the whole infra here; it was quite a mess, with projects implemented differently (some had no lock file). Not too long ago we were not even merging any new projects (Grabcut, Stable diffusion) as there was no guarantee that what was on the repo/site/deployment was in sync. It's getting better and I would like to avoid reintroducing differences between projects. And it's not just for my own sanity: if we can't install the right dependencies for a project where it's going to run, we can't ensure a good user experience (e.g. on Colab it may work well now as the dependencies are all the latest, but in 6 months they will be different enough to cause warnings, and in a year something will break).

I'd change my mind if we find there's a reasonable way on Colab to: install from a lock, download the datasets, and monitor that it works as expected over time.
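
For the "monitor that it works as expected over time" part, one cheap check is to compare what is actually installed in the runtime against the versions pinned in the lock. The sketch below is an assumption-heavy illustration rather than an existing tool: it assumes conda-style `- name=version=build` pins, only inspects a hand-picked watch list (conda names don't always match Python distribution names), and `parse_pins`, `installed_version` and `drift` are hypothetical helpers.

```python
import re
from importlib import metadata
from typing import Callable, Optional

# Assumption: pins look like "- name=version=build", as in a conda-style lock.
# If a package is pinned for several platforms, the last occurrence wins.
PIN = re.compile(r"^\s*-\s*([A-Za-z0-9_.\-]+)=([^=\s]+)", re.MULTILINE)


def parse_pins(lock_text: str) -> dict[str, str]:
    """Map package name -> pinned version from conda-style lock lines."""
    return {name.lower(): version for name, version in PIN.findall(lock_text)}


def installed_version(name: str) -> Optional[str]:
    """Version of an installed Python distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def drift(
    lock_text: str,
    watch: list[str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> dict[str, tuple[str, Optional[str]]]:
    """Return {name: (pinned, installed)} for watched packages that differ."""
    pins = parse_pins(lock_text)
    report = {}
    for name in watch:
        pinned = pins.get(name.lower())
        found = lookup(name)
        if pinned is not None and found != pinned:
            report[name] = (pinned, found)
    return report
```

An empty result for a watch list such as `["panel", "holoviews", "datashader"]` would mean the runtime still matches the lock for those packages; anything else is an early warning before users start seeing errors.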
