Create new documents and make them collaborative.
⚠ This example only works on JupyterLab v3.6 or higher
Before starting this guide, it is strongly recommended to read the JupyterLab documentation, specifically the Documents section.
In JupyterLab, a document is a type of file stored on disk. These files are represented in the frontend by a Context, which is the bridge between the file metadata (like the file path) and its model (the actual document content). The document IModel represents the file content, while IDocumentWidget is a view of the model.
Developers can provide new extensions to support additional documents (or replace existing implementations). A document is defined by three elements:
- A model: it contains and handles the document content. It should implement the IModel interface. In particular, it should hold the document content in a sharedModel, so you will need to define two new model classes.
- A view: it defines a Widget to be included in the frontend for the user to view and interact with the document.
- A file type: this is a dictionary of attributes allowing JupyterLab to map a file or content with a given MIME type to the available model(s) and view(s).
The model, the shared model and the view will be provided through new factories, and the file type will be registered directly.
For that, you will need to access the DocumentRegistry to register new FileTypes, models and views. This way, when opening a new file, the DocumentManager will look into the file metadata and create an instance of Context with the right model for this file. To register new documents, you can create factories: an IModelFactory for the model and/or an IWidgetFactory for the view.
The shared model needs to be registered only if your file must be collaborative. For that, you will need to register it through the ICollaborativeDrive token provided by the @jupyter/docprovider package.
Packaging note: when using an optional external extension (here @jupyter/docprovider from jupyter-collaboration), you must tell JupyterLab to include that package in the current extension by adding the following configuration in package.json:
// package.json#L108-L113
"sharedPackages": {
  "@jupyter/docprovider": {
    "bundled": true,
    "singleton": true
  }
}
Factories are objects meant to create instances of the suitable widget/model given a file. For example, when the DocumentManager detects that the file is a notebook, it uses the notebook widget factory to create a new instance of NotebookPanel. Likewise, if you want to provide a new IModel (model) or IDocumentWidget (view) for a specific file type, you have to create a factory and register it with the DocumentRegistry. When registering a factory, you tell the DocumentManager that you added a new model or widget for a specific file type. The DocumentManager will then use those factories to create instances of the new model or view.
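The factory lookup described above can be pictured with a small conceptual sketch. It is plain Python, not the JupyterLab API: ToyRegistry, its register and open methods, and the string returned as a "view" are all hypothetical stand-ins used only to show how a registry picks factories by file extension.

```python
import os
from typing import Callable, Dict, Tuple


class ToyRegistry:
    """Conceptual stand-in for a document registry (NOT the JupyterLab API)."""

    def __init__(self) -> None:
        # file extension -> (model factory, widget factory)
        self._factories: Dict[str, Tuple[Callable, Callable]] = {}

    def register(self, extension: str, model_factory: Callable, widget_factory: Callable) -> None:
        self._factories[extension] = (model_factory, widget_factory)

    def open(self, path: str):
        # Pick the factories from the file extension, then build model and view.
        extension = os.path.splitext(path)[1]
        model_factory, widget_factory = self._factories[extension]
        model = model_factory(path)
        return widget_factory(model)


registry = ToyRegistry()
registry.register(
    ".example",
    lambda path: {"path": path, "content": ""},      # hypothetical model
    lambda model: f"ExampleView({model['path']})",   # hypothetical view
)
opened = registry.open("cube.example")
```

In this toy version the manager role is folded into the open method; in JupyterLab the DocumentManager and DocumentRegistry are separate objects.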
The easiest way of creating a new widget factory is to extend ABCWidgetFactory<T, U> and override its createNewWidget method. createNewWidget is called to create a new IDocumentWidget for a given file, usually an instance of a class extending DocumentWidget.
If you inherit from DocumentWidget, it needs the document context and a widget, content. The content is the main sub-widget of the DocumentWidget (you can find more information in the section on the Document Widget).
For example, in this extension:
// src/factory.ts#L31-L38
protected createNewWidget(
  context: DocumentRegistry.IContext<ExampleDocModel>
): ExampleDocWidget {
  return new ExampleDocWidget({
    context,
    content: new ExamplePanel(context)
  });
}
On the other hand, to create a model factory, you need to implement the interface IModelFactory<T>, specifying the name of your model, its content type and its format.
For example, in this extension, we define ExampleDocModelFactory:
// src/factory.ts#L44-L45
export class ExampleDocModelFactory
  implements DocumentRegistry.IModelFactory<ExampleDocModel>
with name:
// src/factory.ts#L52-L54
get name(): string {
  return 'example-model';
}
content type:
// src/factory.ts#L61-L63
get contentType(): Contents.ContentType {
  return 'exampledoc' as any;
}
and format:
// src/factory.ts#L70-L72
get fileFormat(): Contents.FileFormat {
  return 'text';
}
At the same time, you need to implement the method createNew. The DocumentManager will call this method when opening a file that uses your custom document model.
// src/factory.ts#L111-L115
createNew(
  options: DocumentRegistry.IModelOptions<ExampleDoc>
): ExampleDocModel {
  return new ExampleDocModel(options);
}
When registering a new document, you first need to know the file type of your new document model. If the file type is already registered, you won't need to register it again: you can register a new DocumentModel for an existing file type. If the file type you want to support is not registered, you will need to register it. To do that, you can use the addFileType API from the DocumentRegistry. The essential arguments are extensions to indicate the extension of the file, fileFormat to specify the data format, and contentType to determine if it is a notebook, file or directory.
// src/index.ts#L59-L66
app.docRegistry.addFileType({
  name: 'example',
  displayName: 'Example',
  mimeTypes: ['text/json', 'application/json'],
  extensions: ['.example'],
  fileFormat: 'text',
  contentType: 'exampledoc' as any
});
Once the file type is registered, you can register the shared model in the collaborative drive:
// src/index.ts#L71-L79
if (drive) {
  const sharedExampleFactory = () => {
    return ExampleDoc.create();
  };
  drive.sharedModelFactory.registerDocumentFactory(
    'exampledoc',
    sharedExampleFactory
  );
}
The token ICollaborativeDrive (named drive above) must be optional, as it is provided by the optional extension jupyter-collaboration. For the new document type to be viewable in a non-collaborative JupyterLab instance, you should support a missing ICollaborativeDrive token; the file will still be viewable and editable, but no collaboration will be enabled.
Then you need to register the model (DocumentModel) for the new file type. The DocumentModel represents the content of the file. For example, JupyterLab has two models registered for the notebook. When you open a notebook with the Notebook editor, the DocumentManager creates an instance of the NotebookModel that loads the notebook as a JSON object and offers a complex API to manage cells and metadata independently (it treats the content of the notebook as structured data). When opening a notebook with the plain text editor, the DocumentManager creates an instance of the base DocumentModel class, which treats the content of the notebook as a string. Note that you can register multiple models for the same file type. However, these models are not kept in sync: when the user opens two editors for the same file that use different models (like opening a notebook with the notebook editor and the plain text editor), the editors will at some point show different content.
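The divergence between two models of the same file can be illustrated with a toy sketch. It is plain Python and does not use any JupyterLab class: structured_model and text_model are hypothetical stand-ins for a structured model and a plain-string model loaded from the same on-disk content.

```python
import json

# Content as it might sit on disk (illustrative only).
disk = '{"cells": [], "metadata": {}}'

structured_model = json.loads(disk)  # stand-in for a structured (JSON) model
text_model = disk                    # stand-in for a plain-string model

# Editing through the structured model does not touch the string model.
structured_model["metadata"]["title"] = "edited"

text_view = json.loads(text_model)
```

Until one of the two models is saved and the other reloaded, text_view still shows the original metadata.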
To register a new DocumentModel, we can use the addModelFactory API from the DocumentRegistry. In this case, we created the model factory without arguments, but you can add any arguments that you need.
// src/index.ts#L82-L83
const modelFactory = new ExampleDocModelFactory();
app.docRegistry.addModelFactory(modelFactory);
The last step is to register the view (DocumentWidget). As with the DocumentModel, you can register a widget for an existing model, or for a new model if the existing ones do not fit your needs. Different widgets using the same model will stay in sync. The DocumentWidget is the view for the model; it is only the layer that allows users to interact with the content of the file.
To register a new DocumentWidget, we can use the addWidgetFactory API from the DocumentRegistry. The main arguments you need to pass to the factory are the widget's name, the name of the model that this widget uses, a list of file types that the widget can open, and the list of file types for which the widget is the default view.
// src/index.ts#L87-L92
const widgetFactory = new ExampleWidgetFactory({
  name: FACTORY,
  modelName: 'example-model',
  fileTypes: ['example'],
  defaultFor: ['example']
});
// src/index.ts#L104-L104
app.docRegistry.addWidgetFactory(widgetFactory);
The DocumentWidget is the view that will open when opening the file. The DocumentWidget contains five main attributes:
- context: The context is the bridge between the file on disk and its content. It includes all the information about the file and some methods to handle the file. Other attributes you can find in the context are the DocumentModel and the sessionContext, which handles the communication with the backend.
- title: Handles the content of the tab.
- toolbar: The editor's toolbar, where you can add different widgets to trigger actions on the document.
- contentHeader: A panel between the toolbar and the main content area. You can see this header as a second toolbar or as a notification area.
- content: The main area of the DocumentWidget, where you add the view for your document.
The DocumentModel represents the file content in the frontend. Through the model, you can listen to changes in the state of the file, such as its metadata or properties like dirty (which indicates that the content differs from disk), and you can modify and listen to changes on the content. The main methods on the DocumentModel are toString and fromString: every file except the notebook is loaded from and saved to disk as a string using these methods.
For non-textual files, the string contains a base64 representation of the file content.
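As a rough illustration of the base64 representation mentioned above, here is a standard-library Python sketch. The helper names to_string and from_string are hypothetical and only mirror the idea of toString and fromString; they are not part of any JupyterLab API.

```python
import base64


def to_string(raw: bytes) -> str:
    """Encode binary content as a base64 string (hypothetical helper)."""
    return base64.b64encode(raw).decode("ascii")


def from_string(text: str) -> bytes:
    """Decode a base64 string back to the original bytes (hypothetical helper)."""
    return base64.b64decode(text)


# Sample binary data: the first four bytes of a PNG file signature.
raw = bytes([0x89, 0x50, 0x4E, 0x47])
encoded = to_string(raw)
decoded = from_string(encoded)
```

The round trip is lossless, which is what lets a string-based model carry binary file content.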
In JupyterLab v3.1, we introduced the package @jupyterlab/shared-models to replace ModelDB as the data storage and make notebooks collaborative. We implemented these shared models using Yjs, a high-performance CRDT for building collaborative applications that sync automatically. You can find more details in the Yjs documentation.
Yjs documents (Y.Doc) are the main class of Yjs. They represent a shared document between clients and hold multiple shared objects. Yjs documents enable you to share different data types like text, Array, Map or set, which makes it possible to create not only collaborative text editors but also diagrams, drawings, and more.
To sync content between clients, Yjs uses providers. Providers abstract Yjs from the network technology your application uses. They sync Yjs documents through a communication protocol or a database. Most providers share the concept of room names to connect Yjs documents. In JupyterLab, we created a package called @jupyterlab/docprovider with a WebSocket provider that syncs documents through a new endpoint (api/yjs) in the JupyterLab server.
Another critical component of Yjs is Awareness. Every Yjs document has an awareness attribute that enables you to share user information such as name, cursor or mouse pointer position. The awareness state doesn't persist across sessions. Instead, Yjs uses a tiny state-based Awareness CRDT that propagates JSON objects to all users. When you go offline, your awareness state is automatically deleted and all users are notified that you went offline.
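The awareness behaviour described above can be pictured as a map from client id to a JSON-like state. The sketch below is a plain Python toy, not the Yjs Awareness implementation: ToyAwareness, apply_remote_state and the sample "user" field are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class ToyAwareness:
    """Toy model of an awareness map: client id -> JSON-like state."""

    def __init__(self, client_id: int) -> None:
        self.client_id = client_id
        self._states: Dict[int, Dict[str, Any]] = {client_id: {}}

    def set_local_state_field(self, field: str, value: Any) -> None:
        # Update one field of the local user's state.
        self._states[self.client_id][field] = value

    def apply_remote_state(self, client_id: int, state: Optional[Dict[str, Any]]) -> None:
        # A state of None models a client going offline: its entry is removed.
        if state is None:
            self._states.pop(client_id, None)
        else:
            self._states[client_id] = dict(state)

    def get_states(self) -> Dict[int, Dict[str, Any]]:
        return dict(self._states)


awareness = ToyAwareness(client_id=1)
awareness.set_local_state_field("mouse", {"x": 5, "y": 7})
awareness.apply_remote_state(2, {"user": "alice"})
states_online = awareness.get_states()
awareness.apply_remote_state(2, None)  # client 2 goes offline
states_after = awareness.get_states()
```

Nothing here is persisted: the map only reflects who is connected at the moment.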
After a short explanation of Yjs' features, now it's time to start with the implementation.
First, you need to define your document data structure. For a text file, it is as
simple as a single string. But as mentioned earlier, it could be more complex. Here
the document represents a cube by its position and text content. It is stored as a JSON object with two keys: a position and a content.
Although the position is itself a dictionary with keys x and y, we decided to store
both pieces of information as strings (i.e. position requires parsing to be converted into a dictionary).
That choice is purely at the discretion of the developer. Using a stringified
dictionary is fine if most of the dictionary changes every time. But if most of
the dictionary stays constant (especially if the whole dictionary is large), it would
be more performant to use a shared dictionary instead of a shared string.
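The trade-off between a stringified dictionary and a shared dictionary can be sketched with plain Python dicts standing in for shared objects. This is only an illustration of the two layouts; move_stringified and move_nested are hypothetical helpers, and no Yjs type is involved.

```python
import json

# Layout used in this example: the position is kept as one JSON string.
stringified = {"position": json.dumps({"x": 10, "y": 20}), "content": "cube"}


def move_stringified(doc: dict, x: int) -> None:
    # The whole string must be parsed, changed and written back,
    # so every update replaces the complete value.
    position = json.loads(doc["position"])
    position["x"] = x
    doc["position"] = json.dumps(position)


# Alternative layout: the position is a nested dictionary.
nested = {"position": {"x": 10, "y": 20}, "content": "cube"}


def move_nested(doc: dict, x: int) -> None:
    # Only the changed key is touched; the other keys stay as they are.
    doc["position"]["x"] = x


move_stringified(stringified, 42)
move_nested(nested, 42)
```

With a real shared type, the second layout would let an update carry only the changed key, which matters when the dictionary is large and mostly constant.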
You can create a new shared model by extending YDocument<T>. YDocument is a generic implementation of a shared model that handles the initialization of the YDoc and already implements some functionality, like the change history.
In this extension, we created:
// src/model.ts#L354-L354
export class ExampleDoc extends YDocument<ExampleDocChange> {
To create a new shared object, you have to use the ydoc. The new attribute will be linked to the ydoc and synced between the different clients automatically. You can also listen to changes on the shared attributes to propagate them to the DocumentWidget.
// src/model.ts#L358-L359
this._content = this.ydoc.getMap('content');
this._content.observe(this._contentObserver);
A good practice is to implement a data-oriented interface for the YDocument so that the consumer never faces the Yjs machinery directly. For example, here, we provide the helpers get and set to hide the complexity of position being stringified in the shared object.
// src/model.ts#L390-L399
get(key: 'content'): string;
get(key: 'position'): Position;
get(key: string): any {
  const data = this._content.get(key);
  return key === 'position'
    ? data
      ? JSON.parse(data)
      : { x: 0, y: 0 }
    : data ?? '';
}
// src/model.ts#L407-L411
set(key: 'content', value: string): void;
set(key: 'position', value: PartialJSONObject): void;
set(key: string, value: string | PartialJSONObject): void {
  this._content.set(key, key === 'position' ? JSON.stringify(value) : value);
}
To access information about the different connected users, you can use the awareness attribute on the shared model. The awareness keeps the state of every user as a map with the user's id as the key and a JSON object as the value for the state. This is very useful for sharing information such as user cursor positions.
You can add new information to the user's state with the method setLocalStateField and access the state of all users with getStates.
// src/model.ts#L204-L204
this.sharedModel.awareness.setLocalStateField('mouse', pos);
// src/model.ts#L289-L289
const clients = this.sharedModel.awareness.getStates();
To listen for changes on the state of the users, you can use the method on('change', () => {}).
// src/model.ts#L50-L50
this.sharedModel.awareness.on('change', this._onClientChanged);
Every time you modify a shared property, it triggers an event in all the clients to notify them. Sometimes, however, you need to apply a series of modifications as a single transaction, so that the event is triggered only once all the changes have been applied. In this case, you can use the transact method to group all the operations.
// src/model.ts#L233-L236
this.sharedModel.transact(() => {
  this.sharedModel.set('position', { x: obj.x, y: obj.y });
  this.sharedModel.set('content', obj.content);
});
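The effect of grouping changes in a transaction can be sketched with a toy observable map. This is a conceptual Python illustration, not the Yjs or jupyter-ydoc implementation: ToySharedMap and its methods are hypothetical, and nested transactions are not handled.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, List


class ToySharedMap:
    """Toy observable map that can batch change notifications."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._observers: List[Callable[[List[str]], None]] = []
        self._pending: List[str] = []
        self._in_transaction = False

    def observe(self, callback: Callable[[List[str]], None]) -> None:
        self._observers.append(callback)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value
        self._pending.append(key)
        if not self._in_transaction:
            self._flush()  # outside a transaction, notify immediately

    @contextmanager
    def transact(self) -> Iterator[None]:
        self._in_transaction = True
        try:
            yield
        finally:
            self._in_transaction = False
            self._flush()  # one notification for all grouped changes

    def _flush(self) -> None:
        if not self._pending:
            return
        changed, self._pending = self._pending, []
        for callback in self._observers:
            callback(changed)


events: List[List[str]] = []
shared = ToySharedMap()
shared.observe(events.append)

shared.set("content", "a")  # notifies on its own
with shared.transact():
    shared.set("position", '{"x": 1, "y": 2}')
    shared.set("content", "b")  # grouped with the previous change
```

Observers see two notifications in total: one for the standalone change and one for the grouped pair.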
Since JupyterLab 3.6, you also need to define and register a Python version of the shared model. This is a consequence of an architecture change: in 3.6, a machine client is implemented in the Jupyter Server itself. That client is responsible for loading, saving and watching the file on disk, and for propagating all changes to all clients. This makes collaboration much more robust in case of flaky connections, file renames, and so on.
In Python, the Yjs protocol is implemented in the library y-py. Just as we provide @jupyterlab/shared-models helpers for the frontend, we provide jupyter-ydoc helpers for Python.
A shared model must inherit from YBaseDoc, here:
# jupyterlab_examples_documents/document.py#L4-L7
from jupyter_ydoc.ybasedoc import YBaseDoc
class YExampleDoc(YBaseDoc):
The shared map is added to the model like this:
# jupyterlab_examples_documents/document.py#L10-L10
self._content = self._ydoc.get_map('content')
Then the methods get and set must be defined: get serializes the document to the string saved by the contents manager, and set populates the document from the string read by the contents manager:
# jupyterlab_examples_documents/document.py#L16-L49
def get(self) -> str:
    """
    Returns the content of the document as saved by the contents manager.

    :return: Document's content.
    """
    data = json.loads(self._content.to_json())
    position = json.loads(data["position"])
    return json.dumps(
        {
            "x": position["x"],
            "y": position["y"],
            "content": data["content"]
        },
        indent=2
    )

def set(self, raw_value: str) -> None:
    """
    Sets the content of the document from the contents manager read content.

    :param raw_value: The content of the document.
    """
    value = json.loads(raw_value)
    with self._ydoc.begin_transaction() as t:
        # clear document
        for key in self._content:
            self._content.pop(t, key)
        for key in [k for k in self._ystate if k not in ("dirty", "path")]:
            self._ystate.pop(t, key)

        self._content.set(t, "position", json.dumps({"x": value["x"], "y": value["y"]}))
        self._content.set(t, "content", value["content"])
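The conversion between the file on disk (flat x, y and content keys) and the shared map (a stringified position plus a content) can be checked in isolation. The sketch below uses a plain dict as a stand-in for the shared map, so it runs without y-py; shared_to_file and file_to_shared are hypothetical names mirroring the logic of get and set above.

```python
import json
from typing import Dict


def shared_to_file(shared: Dict[str, str]) -> str:
    """Mirror of get: flatten the stringified position into x and y."""
    position = json.loads(shared["position"])
    return json.dumps(
        {"x": position["x"], "y": position["y"], "content": shared["content"]},
        indent=2,
    )


def file_to_shared(raw_value: str) -> Dict[str, str]:
    """Mirror of set: pack x and y into a stringified position."""
    value = json.loads(raw_value)
    return {
        "position": json.dumps({"x": value["x"], "y": value["y"]}),
        "content": value["content"],
    }


raw = '{"x": 3, "y": 4, "content": "cube"}'
shared = file_to_shared(raw)
round_tripped = json.loads(shared_to_file(shared))
```

A round trip through both helpers gives back the same data, which is the property the real get and set need to preserve.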
Finally, you need to define the method observe to register a callback that reacts to document changes:
# jupyterlab_examples_documents/document.py#L51-L60
def observe(self, callback: "Callable[[str, Any], None]") -> None:
    """
    Subscribes to document changes.

    :param callback: Callback that will be called when the document changes.
    """
    self.unobserve()
    self._subscriptions[self._ystate] = self._ystate.observe(partial(callback, "state"))
    self._subscriptions[self._content] = self._content.observe(partial(callback, "content"))
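The use of partial in observe tags each event with the name of the shared object it came from. A minimal standalone sketch with functools.partial shows the mechanism; the event payloads are made-up dictionaries, not real y-py event objects.

```python
from functools import partial
from typing import Any, List, Tuple

received: List[Tuple[str, Any]] = []


def callback(topic: str, event: Any) -> None:
    # The first argument says which shared object changed.
    received.append((topic, event))


# partial pre-fills the topic, leaving a one-argument observer.
on_state = partial(callback, "state")
on_content = partial(callback, "content")

on_content({"content": "hello"})  # made-up event payload
on_state({"dirty": True})         # made-up event payload
```

A single callback can thus serve several shared objects while still knowing the origin of each change.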
That Python model is not used in non-collaborative mode. In that case, the frontend methods fromString and toString of the document model are used to load and save the document on disk.
Your new model is almost ready; the last piece required is the registration of the Python shared model (similar to the registration of the model in the frontend). This is done by adding a Python package entry point:
# pyproject.toml#L29-L30
[project.entry-points.jupyter_ydoc]
exampledoc = "jupyterlab_examples_documents.document:YExampleDoc"
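To check which shared models are registered in a given environment, the entry point group can be inspected with the standard library. This is a small sketch assuming the group name jupyter_ydoc shown above; list_ydoc_entry_points is a hypothetical helper, and the result is simply empty if no package declares that group.

```python
import sys
from importlib.metadata import entry_points
from typing import Dict


def list_ydoc_entry_points(group: str = "jupyter_ydoc") -> Dict[str, str]:
    """Return {entry point name: 'module:object'} for the given group."""
    if sys.version_info >= (3, 10):
        selected = entry_points(group=group)
    else:
        # Older Pythons return a mapping of group name to entry points.
        selected = entry_points().get(group, [])
    return {ep.name: ep.value for ep in selected}


registered = list_ydoc_entry_points()
```

After installing the extension in editable or regular mode, the exampledoc name would be expected to appear among the keys.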