Skip to content

SQLite Primary schema (defunct)

Sevoris Doe edited this page Oct 17, 2023 · 1 revision

Primary table

The core table of Norgberg stores all the files in all the workspaces as relative file path primary keys. File paths are stored relative to the workspace folder name, in the form workspace/path/to/file.norg. There must be collision-free. A further column stores just the file name file.norg. This may not be collision-free, in which case you have do defer to resolving using the file path. The basic table also stores timestamps about the time when the file was first ingested and when it was last ingested.

Full schema:

CREATE TABLE notesPrimary (
    filePath       TEXT    PRIMARY KEY ON CONFLICT ABORT
                           NOT NULL,
    fileName       TEXT    NOT NULL,
    firstSeen      INTEGER NOT NULL,
    lastUpdated    INTEGER NOT NULL,
    fileHash       TEXT    NOT NULL,
    fileMetaBody   TEXT,
    fileTextBody   TEXT,
    fileWordLength INTEGER,
    fileMetaID     TEXT    UNIQUE
                           NOT NULL
)
WITHOUT ROWID;
  • "filePath" - the workspace-root relative file path to the file, primary key
  • "fileName" - the pure name of the file.
  • "fistSeen" - timestamp of when the file was first seen for the current entry, in UNIX seconds
  • "fileUpdated" - last time the file state was updated by the system, in UNIX seconds
  • "fileHash" - either an MD5 or AES256 hashing for the file sync system
  • "fileTextBody" - note content, without leading metadata block
  • "fileMetaBody" - metadata header of the file.
  • "fileWordLenght" - the amount of words in the file.
  • "fileMetaID" - UUID type 1 key which can be used to link other files without need to propagate the primary key. Note that this UUID will not be all-time unique. Whenever a note is newly created, and be that because the note had to be dropped and re-cached due to being moved and edited, a new UUID will be assigned.

Any further information is stored in other tables that reference either the primary key of the table, or the metaID.

The primary table is not generally accessible for insert or update queries to protect the core integrity of the database. Updates to the table are rendered through the population system, which receives updates about the creation, movement, and deletion of files. You can implement your own population methods, in which case you can call these methods too.

Workspace roots table

This table stores the absolute file paths of each workspace root folder, along with the workspace name and a device name. This is mainly auxilery information for reconstructing absolute paths while working against the database only. The device schema may change to ensure that the database is integral under syncs against different devices.

Virtual File table

The virtual file table is a table that stores the primary keys of files whose existence is implied by the existence of file links without a valid file link resolution.

CREATE TABLE virtualFiles (
    fileName       TEXT    PRIMARY KEY ON CONFLICT ABORT
                           NOT NULL,
    fileFirstSeen  INTEGER NOT NULL,
    fileLastUpdate INTEGER NOT NULL,
    fileMetaID     TEXT    NOT NULL
)
WITHOUT ROWID;

URI table

The URI table stores any URL link targets in the document.

CREATE TABLE uriTargets (
    uriTarget TEXT    PRIMARY KEY ON CONFLICT ABORT
                      NOT NULL,
    firstSeen INTEGER NOT NULL,
    lastSeen  INTEGER NOT NULL,
    metaID    TEXT    NOT NULL
)
WITHOUT ROWID;

Files table

The Files table stores links to any non-norg file targets, using their file path specification.

CREATE TABLE textFiles (
    pathToFile  TEXT    PRIMARY KEY ON CONFLICT ABORT
                        NOT NULL,
    fileName    TEXT    NOT NULL,
    firstSeen   INTEGER NOT NULL,
    lastUpdated INTEGER NOT NULL,
    metaID      TEXT    NOT NULL
)
WITHOUT ROWID;

Cookie tables

The cookie system maintains its own tables. Modules have their name as their primary key - to help in collision avoidance, this can be the repository name in front of the module name as well, i.e SevorisDoe-Norgberg. The other information stored is the semantic version number. A second table stores for each plugin's primary key the names of the tables the plugin has created in a 1-to-n relationship.

Table names for plugins

Table names for modules follow the convention of moduleName_tablename to insure identification.

Cached information

A selection of core tables are maintained automatically, caching information.

Link resolution

Link resolutions are cached separately and stored here. Link resolutions are modelled as property graph edges in a table. The schema is:

CREATE TABLE concreteFileLinks (
    uuid            TEXT    PRIMARY KEY
                            NOT NULL,
    [from]          TEXT    REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                                 ON UPDATE CASCADE
                            NOT NULL,
    [to]            TEXT    REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                                 ON UPDATE CASCADE
                            NOT NULL,
    originLine      INTEGER NOT NULL,
    originCharStart INTEGER NOT NULL,
    originCharEnd   INTEGER NOT NULL,
    linkType        TEXT    NOT NULL,
    linkSpec        TEXT
)
WITHOUT ROWID;
  • "id" - UUID Type 1 (random plus timestamp) primary key.
  • "from" - foreign key file path of the file which includes the link specification
  • "to" - foreign key file path to which the file points
  • "type" - a string which stores the type of resolved link
  • "from-line", "from-line-start", "from-line-end" - three integers that store where the link is located in the originator file
  • "to-spec" - a string encoding information about where exactly in the file the link points (i.e. a heading or other property)

Virtual links

These are links for which no actual destination file exists in the workspace. They are targeted against so-called "virtual files". Note that virtual files have no path spec and as such, are all assumed to resolve against the same one file.

CREATE TABLE virtualFileLinks (
    uuid            TEXT    PRIMARY KEY
                            NOT NULL,
    [from]          TEXT    REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                                 ON UPDATE CASCADE
                            NOT NULL,
    [to]            TEXT    REFERENCES virtualFiles (fileMetaID) ON DELETE CASCADE
                                                                 ON UPDATE CASCADE
                            NOT NULL,
    originLine      INTEGER NOT NULL,
    originCharStart INTEGER NOT NULL,
    originCharEnd   INTEGER NOT NULL,
    linkType        TEXT    NOT NULL,
    linkSpec        TEXT
)
WITHOUT ROWID;
  • "id" - UUID Type 1 (random plus timestamp) primary key.
  • "from" - foreign key file path of the file which includes the link specification
  • "to" - foreign key virtual file name
  • "type" - a string which stores the type of resolved link
  • "from-line", "from-line-start", "from-line-end" - three integers that store where the link is located in the originator file

unclear if this needs a to-spec. Since the file links are vitual, there is no need to cache any resolution information for faster information retrieval.

File system links

File system links are stored here for quick access.

CREATE TABLE textFilesLink (
    uuid            TEXT    PRIMARY KEY
                            NOT NULL,
    [from]                  REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                                 ON UPDATE CASCADE
                            NOT NULL,
    [to]                    REFERENCES textFiles (metaID) ON DELETE CASCADE
                                                          ON UPDATE CASCADE
                            NOT NULL,
    originLine      INTEGER NOT NULL,
    originCharStart INTEGER NOT NULL,
    originCharEnd   INTEGER NOT NULL,
    linkSpec        TEXT
)
WITHOUT ROWID;

URI Links

Resolves links to URI targets

CREATE TABLE uriTargetLinks (
    uuid            TEXT    PRIMARY KEY
                            NOT NULL,
    [from]                  REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                                 ON UPDATE CASCADE
                            NOT NULL,
    [to]                    REFERENCES uriTargets (metaID) ON DELETE CASCADE
                                                           ON UPDATE CASCADE
                            NOT NULL,
    originLine      INTEGER NOT NULL,
    originCharStart INTEGER NOT NULL,
    originCharEnd   INTEGER NOT NULL,
    linkSpec        TEXT
)
WITHOUT ROWID;

Indexing

Both the "from" and "to" colums are indexed to aid in both link look-up and empower backlink retrieval.

Metadata

There likely will be sets of metadata such as when the link was resolved last.

Regarding internal links

It may be worthwile storing file-internal links. If so, these are modelled as part of the general table, going from the same note to the same note and having internal type resolution.

Task table

The task table stores all the tasks found in the workspace for quick access.

Tasks are stored in at least two tables. One table stores the tasks and assigns them a parsing-stable UUID - other tables store the extensions.

CREATE TABLE taskListing (
    fileID             REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                            ON UPDATE CASCADE
                       NOT NULL,
    lineNumber INTEGER NOT NULL,
    taskLevel  INTEGER NOT NULL,
    taskID     TEXT    NOT NULL,
    PRIMARY KEY (
        fileID,
        lineNumber
    )
    ON CONFLICT ABORT
);

Schema:

  • "fileID" - fereign key on the primary file path
  • "taskID" - UUID Type 1 assigned internally, not persistent
  • "lineNumber", "taskLevel" - two integers storing the task position, allowing for reconstruction of where the task is at

Metadata tables

These tables store a set of metadata for fast querying. These are kept modular separate from the primary table to allow easier schema modifications or extensions if need be.

Title and description table

CREATE TABLE filesMeta (
    file                 PRIMARY KEY
                         REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                              ON UPDATE CASCADE
                         NOT NULL,
    metaTitle       TEXT,
    metaDescription TEXT
)
WITHOUT ROWID;
  • "fileMetaTitle" - the internal metadata title property, if it exists.
  • "fileMetaDescription" - the internal metadata description, if it exists.

Categories table

Stores for each note, what categories it is assigned to (if any). Since one note may belong to n categories, appropiate relationships apply.

CREATE TABLE filesMetaCategories (
    fileID        REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
                                                       ON UPDATE CASCADE
                  NOT NULL,
    Category TEXT
);

Full-text indexing tables

SQLite FTS5 is being used to provide full-text search capabilities on the tables. Since this uses virtual tables, they are populated separately.

The file text follows the previously established schema:

  • "filePath" - the workspace-root relative file path to the file, primary key
  • "fileName" - the pure name of the file.
  • "fileTextBody" - note content, without leading metadata block
  • "fileMetaBody" - metadata header of the file. possibly (these may be unnessecary versus just querying the metaBody content with appropiate structure spec in query)
  • "fileMetaTitle" - the internal metadata title property, if it exists.
  • "fileMetaDescription" - the internal metadata description, if it exists.

These are all FTS5-indexed and can thus be searched as full-text by specifying the appropriate columns on queries.

Tokenizer choice: to be determined. Trigram tokenizers may be worth considering, depending on.