-
Notifications
You must be signed in to change notification settings - Fork 3
SQLite Primary schema (defunct)
The core table of Norgberg stores all the files in all the workspaces as relative file path primary keys. File paths are stored relative to the workspace folder name, in the form workspace/path/to/file.norg
. There must be collision-free. A further column stores just the file name file.norg
. This may not be collision-free, in which case you have do defer to resolving using the file path. The basic table also stores timestamps about the time when the file was first ingested and when it was last ingested.
Full schema:
CREATE TABLE notesPrimary (
filePath TEXT PRIMARY KEY ON CONFLICT ABORT
NOT NULL,
fileName TEXT NOT NULL,
firstSeen INTEGER NOT NULL,
lastUpdated INTEGER NOT NULL,
fileHash TEXT NOT NULL,
fileMetaBody TEXT,
fileTextBody TEXT,
fileWordLength INTEGER,
fileMetaID TEXT UNIQUE
NOT NULL
)
WITHOUT ROWID;
- "filePath" - the workspace-root relative file path to the file, primary key
- "fileName" - the pure name of the file.
- "fistSeen" - timestamp of when the file was first seen for the current entry, in UNIX seconds
- "fileUpdated" - last time the file state was updated by the system, in UNIX seconds
- "fileHash" - either an MD5 or AES256 hashing for the file sync system
- "fileTextBody" - note content, without leading metadata block
- "fileMetaBody" - metadata header of the file.
- "fileWordLenght" - the amount of words in the file.
- "fileMetaID" - UUID type 1 key which can be used to link other files without need to propagate the primary key. Note that this UUID will not be all-time unique. Whenever a note is newly created, and be that because the note had to be dropped and re-cached due to being moved and edited, a new UUID will be assigned.
Any further information is stored in other tables that reference either the primary key of the table, or the metaID.
The primary table is not generally accessible for insert or update queries to protect the core integrity of the database. Updates to the table are rendered through the population system, which receives updates about the creation, movement, and deletion of files. You can implement your own population methods, in which case you can call these methods too.
This table stores the absolute file paths of each workspace root folder, along with the workspace name and a device name. This is mainly auxilery information for reconstructing absolute paths while working against the database only. The device schema may change to ensure that the database is integral under syncs against different devices.
The virtual file table is a table that stores the primary keys of files whose existence is implied by the existence of file links without a valid file link resolution.
CREATE TABLE virtualFiles (
fileName TEXT PRIMARY KEY ON CONFLICT ABORT
NOT NULL,
fileFirstSeen INTEGER NOT NULL,
fileLastUpdate INTEGER NOT NULL,
fileMetaID TEXT NOT NULL
)
WITHOUT ROWID;
The URI table stores any URL link targets in the document.
CREATE TABLE uriTargets (
uriTarget TEXT PRIMARY KEY ON CONFLICT ABORT
NOT NULL,
firstSeen INTEGER NOT NULL,
lastSeen INTEGER NOT NULL,
metaID TEXT NOT NULL
)
WITHOUT ROWID;
The Files table stores links to any non-norg file targets, using their file path specification.
CREATE TABLE textFiles (
pathToFile TEXT PRIMARY KEY ON CONFLICT ABORT
NOT NULL,
fileName TEXT NOT NULL,
firstSeen INTEGER NOT NULL,
lastUpdated INTEGER NOT NULL,
metaID TEXT NOT NULL
)
WITHOUT ROWID;
The cookie system maintains its own tables. Modules have their name as their primary key - to help in collision avoidance, this can be the repository name in front of the module name as well, i.e SevorisDoe-Norgberg
. The other information stored is the semantic version number. A second table stores for each plugin's primary key the names of the tables the plugin has created in a 1-to-n relationship.
Table names for modules follow the convention of moduleName_tablename
to insure identification.
A selection of core tables are maintained automatically, caching information.
Link resolutions are cached separately and stored here. Link resolutions are modelled as property graph edges in a table. The schema is:
CREATE TABLE concreteFileLinks (
uuid TEXT PRIMARY KEY
NOT NULL,
[from] TEXT REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
[to] TEXT REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
originLine INTEGER NOT NULL,
originCharStart INTEGER NOT NULL,
originCharEnd INTEGER NOT NULL,
linkType TEXT NOT NULL,
linkSpec TEXT
)
WITHOUT ROWID;
- "id" - UUID Type 1 (random plus timestamp) primary key.
- "from" - foreign key file path of the file which includes the link specification
- "to" - foreign key file path to which the file points
- "type" - a string which stores the type of resolved link
- "from-line", "from-line-start", "from-line-end" - three integers that store where the link is located in the originator file
- "to-spec" - a string encoding information about where exactly in the file the link points (i.e. a heading or other property)
These are links for which no actual destination file exists in the workspace. They are targeted against so-called "virtual files". Note that virtual files have no path spec and as such, are all assumed to resolve against the same one file.
CREATE TABLE virtualFileLinks (
uuid TEXT PRIMARY KEY
NOT NULL,
[from] TEXT REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
[to] TEXT REFERENCES virtualFiles (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
originLine INTEGER NOT NULL,
originCharStart INTEGER NOT NULL,
originCharEnd INTEGER NOT NULL,
linkType TEXT NOT NULL,
linkSpec TEXT
)
WITHOUT ROWID;
- "id" - UUID Type 1 (random plus timestamp) primary key.
- "from" - foreign key file path of the file which includes the link specification
- "to" - foreign key virtual file name
- "type" - a string which stores the type of resolved link
- "from-line", "from-line-start", "from-line-end" - three integers that store where the link is located in the originator file
unclear if this needs a to-spec. Since the file links are vitual, there is no need to cache any resolution information for faster information retrieval.
File system links are stored here for quick access.
CREATE TABLE textFilesLink (
uuid TEXT PRIMARY KEY
NOT NULL,
[from] REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
[to] REFERENCES textFiles (metaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
originLine INTEGER NOT NULL,
originCharStart INTEGER NOT NULL,
originCharEnd INTEGER NOT NULL,
linkSpec TEXT
)
WITHOUT ROWID;
Resolves links to URI targets
CREATE TABLE uriTargetLinks (
uuid TEXT PRIMARY KEY
NOT NULL,
[from] REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
[to] REFERENCES uriTargets (metaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
originLine INTEGER NOT NULL,
originCharStart INTEGER NOT NULL,
originCharEnd INTEGER NOT NULL,
linkSpec TEXT
)
WITHOUT ROWID;
Both the "from" and "to" colums are indexed to aid in both link look-up and empower backlink retrieval.
There likely will be sets of metadata such as when the link was resolved last.
It may be worthwile storing file-internal links. If so, these are modelled as part of the general table, going from the same note to the same note and having internal type resolution.
The task table stores all the tasks found in the workspace for quick access.
Tasks are stored in at least two tables. One table stores the tasks and assigns them a parsing-stable UUID - other tables store the extensions.
CREATE TABLE taskListing (
fileID REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
lineNumber INTEGER NOT NULL,
taskLevel INTEGER NOT NULL,
taskID TEXT NOT NULL,
PRIMARY KEY (
fileID,
lineNumber
)
ON CONFLICT ABORT
);
Schema:
- "fileID" - fereign key on the primary file path
- "taskID" - UUID Type 1 assigned internally, not persistent
- "lineNumber", "taskLevel" - two integers storing the task position, allowing for reconstruction of where the task is at
These tables store a set of metadata for fast querying. These are kept modular separate from the primary table to allow easier schema modifications or extensions if need be.
CREATE TABLE filesMeta (
file PRIMARY KEY
REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
metaTitle TEXT,
metaDescription TEXT
)
WITHOUT ROWID;
- "fileMetaTitle" - the internal metadata title property, if it exists.
- "fileMetaDescription" - the internal metadata description, if it exists.
Stores for each note, what categories it is assigned to (if any). Since one note may belong to n categories, appropiate relationships apply.
CREATE TABLE filesMetaCategories (
fileID REFERENCES notesPrimary (fileMetaID) ON DELETE CASCADE
ON UPDATE CASCADE
NOT NULL,
Category TEXT
);
SQLite FTS5 is being used to provide full-text search capabilities on the tables. Since this uses virtual tables, they are populated separately.
The file text follows the previously established schema:
- "filePath" - the workspace-root relative file path to the file, primary key
- "fileName" - the pure name of the file.
- "fileTextBody" - note content, without leading metadata block
- "fileMetaBody" - metadata header of the file. possibly (these may be unnessecary versus just querying the metaBody content with appropiate structure spec in query)
- "fileMetaTitle" - the internal metadata title property, if it exists.
- "fileMetaDescription" - the internal metadata description, if it exists.
These are all FTS5-indexed and can thus be searched as full-text by specifying the appropriate columns on queries.
Tokenizer choice: to be determined. Trigram tokenizers may be worth considering, depending on.