json-meta:// store #9551
Before Nix used SQLite for the Nix store (and after it used Berkeley DB), it used flat files to store metadata (b0e92f6). Apart from performance and disk space overhead, the main problem was ensuring transactional semantics for the referrers mapping (especially needed for garbage collection). But that might not be a problem for some use cases.
I'm not convinced by this argument, since SQLite is one of the most-used and best-tested pieces of software out there. Certainly better tested than an ad hoc metadata store, even if it contains JSON.
Yes, I didn't implement any of that, on purpose. References are just stored "forward" as part of the valid path info JSON. This makes adding new store objects easy / well isolated, and everything else a pain in the ass :). Very intentional!
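To illustrate the trade-off described here: with only forward references on disk, adding a record touches one file, but answering "who refers to this path?" (as garbage collection must) means scanning every record. A small Python sketch, assuming a hypothetical layout of one `<name>.json` file per store object with a `references` list; neither that field name nor the helper `compute_referrers` comes from the PR.

```python
import json
import os
from collections import defaultdict


def compute_referrers(meta_dir: str) -> dict[str, set[str]]:
    """Invert forward references by reading every record (hypothetical layout)."""
    referrers: dict[str, set[str]] = defaultdict(set)
    for entry in sorted(os.listdir(meta_dir)):
        if not entry.endswith(".json"):
            continue  # ignore anything that is not a metadata record
        name = entry[: -len(".json")]
        with open(os.path.join(meta_dir, entry), encoding="utf-8") as f:
            info = json.load(f)
        # Each record only knows what it points at; flip the edge direction.
        for ref in info.get("references", []):
            referrers[ref].add(name)
    return dict(referrers)
```

The cost is linear in the number of records for every query, which is the "pain" accepted in exchange for isolated writes.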
Yes, no hate against SQLite; it is fantastic software. My argument is not that SQLite could be better, but that a higher-performance database of that sort must unavoidably sacrifice having a nice normal form. This, however, does have a normal form (packaged JSON, in order, assuming we didn't screw anything up). That has some nice properties SQLite cannot have. (An in-kernel SQLite, where it was impossible to see the underlying bytes but just use the abstract interface, would also help. Maybe someday we'll have Nix on SQL-supporting mainframes and can do things that way. But this also just kicks the can down to "what is the best way to have secure on-disk representations of kernel data structures", which is a question the likes of dm-verity, https://github.com/project-machine/puzzlefs, squash-fs, etc. are all trying to answer.)
I am hoping to get there little by little via the various "shuffle around the
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2023-12-08-nix-team-meeting-minutes-110/36721/1
This sounds similar to me to the
Seems pretty much the same? I don't like the narinfo format and would rather use JSON, but that's small potatoes.
Cool. I guess a "benefit" of reusing
@arianvp But I would like to have binary caches also use the JSON format :). We can upload both to binary caches for backwards compat. For example, at some point I need to propose what I think is a stronger version of the
Yes, we should allow .narinfo files to be JSON. I.e. if the first character is
That works too, if we are OK with suddenly cutting off old Nix from new objects :) (or rather the read side can be flexible like that, but the write side can do two files. Postel's law type stuff.)
"narinfo" is not descriptive. It contains a bit of nar "info", such as the hash, but the rest is store object info, such as name and references, binary cache info such as file location, and realisation info such as deriver and signatures (if that's not its own category). Query behavior can be modified in the
This can't be the main mechanism, because it's not compatible with existing Nix versions. We should have a transition period where both formats are available. A new file extension helps with that. Wouldn't hurt to have docs (EDIT: of narinfo and the binary cache protocol at the very least) before considering any of this, so everyone can have a good understanding of the domain. EDIT: Consider HTTP
draft #9348 adds some docs
Motivation
Way back in c1a07f9 Nix switched to using SQLite, and for good reason: it makes a lot of operations much faster. But it is still useful to have a simple text file metadata alternative for a few niche use-cases:
Broadcasting: if one wants to serve a store over NFS to a very large number of consumers in a pub-sub manner, it is disadvantageous for all writes to modify the database file. Separate files for separate "rows" avoid that synchronization.
IPFS, the jankier way: no SQLite and no .nar is a quick and dirty way to get a filesystem store representation that a tool like IPFS could mirror pretty well. Of course, I am personally fond of a deeper integration where we do things like persisting the JSON into IPFS's native JSON representation rather than just JSON-in-a-file, but I suppose it is good to not let the perfect be the enemy of the good.
High Security Stores: It is hard to audit the contents of a SQLite database. Even if we can be sure that the SQL program only reads out good data, since it is an opaque binary format there may still be opportunities for steganography. More broadly, database performance is in fundamental tension with restricting to normal forms. A store that has everything in plain text is easy to hand-audit, and therefore better suited to be a secure (albeit slow) store for various purposes.
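A minimal sketch of the per-file idea behind the list above, written in Python rather than Nix's C++. The directory layout, the record fields and the helper `write_path_info` are hypothetical illustrations, not the format this PR implements. Each record goes to its own file via write-to-temp-then-rename, so writers never touch a shared database file, and keys are sorted so a given record has one canonical text form that is easy to diff and audit by hand.

```python
import json
import os
import tempfile


def write_path_info(meta_dir: str, name: str, info: dict) -> str:
    """Write one metadata record to its own JSON file (hypothetical layout)."""
    os.makedirs(meta_dir, exist_ok=True)
    final = os.path.join(meta_dir, name + ".json")
    # Sorted keys and fixed indentation: equal records serialize to equal text.
    text = json.dumps(info, sort_keys=True, indent=2) + "\n"
    # Write to a temporary file in the same directory, then rename it into
    # place, so readers never observe a half-written record. The rename is
    # atomic on POSIX filesystems; behaviour over NFS should be verified.
    fd, tmp = tempfile.mkstemp(dir=meta_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp, final)
    finally:
        # Clean up the temporary file if the rename did not happen.
        if os.path.exists(tmp):
            os.unlink(tmp)
    return final
```

Because no two records share a file, two writers adding different store objects do not contend with each other, at the price of losing cross-record transactions.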
Separate from the feature itself, I also think this is a good exercise to disentangle a store being "local" from a store using SQLite. Having a second tiny implementation ensures we don't start "overfitting" to SQLite in various ways, e.g. it encourages factoring out the parts of LocalStore that don't have to do with SQLite.
This is a little toy store that stores ValidPathInfos and Realisations in JSON format in a separate directory. This is an on-disk format that is very easy to work with (easier than the narinfo line format, I think).
Context
TODO tests, but this is a general problem for more store implementations that we need to solve properly once and for all. E.g. #9429 has the same issue.
Priorities
Add 👍 to pull requests you find important.
CC @RaitoBezarius @flokli @ryantm @danielfullmer