[TOC]
Myba (pronounced: mỹba) is an
open-source, secure, distributed, version-controlled, encrypted
file backup software based on git
,
for Linux, MacOS, BSDs, and possibly even Windows/WSL.
In a world of vice, instability, evergreen browsers, fast-moving markets and near constant supply chain attacks,
it's the best kind of backup utility—a timeless shell script that relies on few, well-tested and stable technologies.
Its only dependencies are:
- a running shell / standard POSIX environment (sh, bash, zsh, dash, ... WSL?),
- gzip
- git (and Git LFS for files sized >40 MB),
- either OpenSSL or GPG for encryption,
all of which everyone should discover most popularly available.
Git does a great job of securely storing and tracking changes and backing up important documents,
it is popular and widely-deployed,
feature-rich,
but it doesn't on its own support encryption, which might be important if the backed-up data
is going to be shared with untrusted (and untrustworthy) third parties
and various intermediary data "processors".
One could most simply set up an encryption-decryption process
consisting of clean
and smudge
git filters issued pre commits and post checkouts,
respectively, but git filters can't encrypt the tracked file paths / filenames,
whereas one might have a want for that, otherwise almost what's the point? 😶
- Version-controlled (git-based) backup of plaintext documents as well as large binary files.
- Automatic text compression for reduced space use.
- Currently using industry-standard quantum-safe strong AES256 encryption of files and paths,
- Familiar git workflow: add (stage), commit, push, clone, pull, checkout ...
- Selective (sparse) checkout of backup files for restoration, efficient size-on-disk overhead.
- Sync to multiple clouds for nearly free by (ab)using popular git hosts.
- Or sync anywhere simply by cloning or checking-out a directory ...
Myba relies on a two-repo solution. On any client, two repositories are created.
One plaintext --bare
repo,
such as in this guide,
with $WORK_TREE
set to the root of your volume of interest,
such as /
or $HOME
(default).
And one encrypted repo that holds encrypted file counterparts.
When you myba commit
some files into the plain repo,
a commit to the encrypted repo is made in the background.
When you myba checkout
, a file is checked out from the
encrypted repo and restored back onto your volume.
When you myba push
your commit history successfully (exit code 0)
to all configured remotes
(any git remote
, such as a special folder or a cloud host),
the local encrypted blobs are deleted to save disk space,
relying on recently-stabilized
git sparse-checkout
and
partial git clone --filter=blob:none
features,
all in all at a minimized and efficient space cost best-suited to backing up
text and configuration files, source code files, documents and pictures,
including all kinds or large binary files
(as much as you can afford to sync to your cloud storage),
all under the assumptions that text files compress well and
that large binaries don't change too often.
Myba is Git + Shell, preconfigured and wrapped as thinly as needed to provide fully encrypted backups that are really easily replicated and synced to the cloud.
<script async src="https://ssl.gstatic.com/trends_nrtr/3826_RC01/embed_loader.js"></script> <script>addEventListener("load", () => window.trends.embed.renderExploreWidgetTo(document.getElementById("trends"), "TIMESERIES", {"comparisonItem":[{"keyword":"/m/02mhh1","geo":"","time":"all"},{"keyword":"/m/05vqwg","geo":"","time":"all"},{"keyword":"/m/0ryppmg","geo":"","time":"all"},{"keyword":"myba","geo":"","time":"all"}],"category":0,"property":""}, {"exploreQuery":"date=all&q=%2Fm%2F02mhh1,%2Fm%2F05vqwg,%2Fm%2F0ryppmg,myba#TIMESERIES","guestPath":"https://trends.google.com:443/trends/embed/"}));</script>- Zero-knowledge cloud sync and storage
- Replace or supplement existing poor, complex, expensive, proprietary solutions (like Veeam, Time Machine, Google One Photos & Drive, iCloud) or software programs with complex, unfamiliar CLI APIs or wide attack surfaces (Bacula, Borg Backup, restic, git-crypt) ...
- Cloud-based serverless virii
- Protocol- and PaaS-agnostic design (AWS to Backblaze B2, GitLab ...). Simply add remote origins or sync (e.g. rsync, rclone) a git folder.
To install everything on a Debian/Ubuntu-based system, run:
# Install dependencies
sudo apt install gzip git git-lfs openssl gpg
# Download and put somewhere on PATH
curl -vL 'https://bit.ly/myba-backup' > ~/.local/bin/myba
chmod +x ~/.local/bin/myba
export PATH="$HOME/.local/bin:$PATH"
myba help
Note, only one of openssl
or gpg
is needed, not both!
It should be similar, if not nearly equivalent, to install on other platforms. Hopefully you will find most dependencies already satisfied.
Please report back if you find / manage to get this working under everything but the above configuration and especially Windows/WSL!
You run the script with arguments according to the usage printout below.
Myba heavily relies on git
and thus its command-line usage largely follows that of git convention.
Most subcommands pass obtained arguments and options ("$@"
) straight to matching git
subcommands!
Usage: myba <subcommand> [options]
Subcommands:
init Initialize repos in $WORK_TREE (default: $HOME)
add [OPTS] PATH... Stage files for backup/version tracking
rm PATH... Stage-remove files from future backups/version control
commit [OPTS] Commit staged changes of tracked files as a snapshot
push [REMOTE] Encrypt and push files to remote repo(s) (default: all)
pull [REMOTE] Pull encrypted commits from a promisor remote
clone REPO_URL Clone an encrypted repo and init from it
remote CMD [OPTS] Manage remotes of the encrypted repo
decrypt [--squash] Reconstruct plain repo commits from encrypted commits
diff [OPTS] Compare changes between plain repo revisions
log [OPTS] Show commit log of the plain repo
checkout PATH... Sparse-checkout and decrypt files into $WORK_TREE
checkout COMMIT Switch files to a commit of plain or encrypted repo
gc Garbage collect, remove synced encrypted packs
git CMD [OPTS] Inspect/execute raw git commands inside plain repo
git_enc CMD [OPTS] Inspect/execute raw git commands inside encrypted repo
Env vars: WORK_TREE, PLAIN_REPO, PASSWORD, USE_GPG, VERBOSE, YES_OVERWRITE, ...
The script also acknowledges a few environment variables which you can set (or export) to steer program behavior:
WORK_TREE=
The root of the volume that contains important documents (such as dotfiles) to back up or restore to. If unspecified,$HOME
.PLAIN_REPO=
The internal directory where myba actually stores both its repositories. Defaults to$WORK_TREE/.myba
but can be overriden to somewhere out-of-tree ...PASSWORD=
The password to use for encryption instead of asking / reading from stdin.USE_GPG=
Myba usesopenssl enc
by default, but if you prefer to use GPG even for symmetric encryption, setUSE_GPG=1
.N_JOBS=
The number of parallel encryption/decryption processes at commit/checkout time. By default: 8.KDF_ITERS=
A sufficient number of iterations is used for the encryption key derivation function. To specify your own value and avoid rainbow table attacks on myba itself, you can customize this value. If you don't know, just leave it.YES_OVERWRITE=
If set, overwrite existing when restoring/checking out files that already exist in $WORK_TREE. The default is to ask instead.VERBOSE=
More verbose output about what the program is doing.
# Set volume root to user's $HOME and export for all further commands
export WORK_TREE="$HOME"
myba init
myba add Documents Photos Etc .dotfile
PASSWORD='secret' myba commit -m "my precious"
myba remote add origin "/media/usb/backup"
myba remote add github "git@github.com:user/my-backup.git"
VERBOSE=1 myba push # Push to all configured remotes & free up disk space
# Somewhere else, much, much later, avoiding catastrophe ...
export WORK_TREE="$HOME"
PASSWORD='secret' myba clone "..." # Clone one of the known remotes
myba checkout ".dotfile" # Restore backed up files in a space-efficient manner
See smoke-test.sh file for a more full example & test case!
The project is written for a POSIX shell and is hosted on GitHub.
The script is considered mostly feature-complete, but there remain bugs and design flaws to be discovered and ironed out, as well as any TODOs and FIXMEs marked in the source. All source code lines are open to discussion. Especially appreciated are targets for simplification and value-added testing.
Is git a good tool for backups?
The inherently core features of git/myba allow you to:
- track a list of important files,
- track all changes made, with authorship info and datums, to any of the tracked files,
- securely store copies of files at each commited snapshot,
- efficiently compress non-binary files,
- apply custom script filters to files based on file extension / glob string match,
- execute custom script hooks at various stages of program lifecycle.
Git is a stable and reliable tool used by millions of people and organizations worldwide, with long and rigorous release / support cycles.
Can git track file owner and permissions etc.?
Git doesn't on its own track file owner and permission changes (other than the executable bit). Files commited by any user are restorable by any user with the right password. In order to restore files with specific file permission bits set, defer to umask, e.g.:
umask 0077 # Restore files with `u=rwx,g=,o=`
WORK_TREE=~ myba checkout .ssh
If you need to restore file owners, file access times and similar metadata, simply write a small shell wrapper that takes care of it. You're encouraged to contrib anything short to the respect you find widely-applicable and useful.
Can we use git for continually changed databases and binary files?
Git saves whole file snapshots and doesn't do any in-file or within-file or across-file deduplication, so it's not well-suited to automatic continual backing up of databases (i.e. large binaries) that change often.
However, while git repositories bloat when commiting such large binary and media files, myba only ever uses sparse-checkout, keeping overhead disk space use to a minimum.
How to influence what files / filetypes to (ignore from) backup?
You stage files and directories for backup with version control as normally, with myba add
.
You can edit $PLAIN_REPO/info/exclude
, which is prepopulated with
default common ignore patterns.
Additionally by inheritance, myba
honors .gitignore files
for any directories that contain them.
You can tweak various other git settings (like
config,
filters,
hooks)
by modifying respective files in $PLAIN_REPO
and (encrypted repo) $PLAIN_REPO/_encrypted/.git
.
Encryption failed. How do I investigate / recover?
Myba constructs encrypted repo commits after successful plain repo commits.
Use myba git
and myba git_enc
subcommands to discover what state you're in (e.g. myba git status
).
Then use something like myba git reset HEAD^ ; myba git_enc reset HEAD
to reach an acceptable state.
If it looks like a bug, please report it. Otherwise git will let you know what the problem is.
Myba only deletes redundant encrypted blobs after successfully pushing to all configured remotes, and never deletes or overwrites existing files in work tree unless forced!