rtmigo/dmk_py

THIS IS EXPERIMENTAL CODE. THE FILE FORMAT MAY CHANGE

dmk stores files, passwords or other private data in an encrypted vault file.

Each entry has a secret name that decrypts it. The name reveals nothing about other entries, not even whether they exist.

No master password. No table of contents. No way to determine the number of entries. No way to access all entries at once.

The vault file is mostly unidentifiable data. A secret name reveals the data of one particular entry. The rest of the data remains dark matter.

Install

$ pip3 install dmk

Secret names

A secret name serves as both:

  • the name of the entry
  • the password

It is a secret. And it must be unique.

For example, credit card credentials can be stored under the name "crEd1tcard" or "visa_secret123".

Longer secret names mean stronger protection, because the name is also the password.

Save and read text

When called without parameters, the get and set commands prompt for all values interactively:

$ dmk set

Secret name: secRet007
Repeat secret name: secRet007 
Text: My darling's jokes are not so funny

$ dmk get

Secret name: secRet007
 
My darling's jokes are not so funny

Interactive input is optional. You can get by with one line:

$ dmk set -e secRet007 -t "My darling's jokes are not so funny"
$ dmk get -e secRet007

My darling's jokes are not so funny

Save and read file

Read data from source.doc and save it as the encrypted entry secRet007:

$ dmk set -e secRet007 /my/docs/source.doc

Decrypt the entry secRet007 and write the result to target.doc:

$ dmk get -e secRet007 /my/docs/target.doc

The -e parameter is optional. If it is omitted, dmk prompts for the secret name interactively.

Add dummy data

Part of the vault file contains dummy data. This data cannot be decrypted. Dummy data only increases the size of the storage, thus hiding the amount of real data.

Each time the file is updated, a random amount of dummy data is added and removed. The change can be up to 5% of the file size.

You can also add dummy data manually, to make sure the file is big enough.

Make the vault file 2 megabytes larger:

$ dmk dummy 2M

Make the vault file 500 kilobytes larger:

$ dmk dummy 500K

Keep in mind:

  • Dummy data added in this way cannot be removed
  • Vault speed depends linearly on its size. If the vault grows 10 times larger, searching it for data takes 10 times longer

Vault location

Entries will be stored in a file.

You can check the current vault file location with the vault command:

$ dmk vault

Output:

/home/username/vault.dmk

By default, it is vault.dmk in the current user's $HOME directory.


The -v parameter overrides the location for a single run.

$ dmk -v /path/to/myfile.data vault

Output:

/path/to/myfile.data

The parameter can be used with any command:

$ dmk -v /path/to/myfile.data set 
$ dmk -v /path/to/myfile.data get 

The $DMK_VAULT_FILE environment variable overrides the default location:

$ export DMK_VAULT_FILE=/path/to/myfile.data
$ dmk vault  

Output:

/path/to/myfile.data

While $DMK_VAULT_FILE is set, all commands use myfile.data:

$ dmk set   # set to myfile.data 
$ dmk get   # get from myfile.data
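
The lookup order described in this section can be sketched in Python. This is only an illustration, not dmk's actual code: the function name resolve_vault_path and its parameters are hypothetical, and it assumes that -v takes precedence over $DMK_VAULT_FILE, which the text implies but does not state outright.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_vault_path(cli_value: Optional[str] = None,
                       environ: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the vault file: -v value, then $DMK_VAULT_FILE, then the default.

    Hypothetical helper; the precedence of -v over the environment
    variable is an assumption.
    """
    env = os.environ if environ is None else environ

    # 1. An explicit -v value applies to this single run.
    if cli_value:
        return Path(cli_value)

    # 2. $DMK_VAULT_FILE overrides the default location.
    env_value = env.get("DMK_VAULT_FILE")
    if env_value:
        return Path(env_value)

    # 3. Default: vault.dmk in the current user's home directory.
    home = env.get("HOME") or str(Path.home())
    return Path(home) / "vault.dmk"


if __name__ == "__main__":
    print(resolve_vault_path())
```

Passing the environment as a parameter keeps the function easy to check without touching the real process environment.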

Under the hood

  • Entries are encrypted
  • Number of entries cannot be determined
  • File format is unidentifiable

Size obfuscation

The vault file stores all data within multiple fixed-size blocks.

Small entries are padded so they become block-sized. Large entries are split and padded to fit into multiple blocks. In the end, they are all just a lot of blocks.

A block gives absolutely no information to someone who does not know the secret name. All non-random data is either hashed or encrypted. The size of the padding is not revealed.

The number of blocks is no secret. Their contents are secret.

  • The number of blocks is random. Many blocks are dummy. They are indistinguishable from real data, but do not contain anything meaningful

  • The information about which entry a block belongs to is cryptographically protected. It is impossible even to tell whether two blocks belong to the same entry

  • Random actions are taken every time the vault is updated: some dummy blocks are added, and some are removed

Thus, the number and size of entries cannot be determined from the size of the vault file or the number of blocks.

Only the following is known:

  • The payload is smaller than the file size
  • The number of entries is less than the number of blocks

By the way, the file may contain zero entries.
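
The splitting and padding described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual vault format: the block size of 1024 bytes and the function name split_into_blocks are hypothetical, and the sketch leaves out how the real payload length is recorded so the padding can be stripped on read.

```python
import os
from typing import List

BLOCK_PAYLOAD_SIZE = 1024  # hypothetical value; the real block size is not stated here


def split_into_blocks(data: bytes, size: int = BLOCK_PAYLOAD_SIZE) -> List[bytes]:
    """Split data into chunks of exactly `size` bytes.

    Small entries yield one padded chunk, large entries yield several.
    The last chunk is filled up with random bytes.
    """
    # Slice the data; an empty entry still occupies one (fully padded) chunk.
    chunks = [data[i:i + size] for i in range(0, len(data), size)] or [b""]

    # Pad the final chunk with random bytes up to the fixed size.
    last = chunks[-1]
    chunks[-1] = last + os.urandom(size - len(last))
    return chunks


if __name__ == "__main__":
    for entry in (b"short note", b"x" * 2500):
        blocks = split_into_blocks(entry)
        print(len(entry), "bytes ->", len(blocks), "block(s)")
```

After encryption, every chunk has the same length, so an observer cannot tell a padded small entry from one part of a large entry.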

File obfuscation

The vault file format is indistinguishable from random data.

The file has no signatures, no header, no constant bytes (or even bits), no block boundaries. The file size gives no clues either: the file is padded with a random amount of data that is not a multiple of the block size.

The only predictable part of the file is a version identifier encoded in the first two bytes. But a similar "version number" can be found in literally every fourth file in the world. Even those two bytes are not constant.

Encryption

  1. urandom creates a 38-byte salt when the vault file is initialized. The salt is stored in the file unencrypted. It never changes, and it is required for every other operation on the vault.

  2. Argon2id (memory 128 MiB, iterations 4, parallelism 8) derives a 256-bit private key from the secret name, salted with (1).

  3. A 96-bit block nonce is generated with urandom for each block.

  4. To indicate that a block belongs to the secret name, we add a 256-bit hash to the beginning of the block. It is a Blake2s hash of private key (2) + block nonce (3).

    During the read, for each block, we compute this hash again. If the value matches, we decide that the block belongs to the secret name.

  5. ChaCha20 encrypts the block data using the 256-bit private key (2) and 96-bit block nonce (3).

  6. CRC-32 checksum verifies the entry data decrypted from the block.

    This verification occurs when we already believe (4) that the private key is correct. Therefore, it is really only a self-test to see if the data is decoded as expected.

    This checksum is saved inside the encrypted stream. If the data in two blocks are the same, it will not be noticeable from the outside due to different nonce (3) values.
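
Steps 1 to 4 above can be sketched with the Python standard library. This is a minimal illustration, not dmk's implementation. Loud assumptions: PBKDF2 is used as a stand-in for Argon2id only because Argon2id is not in the standard library; "private key + block nonce" is taken to mean plain concatenation; the function names and the (nonce, fingerprint) pairs are hypothetical and say nothing about the real block layout. Steps 5 and 6 (ChaCha20 and CRC-32) are left out.

```python
import hashlib
import hmac
import os
from typing import List, Tuple


def derive_key(secret_name: str, salt: bytes) -> bytes:
    # STAND-IN for step 2: the vault uses Argon2id, not PBKDF2.
    return hashlib.pbkdf2_hmac("sha256", secret_name.encode("utf-8"),
                               salt, 100_000, dklen=32)


def block_fingerprint(key: bytes, nonce: bytes) -> bytes:
    # Step 4: 256-bit Blake2s hash of key + nonce (concatenation assumed).
    return hashlib.blake2s(key + nonce, digest_size=32).digest()


def make_block_tag(key: bytes) -> Tuple[bytes, bytes]:
    # Step 3: a fresh 96-bit nonce for every block.
    nonce = os.urandom(12)
    return nonce, block_fingerprint(key, nonce)


def find_own_blocks(key: bytes, tags: List[Tuple[bytes, bytes]]) -> List[int]:
    # Reading: recompute the hash for every block and keep the matches.
    # Every block is visited, which is why lookup time grows with vault size.
    return [i for i, (nonce, fp) in enumerate(tags)
            if hmac.compare_digest(fp, block_fingerprint(key, nonce))]


if __name__ == "__main__":
    salt = os.urandom(38)                      # step 1: 38-byte salt
    mine = derive_key("secRet007", salt)
    other = derive_key("visa_secret123", salt)
    tags = [make_block_tag(mine), make_block_tag(other), make_block_tag(mine)]
    print(find_own_blocks(mine, tags))
```

Because each fingerprint mixes in a fresh nonce, two blocks of the same entry carry unrelated-looking hashes, and only someone who can derive the key can recognize them.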