Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.2.0a6] KeyError and AssertionError on first backup, subsequent backups do not have errors #4577

Closed
manjaroblack opened this issue May 17, 2019 · 14 comments

Comments

@manjaroblack
Copy link

manjaroblack commented May 17, 2019

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

BUG / ISSUE report and QUESTION

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

1.2.0a6

Operating system (distribution) and version.

NAME="Manjaro Linux"
Release: 18.0.4
Kernel: Linux 5.1.0-1-MANJARO
Architecture: x86-64

Hardware / network configuration, and filesystems used.

elektrophagous              
    description: Convertible
    product: 80Y7 (LENOVO_MT_80Y7_BU_idea_FM_YOGA 920-13IKB)
    vendor: LENOVO
    version: Lenovo YOGA 920-13IKB
          product: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
          memory: 16GiB

How much data is handled by borg?

80GB

Full borg commandline that lead to the problem (leave away excludes and passwords)

borg create -v --stats --progress --compression zstd,22 --chunker-params 10,23,16,4095 $REPO::$HOST-$DATE /
--exclude /dev
--exclude /proc
--exclude /sys
--exclude /var/run
--exclude /run
--exclude /tmp
--exclude /lost+found
--exclude /mnt
--exclude /var/cache
--exclude /home/*/.cache
--exclude /home/ds/Downloads
--exclude /home/lost+found

Describe the problem you're observing.

On the first run of borg create after init, I get an error after about 4GB of deduplicated data. Following backups do not throw any errors.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes it is reproducible on my system. Happens every time on first backup after init.

Include any warning/errors/backtraces from the system logs

Creating archive at "/run/media/ds/bb::elektrophagous-2019-05-16-23:30"
Exception ignored in: <bound method Repository.__del__ of <Repository /run/media/ds/bb>>                                                                                                                         
Traceback (most recent call last):
  File "borg/repository.py", line 178, in __del__
  File "borg/repository.py", line 431, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: (('localhost.localdomain@167275820287493', 31031, 0),)
Local Exception
Traceback (most recent call last):
  File "borg/archiver.py", line 574, in do_create
  File "borg/archiver.py", line 527, in create_inner
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 622, in _process
  File "borg/archive.py", line 1229, in process_file
  File "borg/archive.py", line 1085, in process_file_chunks
  File "borg/archive.py", line 1063, in write_part_file
  File "borg/archive.py", line 504, in write_checkpoint
  File "borg/cache.py", line 937, in chunk_decref
  File "borg/repository.py", line 1168, in delete
  File "borg/repository.py", line 498, in prepare_txn
AssertionError: bug in code, exclusive lock should exist here

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 164, in wrapper
  File "borg/archiver.py", line 574, in do_create
  File "borg/cache.py", line 488, in __exit__
  File "borg/cache.py", line 520, in close
  File "borg/cache.py", line 312, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: ('localhost.localdomain@167275820287493', 31031, 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 4395, in main
  File "borg/archiver.py", line 4327, in run
  File "borg/archiver.py", line 164, in wrapper
  File "borg/repository.py", line 203, in __exit__
  File "borg/repository.py", line 431, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: ('localhost.localdomain@167275820287493', 31031, 0)

Platform: Linux elektrophagous 5.1.0-1-MANJARO #1 SMP PREEMPT Mon Mar 18 21:43:04 UTC 2019 x86_64
Linux: arch Manjaro Linux
Borg: 1.2.0a6  Python: CPython 3.6.8 msgpack: 0.6.1
PID: 31031  CWD: /home/ds
sys.argv: ['borg', 'create', '-v', '--stats', '--progress', '--compression', 'zstd,22', '--chunker-params', '10,23,16,4095', '/run/media/ds/bb::elektrophagous-2019-05-16-23:30', '/', '--exclude', '/dev', '--exclude', '/proc', '--exclude', '/sys', '--exclude', '/var/run', '--exclude', '/run', '--exclude', '/tmp', '--exclude', '/lost+found', '--exclude', '/mnt', '--exclude', '/var/cache', '--exclude', '/home/ds/.cache', '--exclude', '/home/ds/Downloads', '--exclude', '/home/lost+found']
SSH_ORIGINAL_COMMAND: None
@ThomasWaldmann
Copy link
Member

/run/media/ds/bb on what kind of filesystem is that directory located?

Having that kind of locking problems on a freshly initialized repo seems strange.

@manjaroblack
Copy link
Author

Both the computer and backup drive are using BTRFS.

@ThomasWaldmann
Copy link
Member

Can you use a device formatted with ext4 and try to reproduce?

@manjaroblack
Copy link
Author

I just tried it on a few different file systems, ext4 included, seems that the error is the same. It seems to be backing up a windows .img file when it fails.

@ThomasWaldmann
Copy link
Member

OK, just wanted to make sure we do not debug btrfs issues here.

I'll try to reproduce the issue locally.

@ThomasWaldmann
Copy link
Member

Can you try to reduce the borg commands to a minimum (regarding options, excludes, etc), so you can post the minimum of commands needed that still reproduces the issue?

What exactly is a "windows .img" file? A picture? A (virtual machine) disk image? Something else?

@ThomasWaldmann ThomasWaldmann added this to the hydrogen-alpha7 milestone May 20, 2019
@manjaroblack
Copy link
Author

When reducing the commands to a minimum which reproduces the error I came up with:

Produces error:
sudo borg create --compression zstd,22 /run/media/ds/bb::test ~/windows10.img

Does not produce error:
sudo borg create /run/media/ds/bb::test ~/windows10.img

Other findings:
Running the original command with the addition of --exclude ~/windows10.img does produce an error:

Creating archive at "/run/media/ds/bb/::test"
Exception ignored in: <bound method Repository.__del__ of <Repository /run/media/ds/bb>>/League of Legends/Game/DATA/FINAL/Champions/Kaisa.wad.client                                                            
Traceback (most recent call last):
  File "borg/repository.py", line 178, in __del__
  File "borg/repository.py", line 431, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: (('localhost.localdomain@167275820287493', 16190, 0),)
Local Exception
Traceback (most recent call last):
  File "borg/archiver.py", line 574, in do_create
  File "borg/archiver.py", line 527, in create_inner
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 657, in _process
  [Previous line repeated 9 more times]
  File "borg/archiver.py", line 622, in _process
  File "borg/archive.py", line 1229, in process_file
  File "borg/archive.py", line 1081, in process_file_chunks
  File "borg/archive.py", line 1069, in chunk_processor
  File "borg/cache.py", line 908, in add_chunk
  File "borg/crypto/key.py", line 371, in encrypt
  File "borg/crypto/nonces.py", line 79, in ensure_reservation
  File "borg/crypto/nonces.py", line 38, in get_repo_free_nonce
  File "borg/repository.py", line 318, in get_free_nonce
AssertionError: bug in code, exclusive lock should exist here

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 164, in wrapper
  File "borg/archiver.py", line 574, in do_create
  File "borg/cache.py", line 488, in __exit__
  File "borg/cache.py", line 520, in close
  File "borg/cache.py", line 312, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: ('localhost.localdomain@167275820287493', 16190, 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 4395, in main
  File "borg/archiver.py", line 4327, in run
  File "borg/archiver.py", line 164, in wrapper
  File "borg/repository.py", line 203, in __exit__
  File "borg/repository.py", line 431, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: ('localhost.localdomain@167275820287493', 16190, 0)

Platform: Linux elektrophagous 5.1.1-2-MANJARO #1 SMP PREEMPT Sat May 11 12:12:50 UTC 2019 x86_64
Linux: arch Manjaro Linux
Borg: 1.2.0a6  Python: CPython 3.6.8 msgpack: 0.6.1
PID: 16190  CWD: /home/ds
sys.argv: ['borg', 'create', '-v', '--stats', '--progress', '--compression', 'zstd,22', '--chunker-params', '10,23,16,4095', '/run/media/ds/bb/::test', '/', '--exclude', '/dev', '--exclude', '/proc', '--exclude', '/sys', '--exclude', '/var/run', '--exclude', '/run', '--exclude', '/tmp', '--exclude', '/lost+found', '--exclude', '/mnt', '--exclude', '/var/cache', '--exclude', '/home/ds/.cache', '--exclude', '/home/ds/Downloads', '--exclude', '/home/lost+found', '--exclude', '/usr/lib', '--exclude', '/usr/share', '--exclude', '/home/ds/windows10.img']
SSH_ORIGINAL_COMMAND: None

@manjaroblack
Copy link
Author

I did a couple more tests. In the first I ran it with lzma,9. The second, default. The first generated the errors and the second did not. The drive was formatted with xfs in the first and fat32 in the second.

┬─[ds@elektrophagous:~]─[11:34:51 PM]
╰─>$ sudo borg create -v --stats --progress --compression auto,lzma,9 --chunker-params 10,23,16,4095 /run/media/ds/bb/::test /     \
             --exclude /dev                                                                                          \
             --exclude /proc                                                                                         \
             --exclude /sys                                                                                          \
             --exclude /var/run                                                                                      \
             --exclude /run                                                                                          \
             --exclude /tmp                                                                                          \
             --exclude /lost+found                                                                                   \
             --exclude /mnt                                                                                          \
             --exclude /var/cache                                                                                    \
             --exclude /home/*/.cache                                                                                \
             --exclude /home/ds/Downloads                                                                            \
             --exclude /home/lost+found                                                                              \
             --exclude /usr/lib                                                                                      \
             --exclude /usr/share --exclude ~/windows10.img
Creating archive at "/run/media/ds/bb/::test"
Exception ignored in: <bound method Repository.__del__ of <Repository /run/media/ds/bb>>/League of Legends/Game/DATA/FINAL/Champions/Karthus.en_US.wad.client                                                    
Traceback (most recent call last):
  File "borg/repository.py", line 178, in __del__
  File "borg/repository.py", line 431, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: (('localhost.localdomain@167275820287493', 14174, 0),)
Local Exception
Traceback (most recent call last):
  File "borg/archiver.py", line 574, in do_create
  File "borg/archiver.py", line 527, in create_inner
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 657, in _process
  File "borg/archiver.py", line 657, in _process
  [Previous line repeated 9 more times]
  File "borg/archiver.py", line 622, in _process
  File "borg/archive.py", line 1229, in process_file
  File "borg/archive.py", line 1081, in process_file_chunks
  File "borg/archive.py", line 1069, in chunk_processor
  File "borg/cache.py", line 908, in add_chunk
  File "borg/crypto/key.py", line 371, in encrypt
  File "borg/crypto/nonces.py", line 79, in ensure_reservation
  File "borg/crypto/nonces.py", line 38, in get_repo_free_nonce
  File "borg/repository.py", line 318, in get_free_nonce
AssertionError: bug in code, exclusive lock should exist here

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 164, in wrapper
  File "borg/archiver.py", line 574, in do_create
  File "borg/cache.py", line 488, in __exit__
  File "borg/cache.py", line 520, in close
  File "borg/cache.py", line 312, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: ('localhost.localdomain@167275820287493', 14174, 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "borg/archiver.py", line 4395, in main
  File "borg/archiver.py", line 4327, in run
  File "borg/archiver.py", line 164, in wrapper
  File "borg/repository.py", line 203, in __exit__
  File "borg/repository.py", line 431, in close
  File "borg/locking.py", line 383, in release
  File "borg/locking.py", line 282, in modify
KeyError: ('localhost.localdomain@167275820287493', 14174, 0)

Platform: Linux elektrophagous 5.1.1-2-MANJARO #1 SMP PREEMPT Sat May 11 12:12:50 UTC 2019 x86_64
Linux: arch Manjaro Linux
Borg: 1.2.0a6  Python: CPython 3.6.8 msgpack: 0.6.1
PID: 14174  CWD: /home/ds
sys.argv: ['borg', 'create', '-v', '--stats', '--progress', '--compression', 'auto,lzma,9', '--chunker-params', '10,23,16,4095', '/run/media/ds/bb/::test', '/', '--exclude', '/dev', '--exclude', '/proc', '--exclude', '/sys', '--exclude', '/var/run', '--exclude', '/run', '--exclude', '/tmp', '--exclude', '/lost+found', '--exclude', '/mnt', '--exclude', '/var/cache', '--exclude', '/home/ds/.cache', '--exclude', '/home/ds/Downloads', '--exclude', '/home/lost+found', '--exclude', '/usr/lib', '--exclude', '/usr/share', '--exclude', '/home/ds/windows10.img']
SSH_ORIGINAL_COMMAND: None

┬─[ds@elektrophagous:~]─[07:42:03 AM]
╰─>$ sudo borg init --encryption=none /run/media/ds/1148-A56D/
[sudo] password for ds: 
┬─[ds@elektrophagous:~]─[07:42:42 AM]
╰─>$ sudo borg create -v --stats --progress --compression auto,lzma --chunker-params 10,23,16,4095 /run/media/ds/1148-A56D/::test /     \
             --exclude /dev                                                                                          \
             --exclude /proc                                                                                         \
             --exclude /sys                                                                                          \
             --exclude /var/run                                                                                      \
             --exclude /run                                                                                          \
             --exclude /tmp                                                                                          \
             --exclude /lost+found                                                                                   \
             --exclude /mnt                                                                                          \
             --exclude /var/cache                                                                                    \
             --exclude /home/*/.cache                                                                                \
             --exclude /home/ds/Downloads                                                                            \
             --exclude /home/lost+found                                                                              \
             --exclude /usr/lib                                                                                      \
             --exclude /usr/share --exclude ~/windows10.img
Creating archive at "/run/media/ds/1148-A56D/::test"
------------------------------------------------------------------------------                                                                                                                                   
Archive name: test
Archive fingerprint: d24e9b469c6e81d77030c9f65c6052616fb99f6db99bdc69af1c4b4fb7ed319b
Time (start): Tue, 2019-05-21 07:43:15
Time (end):   Tue, 2019-05-21 08:00:33
Duration: 17 minutes 17.65 seconds
Number of files: 57636
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               13.45 GB              9.76 GB              8.59 GB
All archives:               13.45 GB              9.76 GB              8.59 GB

                       Unique chunks         Total chunks
Chunk index:                  203800               245956
------------------------------------------------------------------------------

@ThomasWaldmann
Copy link
Member

It doesn't affect your minimalistic reproductions, but some comments about your original command:

borg create -v --stats --progress --compression zstd,22 --chunker-params 10,23,16,4095 $REPO::$HOST-$DATE /
--exclude /dev
--exclude /proc
--exclude /sys
--exclude /var/run
--exclude /run
--exclude /tmp
--exclude /lost+found
--exclude /mnt
--exclude /var/cache
--exclude /home/*/.cache
--exclude /home/ds/Downloads
--exclude /home/lost+found
  • you need \ at the line ends to continue one line to the next one
  • '/home/*/.cache' - you need to single quote stuff with wildcards if you want to give them to borg
  • $REPO::$HOST-$DATE - borg has placeholders for host and timestamp and BORG_REPO env var
  • --chunker-params 10,23,16,4095 - tiny chunks, don't use that for big repos

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented May 22, 2019

Also, I'ld appreciate if you use minimalistic commands in further research. It doesn't help if there are a ton of options / arguments if they are not needed to reproduce the problem.

Preferably also do not use wildcards, shell variables and sudo, it just complicates matters and we are searching bugs in borg here, not in scripts.

@ThomasWaldmann
Copy link
Member

I read localhost.localdomain in the tracebacks - is that really your fqdn?

The locking system relies on unique keys and it makes them from fqdn and MAC address.

Do you have other hosts with same localhost.localdomain and MAC address accessing the same repo?

@manjaroblack
Copy link
Author

Thank you for your suggestions. I will make sure to follow them. Yeah that is my fqdn, this is the only computer using borg at the moment, no other computers with the same MAC.

@ThomasWaldmann
Copy link
Member

Any news? If you can give a minimal script, that reproduces the problem on your machine (and also hopefully on mine), that would help a lot.

@ThomasWaldmann
Copy link
Member

No feedback...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants