issue with rsync option --delete #25

Open
blacktek opened this issue Aug 29, 2020 · 8 comments

Comments

@blacktek

It seems that passing the --delete option to rsync produces inconsistent backups when using p >= 2 (two or more rsync processes).

I used it to back up a large NFS directory, and the copied directory was 39GB instead of 42GB.

I think that the --delete option deletes files handled by other rsync processes. Is this the case?

@jbd
Owner

jbd commented Aug 29, 2020

Hello,

you're right. You can have a look here: #5

I'm curious: msrsync should have detected the --delete option (https://github.com/jbd/msrsync/blob/master/msrsync#L587). Looks like a bug to me. If possible, could you provide the full command line you used?

Thank you.
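
To make the failure mode discussed above concrete, here is a toy model in Python. It is not rsync's or msrsync's actual algorithm: it only assumes, as described in this thread, that each worker knows about its own bucket of files and that a --delete pass removes anything at the destination that the worker does not know about. The function and variable names are hypothetical.

```python
def sync_bucket_with_delete(source, dest, bucket):
    """Toy model of one worker: copy its bucket, then delete everything else.

    `source` and `dest` are dicts mapping file names to contents.
    `bucket` is the subset of names this worker is responsible for.
    """
    # Copy step: the worker transfers only the files in its own bucket.
    for name in bucket:
        dest[name] = source[name]
    # Delete step (assumed behaviour): anything at the destination that is
    # not in this worker's view of the source is treated as extraneous.
    for name in list(dest):
        if name not in bucket:
            del dest[name]


def run_workers(source, buckets):
    """Run the workers one after another and return the final destination."""
    dest = {}
    for bucket in buckets:
        sync_bucket_with_delete(source, dest, bucket)
    return dest


if __name__ == "__main__":
    source = {"a": 1, "b": 2, "c": 3, "d": 4}
    # Two workers, two buckets: the second worker wipes the first one's files.
    print(sorted(run_workers(source, [["a", "b"], ["c", "d"]])))
```

With real parallel processes the outcome would depend on timing, which is consistent with a destination that ends up smaller than the source without any error being reported.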

@blacktek
Author

Hello,
I'm sorry; I checked "Issues" before writing, but only looked at the open ones.

The problem is very clear; I had guessed that the reason was the one you outlined.

The parameter I provided to msrsync is --rsync "--delete"; all the others are standard.

Unfortunately, in my case the suggestion to first run msrsync without --delete and then a standard rsync with --delete is not so good. I have to synchronize an AWS EFS directory (basically NFS) with hundreds of thousands of small files. In this case there is a huge amount of metadata work and I/O: just checking the files would take as long as syncing them.

I'm not sure the issue is fixable within msrsync; perhaps you would need to pass some ignore flag so that each rsync process does not delete files that do not match its regexp. Just guessing :)
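
For reference, the two-pass approach mentioned above (parallel copy without --delete, then a single rsync with --delete) could be scripted roughly as follows. This is a sketch under assumptions: it assumes msrsync accepts -p for the number of processes and --rsync for the rsync options (the latter appears elsewhere in this thread), the helper name is hypothetical, and it only builds the argument lists rather than running anything. As noted above, the second pass still walks the whole tree, so it does not avoid the metadata cost on a large NFS/EFS share.

```python
def two_pass_commands(src, dest, processes=4):
    """Return (parallel_copy, cleanup) argument lists for a two-pass sync.

    Pass 1 copies in parallel without any delete option.
    Pass 2 is a single rsync that is allowed to delete extraneous files.
    """
    # Normalise to exactly one trailing slash so rsync copies the directory
    # contents rather than the directory itself.
    src_dir = src.rstrip("/") + "/"
    dest_dir = dest.rstrip("/") + "/"
    # Assumed msrsync invocation: -p <processes> --rsync "<rsync options>".
    parallel_copy = ["msrsync", "-p", str(processes), "--rsync", "-a", src_dir, dest_dir]
    # Plain rsync in archive mode, deleting files missing from the source.
    cleanup = ["rsync", "-a", "--delete", src_dir, dest_dir]
    return parallel_copy, cleanup


if __name__ == "__main__":
    for command in two_pass_commands("/backup/data", "/backup/data2"):
        print(" ".join(command))
```

The lists can be handed to `subprocess.run(command, check=True)` one after the other; the second command should only start once the first has finished.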

@jbd
Owner

jbd commented Aug 29, 2020

Thank you for the feedback.

The parameter I provided to msrsync is --rsync "--delete"; all the others are standard.

What worries me is that you shouldn't be able to run msrsync with --delete:

$ ./msrsync --rsync "--delete" src dest
[...]
error: Cannot use --delete option type with msrsync. It would lead to disaster :)

Since you were able to run it without any problem, this is a bug I'd like to fix. Could you provide the full --rsync options so that I can reproduce it?
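
A check like the one described above can be sketched in a few lines of Python. This is not the code at the msrsync line linked earlier; it is a minimal illustration, with a hypothetical function name, of rejecting an rsync option string that contains any of rsync's delete options (--delete, --delete-before, --delete-after, --del and so on).

```python
import shlex


def has_delete_option(rsync_options):
    """Return True if the option string contains an rsync delete option."""
    for token in shlex.split(rsync_options):
        # --del is rsync's alias for --delete-during. A plain "--del" prefix
        # test would also match --delay-updates, so compare it exactly.
        if token == "--del" or token.startswith("--delete"):
            return True
    return False


if __name__ == "__main__":
    for options in ("--delete", "-a --numeric-ids", "-a --delay-updates"):
        verdict = "rejected" if has_delete_option(options) else "accepted"
        print(repr(options), verdict)
```

Splitting with `shlex` keeps quoted arguments together, so a delete option is found wherever it appears among the other options.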

@blacktek
Author

Hello,
I couldn't find it in my shell history; perhaps the terminal crashed at some point, or perhaps I'm wrong. I'm not 100% sure, but I remember trying that option.

But now I've realized that a folder copied to a new folder with msrsync has a different size than the same folder copied with rsync -a --delete:

[root@utils-node ~]# du -sh /backup/data2/
38G /backup/data2/
[root@utils-node ~]# du -sh /backup/data/
42G /backup/data/

Do you have any explanation?

Thank you

@jbd
Owner

jbd commented Aug 29, 2020

But now I've realized that a folder copied to a new folder with msrsync has a different size than the same folder copied with rsync -a --delete:

Weird. Did you spot any error message during the copy? Are you able to find where the differences are, using something like ncdu?
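
As an alternative to browsing with ncdu, the two destination trees can be compared file by file. The following is a minimal sketch, not part of msrsync; the function names are hypothetical. It records both the apparent size (`st_size`) and the allocated block count (`st_blocks`, Unix only) of every file, and reports the first difference it finds for each path.

```python
import os


def file_index(root):
    """Map each file's path relative to `root` to (st_size, st_blocks)."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            index[os.path.relpath(full, root)] = (st.st_size, st.st_blocks)
    return index


def compare_trees(left, right):
    """Return {relative_path: reason} for files that differ between trees."""
    a, b = file_index(left), file_index(right)
    report = {}
    for rel in sorted(set(a) | set(b)):
        if rel not in a:
            report[rel] = "only in right"
        elif rel not in b:
            report[rel] = "only in left"
        elif a[rel][0] != b[rel][0]:
            report[rel] = "apparent size differs"
        elif a[rel][1] != b[rel][1]:
            report[rel] = "allocated blocks differ"
    return report


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        for rel, reason in compare_trees(sys.argv[1], sys.argv[2]).items():
            print(f"{reason}: {rel}")
```

Run it with the two directories as arguments, for example the `/backup/data/` and `/backup/data2/` paths shown earlier in the thread.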

@blacktek
Author

Hi,
no error message; moreover, I've repeated the task several times, always with the same result.

After some time I was able to find the two directories with different content; they contain large xtrabackup backup files.

BUT... I've noticed that I was checking the directory size with "du".

Taking one of the differing files: for the msrsynced copy, du reports 36K, while ls -al reports 96K (the same as the file copied with rsync).
For the copy made with rsync, du reports the correct 96K.

Any guess? It's the same 98304-byte file, but the copy made with rsync reports 96K with du, while the copy made with msrsync reports 36K.

With the du flag --apparent-size, they report the same size. I think the file copied with rsync suffers from some fragmentation, while the one copied with msrsync does not.

Could this be due to rsync applying incremental changes, so that when I rsync to an existing directory it creates some holes in the files on the filesystem? That is my only guess :)

@jbd
Owner

jbd commented Aug 30, 2020

I don't know.

  1. could you run sha1sum on those files, just to be sure they are the same?
  2. could you provide the output of stat for each file?
  3. are the rsync and msrsync destination folders hosted on the same filer?

@blacktek
Author

Hi,

  1. yes, the checksum is the same:
    4b1362831c1209e4ba666864a12afef4d5569985

  2. stat rsync

Size: 98304 Blocks: 192 IO Block: 4096 regular file
Device: 10301h/66305d Inode: 1181327 Links: 1
Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-08-30 18:59:31.088182098 +0200
Modify: 2020-08-30 18:12:06.895000000 +0200
Change: 2020-08-30 18:16:21.129866255 +0200
Birth: -

stat msrsync

Size: 98304 Blocks: 72 IO Block: 4096 regular file
Device: 10301h/66305d Inode: 2122670 Links: 1
Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-08-30 18:59:10.035036188 +0200
Modify: 2020-08-30 18:12:06.895000000 +0200
Change: 2020-08-30 18:27:57.087798998 +0200
Birth: -

  3. yes, both source and destination are the same

As you can see, the files have the same size but a different number of blocks.

Probably du without --apparent-size looks at the allocated blocks and not at the real size.
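
The numbers in the two stat outputs above line up with that reading. On Linux, the "Blocks:" field of stat and Python's `st_blocks` count 512-byte units, so the allocated size can be worked out directly. The sketch below uses hypothetical function names. A file can occupy fewer blocks than its apparent size when it is stored sparsely (runs of zeros kept as holes); whether that is what happened here is not established in this thread.

```python
import os

BLOCK_UNIT = 512  # st_blocks and stat's "Blocks:" field count 512-byte units


def allocated_bytes(blocks):
    """Disk space actually allocated for a given block count."""
    return blocks * BLOCK_UNIT


def describe(path):
    """Return the apparent and allocated size of a file (Unix only)."""
    st = os.stat(path)
    return {"apparent": st.st_size, "allocated": allocated_bytes(st.st_blocks)}


if __name__ == "__main__":
    apparent = 98304  # "Size:" from both stat outputs above
    for label, blocks in (("rsync", 192), ("msrsync", 72)):
        print(label, allocated_bytes(blocks) // 1024, "KiB allocated,",
              apparent // 1024, "KiB apparent")
    # prints:
    # rsync 96 KiB allocated, 96 KiB apparent
    # msrsync 36 KiB allocated, 96 KiB apparent
```

192 blocks give 98304 bytes (96K) and 72 blocks give 36864 bytes (36K), which are the two values du reported without --apparent-size.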


2 participants