Real world XLS format
One of the fundamental reasons zpaqfranz exists is to handle XLS files correctly: it may seem trivial, but it is not.
Excel, in all versions (at least up to 2019), has a peculiar behavior: when you open an XLS file and then close it, even without making any changes, it rewrites some metadata bytes inside the file WITHOUT updating the file's modification timestamp.
So if the file was last modified on (let's say) 17-07-2021 @ 17:47, the filesystem will still report 17-07-2021 @ 17:47, even though its binary content has changed:
17/07/2021 17:47 25.600 test.ok
17/07/2021 17:47 25.600 test.xls
Programs that rely on the date and time to decide whether to make a new copy (including robocopy, rsync, zpaq and zpaqfranz) cannot detect that the file has changed, as in this example:
Z:\>c:\nz\sha1deep64 test.*
500f2a718a4bb1babf185e748ecef1febf83ae78 Z:\test.ok
393e24455d50ddd8c695d6ecdbfc0156dc2c0e6b Z:\test.xls
A binary or hash comparison shows that the copied file does NOT match the current one: the check FAILS.
This is extremely bad for a storage manager: copy verification can fail merely because the original XLS files were opened and closed, without any changes being made!
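The mismatch between the two kinds of check can be reproduced with a small Python sketch. It is only an illustration: the helper names (`same_by_metadata`, `same_by_content`, `demo`) are hypothetical, the two files are synthetic stand-ins for test.xls and test.ok, and the single flipped byte merely imitates the kind of in-place metadata change described above.

```python
import hashlib
import os
import tempfile


def same_by_metadata(a, b):
    """Compare size and modification time only, as timestamp-based tools do."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_by_content(a, b):
    """Compare the actual bytes of the two files via their hashes."""
    return sha1_of(a) == sha1_of(b)


def demo():
    """Build two files with equal size and timestamp but different bytes."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "test.xls")
        backup = os.path.join(d, "test.ok")
        with open(backup, "wb") as f:
            f.write(b"\x00" * 100)
        with open(original, "wb") as f:
            f.write(b"\x00" * 99 + b"\x01")  # one byte differs, same length
        stamp = 1626544020  # same arbitrary timestamp for both files
        os.utime(original, (stamp, stamp))
        os.utime(backup, (stamp, stamp))
        return (same_by_metadata(original, backup),
                same_by_content(original, backup))


if __name__ == "__main__":
    print(demo())
```

Here the metadata check reports the files as identical while the hash check reports them as different, which is the situation a timestamp-based copy cannot see.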
With other programs (rsync, robocopy) a DOUBLE pass is required: the first copies all the files "intelligently" (based on the modification date and time), the second is "stupid" (it copies ALL XLS files). This is obviously a big hassle, and it slows things down considerably with large amounts of data.
zpaqfranz, by default, always adds XLS files to archives, regardless of their modification time (this can be disabled with -xls).
The r (robocopy) command also gives XLS files special treatment: they are carefully checked to make sure the copy is correct.
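The rule of always adding XLS files regardless of their timestamp can be sketched in Python as below. This is not zpaqfranz's actual code (zpaqfranz is not written in Python): the function `needs_backup`, the `index` dictionary and the `force_xls` parameter are assumptions made for illustration, with `force_xls=False` loosely standing in for the effect of the -xls switch.

```python
import os

# Assumption for this sketch: only the ".xls" extension is always re-added.
ALWAYS_ADD_EXTENSIONS = {".xls"}


def needs_backup(path, size, mtime, index, force_xls=True):
    """Decide whether a file should be added on this run.

    index maps path -> (size, mtime) as recorded by the previous run.
    """
    ext = os.path.splitext(path)[1].lower()
    if force_xls and ext in ALWAYS_ADD_EXTENSIONS:
        # Size and timestamp cannot be trusted for these files: always add.
        return True
    # Ordinary files: add only if new, or if size or timestamp changed.
    return index.get(path) != (size, mtime)
```

The trade-off in such a rule is that every XLS file is read again on every run, in exchange for not depending on a timestamp that Excel may leave unchanged.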