Extracting segmented zipfiles
Julian M. Morley edited this page Nov 30, 2018 · 2 revisions
Druid versions larger than 10g are archived into zip files split into 10g segments, using this command:
zip -r0X -s 10g (destination file) druid/v0001
This results in several zip files with predictable names: druid.version.v0001.z01, druid.version.v0001.z02, and so on, ending with druid.version.v0001.zip.
- Carefully concatenate the files, starting with .z01 and ending with .zip.
$ cat zy140tm9333.z01 > zy140tm9333.fixed.zip
$ cat zy140tm9333.z02 >> zy140tm9333.fixed.zip
$ cat zy140tm9333.z03 >> zy140tm9333.fixed.zip
$ cat zy140tm9333.z04 >> zy140tm9333.fixed.zip
$ cat zy140tm9333.z05 >> zy140tm9333.fixed.zip
$ cat zy140tm9333.zip >> zy140tm9333.fixed.zip
- Unzip it and enjoy the warnings
$ unzip zy140tm9333.fixed.zip
Archive: zy140tm9333.fixed.zip
warning [zy140tm9333.fixed.zip]: zipfile claims to be last disk of a multi-part archive;
attempting to process anyway, assuming all parts have been concatenated
together in order. Expect "errors" and warnings...true multi-part support
doesn't exist yet (coming soon).
warning [zy140tm9333.fixed.zip]: 104857600 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 104857604
(attempting to re-compensate)
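The manual concatenation can also be written as a loop. This is only a sketch: `concat_segments` is a hypothetical helper name, and it assumes the segments are named `<base>.z01` through `<base>.zNN` (two digits) and sit next to the final `<base>.zip`.

```shell
# Hypothetical helper: concatenate the segments of a split zip in order.
# Usage: concat_segments <base> <output>
concat_segments() {
  base=$1
  out=$2
  # Refuse to run if the final segment is missing.
  [ -f "$base.zip" ] || { echo "missing $base.zip" >&2; return 1; }
  : > "$out"
  # The glob expands in sorted order, so .z01, .z02, ... are appended in sequence.
  for part in "$base".z[0-9][0-9]; do
    [ -f "$part" ] || continue   # the glob matched nothing
    cat "$part" >> "$out"
  done
  # The .zip file is the last segment and must be appended last.
  cat "$base.zip" >> "$out"
}
```

For the example above this would be called as `concat_segments zy140tm9333 zy140tm9333.fixed.zip`. Unzipping the result produces the same warnings as the manual steps.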
Use p7zip instead, which handles multi-part archives without the warnings.
# yum install p7zip
Then run 7za on the final .zip file in the archive set (no concatenation necessary!):
$ 7za x zy140tm9333.v0001.zip
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz (306F0),ASM,AES-NI)
Scanning the drive for archives:
1 file, 4411211 bytes (4308 KiB)
Extracting archive: zy140tm9333.v0001.zip
--
Path = zy140tm9333.v0001.zip
Type = zip
Physical Size = 4411211
Embedded Stub Size = 4
Total Physical Size = 109268811
Multivolume = +
Volume Index = 5
Volumes = 6
Everything is Ok
Folders: 5
Files: 15
Size: 109265469
Compressed: 109268811
Everything really is OK.
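Since 7za reads the other volumes from alongside the final .zip file, it can help to confirm that the whole set is present before extracting. The sketch below is a hypothetical helper (`check_segments` is not part of p7zip); it assumes an N-volume set consists of `<base>.z01` through `<base>.z(N-1)` plus `<base>.zip`, as in the six-volume example above.

```shell
# Hypothetical helper: confirm that every segment of an N-volume set is present.
# Usage: check_segments <base> <volumes>
check_segments() {
  base=$1
  volumes=$2
  # The final volume is always the .zip file.
  [ -f "$base.zip" ] || { echo "missing $base.zip" >&2; return 1; }
  # The preceding volumes are .z01 .. .z(N-1).
  i=1
  while [ "$i" -lt "$volumes" ]; do
    part=$(printf '%s.z%02d' "$base" "$i")
    [ -f "$part" ] || { echo "missing $part" >&2; return 1; }
    i=$((i + 1))
  done
}
```

A possible use is `check_segments zy140tm9333.v0001 6 && 7za x zy140tm9333.v0001.zip`. Running `7za t` on the final .zip file tests the archive without extracting it.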