When dealing with some legacy format files, I noticed that snzip fails to read snappy-in-java format files that are concatenated together. When it reaches the 2nd file, it reads the 's' (0x73) from the header and aborts, since that is not a recognized id.
The simple workaround is to skip the next 6 bytes (nappy\0), similar to how the framing2 format implicitly skips the header: it reads 0xff 0x06 0x00 0x00 as a chunk of length 6, then skips those 6 bytes (sNaPpY) with the fseek.
Before I send a real PR I wanted to get some feedback. My quick and dirty workaround does not validate that the 2nd header is actually a valid snappy header. However, framing2 doesn't do this either (it relies on the implicit skipping defined by the header format itself).
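To illustrate the implicit skip described above, here is a minimal sketch of how a framing-format reader ends up stepping over a repeated stream identifier. It is not snzip's actual code: `chunk_data_length` and `skip_stream_identifier` are hypothetical names, and the only assumption is the header layout quoted above (a 0xff type byte followed by a 3-byte little-endian length).

```c
#include <stdio.h>

/* Decode the 3-byte little-endian length that follows the 1-byte chunk type. */
static unsigned long chunk_data_length(const unsigned char header[4])
{
  return (unsigned long)header[1]
       | ((unsigned long)header[2] << 8)
       | ((unsigned long)header[3] << 16);
}

/*
 * Read one 4-byte chunk header from fp. If the chunk type is 0xff
 * (stream identifier), seek past its payload without inspecting it.
 * Returns 1 if a stream identifier was skipped, 0 for any other chunk
 * type (fp is then positioned just after the 4-byte header), and -1 on
 * a short read or a failed seek.
 */
static int skip_stream_identifier(FILE *fp)
{
  unsigned char header[4];

  if (fread(header, 1, sizeof(header), fp) != sizeof(header)) {
    return -1;
  }
  if (header[0] != 0xff) {
    return 0;
  }
  /* For 0xff 0x06 0x00 0x00 this skips 6 bytes, which happen to be "sNaPpY". */
  if (fseek(fp, (long)chunk_data_length(header), SEEK_CUR) != 0) {
    return -1;
  }
  return 1;
}
```

Because the length is taken from the header itself, the payload is never compared against the expected magic, which is why a concatenated file passes through without any special handling.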
Original version:
$ ./snzip -d -c three.snappy
hello
Unknown compressed flag 0x73
Patched:
$ ./snzip -d -c three.snappy
hello
world
Thoughts/preferences on patch approach?
Hacky version diff:
diff --git a/snappy-in-java-format.c b/snappy-in-java-format.c
index 0f95e1a..2b2579a 100644
--- a/snappy-in-java-format.c
+++ b/snappy-in-java-format.c
@@ -195,6 +195,16 @@ static int snappy_in_java_uncompress(FILE *infp, FILE *outfp, int skip_magic)
     case UNCOMPRESSED_FLAG:
       /* pass */
       break;
+    case 's':
+      /* s== 0x73 Possible concatenated block.
+       * Note that other framing formats like frame2 see 0xff and just skip
+       * the rest of the header due to the header being: 0xff 0x06 0x00 0x00 snappy
+       * (it reads the 3-byte chunk header length resulting in a block length of
+       * 6 bytes, and skips 6 bytes which happens to be == snappy)
+       */
+      /* Likely concatenated snappy file. We read first byte, skip rest */
+      fseek(infp, SNAPPY_IN_JAVA_MAGIC_LEN - 1, SEEK_CUR); /* TODO strict check? */
+      continue;
     default:
       print_error("Unknown compressed flag 0x%02x\n", compressed_flag);
       goto cleanup;
Thanks for opening the issue, and sorry for not replying for a long time.
If you are still willing, could you make a pull request?
Could you validate file headers? The original implementation does (here).
Could you also fix the indentation width? This file uses two spaces for indentation.
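A strict check along these lines might look like the sketch below. It is only an illustration under stated assumptions: the magic is taken to be the 7 bytes "snappy\0", as implied by the 6 remaining bytes (nappy\0) mentioned in the issue, and `MAGIC`, `MAGIC_LEN`, and `check_rest_of_magic` are hypothetical names, not identifiers from snzip.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: the snappy-in-java magic is the 7 bytes "snappy\0". */
#define MAGIC_LEN 7
static const char MAGIC[MAGIC_LEN] = {'s', 'n', 'a', 'p', 'p', 'y', '\0'};

/*
 * Call after the first byte ('s') of a possible concatenated header has
 * already been consumed. Reads the remaining MAGIC_LEN - 1 bytes and
 * compares them with "nappy\0".
 * Returns 1 on a match, 0 on a mismatch or a short read.
 */
static int check_rest_of_magic(FILE *infp)
{
  char rest[MAGIC_LEN - 1];

  if (fread(rest, 1, sizeof(rest), infp) != sizeof(rest)) {
    return 0;
  }
  return memcmp(rest, MAGIC + 1, sizeof(rest)) == 0;
}
```

In the patch, the `case 's':` branch could call such a helper in place of the bare `fseek` and fall through to the existing error path when it returns 0, so that a stray 0x73 byte is still reported as an unknown flag.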