Java has some built-in features for parsing binary data (for instance ByteBuffer), but sometimes it is necessary to work at the bit level and to describe binary structures through a DSL (domain-specific language). I was impressed by the Python Struct package and wanted something like that for Java, so I developed the JBBP library.
- 3.0.1 (24-dec-2024)
  - added MSB0_DIRECT bit order mode, MSB0 without data reversal #46
  - added JBBPBitInputStream#isDetectedPartlyReadBitField to check whether only part of a bit field was read during the last operation
  - added a flag to the JBBPBitInputStream constructors to force returning -1 instead of partly accumulated bit data if end of field
  - added
- 3.0.0 (16-nov-2024)
  - Minimum JDK Version: Updated to 11.0.
  - Minimum Supported Android: Updated to 12 (API 32).
  - API Changes: Modifications made to the CompiledBlockVisitor API.
  - New Feature: Added JBBPUtils#findMaxStaticArraySize for calculating the largest static array size defined in a JBBP script.
  - Internal API: Certain internal APIs have been opened.
  - Codebase Improvements: General refactoring performed.
- 2.1.0 (05-nov-2024)
The framework is published in Maven Central and can be added as a dependency:
<dependency>
  <groupId>com.igormaznitsa</groupId>
  <artifactId>jbbp</artifactId>
  <version>3.0.1</version>
</dependency>
The precompiled library jar, javadoc and sources can also be downloaded directly from Maven Central.
The library is easy to use because in many cases only two of its classes are needed: com.igormaznitsa.jbbp.JBBPParser (for data parsing) and com.igormaznitsa.jbbp.io.JBBPOut (for writing binary blocks). Both classes work on top of the low-level IO classes com.igormaznitsa.jbbp.io.JBBPBitInputStream and com.igormaznitsa.jbbp.io.JBBPBitOutputStream; these bit stream classes are the core of the library.
The easiest use case parses a whole byte array into bits.
byte[] parsedBits = JBBPParser.prepare("bit:1 [_];")
    .parse(new byte[] {1, 2, 3, 4, 5})
    .findFieldForType(JBBPFieldArrayBit.class)
    .getArray();
At first this was the only functionality, but I found that it was not a very comfortable way to get the result, so I added mapping of the parsed result to a pre-instantiated object. It works slower because it uses a lot of Java reflection, but it is much easier in some cases.
class Parsed {
  @Bin(type = BinType.BIT_ARRAY)
  byte[] parsed;
}
Parsed parsedBits = JBBPParser.prepare("bit:1 [_] parsed;")
    .parse(new byte[] {1, 2, 3, 4, 5})
    .mapTo(new Parsed());
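To make the shape of the `bit:1 [_]` result concrete, here is a minimal plain-Java sketch that expands a byte array into one-bit values. It does not use JBBP at all; the class name `BitsDemo` is hypothetical, and the lowest-bit-first order is an assumption based on the LSB0 default mentioned in this document.

```java
import java.util.Arrays;

// Hypothetical helper, not part of JBBP: shows what "one array element per bit" means.
final class BitsDemo {

    /** Expands every byte into eight one-bit values, lowest bit first (assumed LSB0 order). */
    static byte[] toBits(byte[] data) {
        byte[] bits = new byte[data.length * 8];
        for (int i = 0; i < data.length; i++) {
            for (int bit = 0; bit < 8; bit++) {
                bits[i * 8 + bit] = (byte) ((data[i] >> bit) & 1);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        byte[] bits = toBits(new byte[] {1, 2, 3, 4, 5});
        System.out.println(bits.length); // 40
        System.out.println(Arrays.toString(Arrays.copyOf(bits, 8))); // [1, 0, 0, 0, 0, 0, 0, 0]
    }
}
```

Five input bytes give forty elements, each holding 0 or 1.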
I mainly developed the library to help with my ZX-Spectrum emulator, where I needed to work with data snapshots containing bit-level data. That work did not need much performance. But since version 1.3.0 the library can generate Java classes from JBBP scripts, and such classes work about five times faster than the dynamic parsing and mapping approaches.
The chart compares the speed of the three provided ways to parse data with JBBP:
- Dynamic - basic parsing through interpretation of a prepared JBBP DSL script. It is not so fast, but it makes it possible to generate parsers on the fly from a text description.
- Dynamic + map to class - parsing through interpretation of a prepared JBBP script, then mapping the parsed data to a pre-instantiated class instance. It is a comfortable way to work with data and get the result, but it uses a lot of Java reflection features and so is not so fast.
- Static class - the fastest way to use JBBP: a JBBP script is translated into a Java class. There is no interpretation or reflection, so it is very fast. You can take a look at the auxiliary class which I use in tests.
Since version 1.3.0 the library provides a Java source generator for JBBP scripts (keep in mind that the generated sources still depend on the JBBP library, which they need in order to work). For instance, the snippet below generates Java classes from a JBBP script. It can also generate multiple classes.
JBBPParser parser = JBBPParser.prepare("byte a; byte b; byte c;");
List<ResultSrcItem> generated = parser.convertToSrc(TargetSources.JAVA, "com.test.jbbp.gen.SomeClazz");
for (ResultSrcItem i : generated) {
  for (Map.Entry<String, String> j : i.getResult().entrySet()) {
    System.out.println("Class file name " + j.getKey());
    System.out.println("Class file content " + j.getValue());
  }
}
There are also plug-ins for both Maven and Gradle that generate sources from JBBP scripts during the generate-sources phase.
In Maven it can be used through this snippet:
<plugin>
  <groupId>com.igormaznitsa</groupId>
  <artifactId>jbbp-maven-plugin</artifactId>
  <version>3.0.1</version>
  <executions>
    <execution>
      <id>gen-jbbp-src</id>
      <goals>
        <goal>generate</goal>
      </goals>
    </execution>
  </executions>
</plugin>
By default the Maven plug-in looks for files with the jbbp extension in the src/jbbp folder of the project (this can be changed through the plug-in configuration) and writes the resulting Java classes into the target/generated-sources/jbbp folder. For instance, I use this approach in my ZX-Poly emulator.
The example below shows how to parse a byte stream written in non-standard MSB0 order (Java has LSB0 bit order) into bit fields, then print their values and pack the fields back:
class Flags {
  @Bin(order = 1, name = "f1", type = BinType.BIT, bitNumber = JBBPBitNumber.BITS_1, comment = "It's flag one")
  byte flag1;
  @Bin(order = 2, name = "f2", type = BinType.BIT, bitNumber = JBBPBitNumber.BITS_2, comment = "It's second flag")
  byte flag2;
  @Bin(order = 3, name = "f3", type = BinType.BIT, bitNumber = JBBPBitNumber.BITS_1, comment = "It's 3th flag")
  byte flag3;
  @Bin(order = 4, name = "f4", type = BinType.BIT, bitNumber = JBBPBitNumber.BITS_4, comment = "It's 4th flag")
  byte flag4;
}
final int data = 0b10101010;
Flags parsed = JBBPParser.prepare("bit:1 f1; bit:2 f2; bit:1 f3; bit:4 f4;", JBBPBitOrder.MSB0)
    .parse(new byte[] {(byte) data})
    .mapTo(new Flags());
assertEquals(1, parsed.flag1);
assertEquals(2, parsed.flag2);
assertEquals(0, parsed.flag3);
assertEquals(5, parsed.flag4);
System.out.println(new JBBPTextWriter().Bin(parsed).Close().toString());
assertEquals(data, JBBPOut.BeginBin(JBBPBitOrder.MSB0).Bin(parsed).End().toByteArray()[0] & 0xFF);
The example prints the text below to the console:
;--------------------------------------------------------------------------------
; START : Flags
;--------------------------------------------------------------------------------
01; f1, It's flag one
02; f2, It's second flag
00; f3, It's 3th flag
05; f4, It's 4th flag
;--------------------------------------------------------------------------------
; END : Flags
;--------------------------------------------------------------------------------
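The values 1, 2, 0 and 5 in that output can be reproduced by hand. The plain-Java sketch below assumes that MSB0 mode is equivalent to reversing the bits of each byte and then taking fields from the lowest bit upward; this is an interpretation that matches the numbers above, not a description of the library internals. `Msb0Demo` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical illustration, independent of JBBP.
final class Msb0Demo {

    /** Reverses the bit order inside one byte, so the most significant bit becomes bit 0. */
    static int reverseByte(int value) {
        return Integer.reverse(value & 0xFF) >>> 24;
    }

    /** Splits a byte into fields of the given widths, taking the lowest bits first. */
    static int[] readFields(int value, int... widths) {
        int[] result = new int[widths.length];
        int rest = value & 0xFF;
        for (int i = 0; i < widths.length; i++) {
            result[i] = rest & ((1 << widths[i]) - 1);
            rest >>>= widths[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int data = 0b10101010;
        // Field widths follow the script "bit:1 f1; bit:2 f2; bit:1 f3; bit:4 f4;".
        int[] fields = readFields(reverseByte(data), 1, 2, 1, 4);
        System.out.println(Arrays.toString(fields)); // [1, 2, 0, 5]
    }
}
```

Without the reversal step the same widths applied to 0b10101010 would give 0, 1, 1 and 10, which is why the bit order has to be stated when the parser is prepared.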
Each field can have a case-insensitive name, which must not contain '.' (the dot is reserved for links to structure field values) or '#' (it is reserved for internal library use). A field name must not start with a digit or with the characters '$' and '_'. Keep in mind that field names are case insensitive!
int someNamedField;
byte field1;
byte field2;
byte field3;
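The naming rules can be summarized as a small check. This is only a sketch of the rules as stated in this document; the real parser may apply further restrictions, and `FieldNames`, `isValid` and `normalize` are hypothetical names. Lower-casing for comparison is one assumed way to model case insensitivity.

```java
import java.util.Locale;

// Hypothetical helper, not part of JBBP: encodes only the rules listed in the text.
final class FieldNames {

    /** Returns true if the name passes the documented restrictions. */
    static boolean isValid(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        char first = name.charAt(0);
        if (Character.isDigit(first) || first == '$' || first == '_') {
            return false; // must not start with a digit, '$' or '_'
        }
        // '.' is reserved for paths, '#' for internal library use
        return name.indexOf('.') < 0 && name.indexOf('#') < 0;
    }

    /** Folds a name to lower case so that names differing only in case compare equal. */
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(isValid("someNamedField")); // true
        System.out.println(isValid("1field"));         // false
        System.out.println(isValid("struct.field"));   // false
        System.out.println(normalize("Field1").equals(normalize("FIELD1"))); // true
    }
}
```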
JBBP supports the full set of Java numeric primitives, plus some extra types like ubyte and bit.
JBBP supports both arrays and structures. In expressions you can link only to field values which have already been read!
It is possible to define processors for custom data types. For instance, you can take a look at the case processing three-byte unsigned integer types.
Since 1.4.0 JBBP supports Java float, double and String values. Because they have a specific format, they are named doublej, floatj and stringj.
If you have data whose internal structure is undefined and variable, you can use the var type to mark such a field and provide a custom processor to read its data. The processor should implement the JBBPVarFieldProcessor interface.
final JBBPParser parser = JBBPParser.prepare("short k; var; int;");
final JBBPIntCounter counter = new JBBPIntCounter();
final JBBPFieldStruct struct = parser.parse(new byte[] {9, 8, 33, 1, 2, 3, 4}, new JBBPVarFieldProcessor() {
  public JBBPAbstractArrayField<? extends JBBPAbstractField> readVarArray(
      JBBPBitInputStream inStream, int arraySize, JBBPNamedFieldInfo fieldName, int extraValue,
      JBBPByteOrder byteOrder, JBBPNamedNumericFieldMap numericFieldMap) throws IOException {
    fail("Must not be called");
    return null;
  }
  public JBBPAbstractField readVarField(
      JBBPBitInputStream inStream, JBBPNamedFieldInfo fieldName, int extraValue,
      JBBPByteOrder byteOrder, JBBPNamedNumericFieldMap numericFieldMap) throws IOException {
    final int value = inStream.readByte();
    return new JBBPFieldByte(fieldName, (byte) value);
  }
}, null);
NB! Some programmers try to use a single parser for complex data; that is a mistake. In such a case it is much better to have several simple parsers working with the same JBBPBitInputStream instance; this keeps the decision points at the Java level and makes the solution easier.
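The idea of keeping decision points in Java can be sketched without the library. In the example below a standard DataInputStream stands in for the shared JBBPBitInputStream, and each small reader method plays the role of one simple parser. The record layout (a type byte followed by a point or a short text) is invented purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: several small readers share one stream,
// and plain Java code decides which of them runs next.
final class SplitParsingDemo {

    static List<String> readAll(byte[] data) {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int type = in.readUnsignedByte(); // decision point stays in Java
                switch (type) {
                    case 1: records.add(readPoint(in)); break;
                    case 2: records.add(readText(in)); break;
                    default: throw new IOException("Unknown record type " + type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    /** Simple reader number one: two big-endian shorts. */
    private static String readPoint(DataInputStream in) throws IOException {
        return "point " + in.readShort() + "," + in.readShort();
    }

    /** Simple reader number two: a length byte followed by ASCII characters. */
    private static String readText(DataInputStream in) throws IOException {
        byte[] chars = new byte[in.readUnsignedByte()];
        in.readFully(chars);
        return "text " + new String(chars, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = {1, 0, 3, 0, 4, 2, 2, 'h', 'i'};
        System.out.println(readAll(data)); // [point 3,4, text hi]
    }
}
```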
Special types perform actions to skip data in the input stream.
Multi-byte types can be read with different byte orders.
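What a byte order change means for a multi-byte value can be shown with the standard ByteBuffer class. This sketch does not show JBBP script syntax; `ByteOrderDemo` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Plain-Java illustration of reading the same four bytes in two byte orders.
final class ByteOrderDemo {

    static int readInt(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] data = {0x12, 0x34, 0x56, 0x78};
        System.out.println(Integer.toHexString(readInt(data, ByteOrder.BIG_ENDIAN)));    // 12345678
        System.out.println(Integer.toHexString(readInt(data, ByteOrder.LITTLE_ENDIAN))); // 78563412
    }
}
```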
Expressions are used to calculate the length of arrays. They allow brackets and integer operators, which work similarly to the Java operators:
- Arithmetic operators: +, -, *, /, %
- Bit operators: &, |, ^, ~
- Shift operators: <<, >>, >>>
- Brackets: (, )
Inside an expression you can use integer numbers and named field values, referenced by name (for fields from the same structure) or by path. Keep in mind that you can't use array fields or fields placed inside structure arrays.
int field1;
struct1 {
  int field2;
}
byte [field1+struct1.field2] data;
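Written by hand in plain Java, the script above corresponds roughly to the following sketch. It assumes big-endian integers (the DataInputStream default) and is only meant to show how the array length depends on values that were read earlier; `ExpressionSizeDemo` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hand-written equivalent of: int field1; struct1 { int field2; } byte [field1+struct1.field2] data;
final class ExpressionSizeDemo {

    static byte[] readData(byte[] input) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(input))) {
            int field1 = in.readInt();
            int field2 = in.readInt();                 // struct1.field2
            byte[] data = new byte[field1 + field2];   // the size expression
            in.readFully(data);
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] input = {0, 0, 0, 1, 0, 0, 0, 2, 10, 20, 30, 99};
        System.out.println(readData(input).length); // 3
    }
}
```

Both operands must already have been read when the array is reached, which is why expressions can link only to earlier fields.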
You can use comments inside a parser script. The parser supports only one comment format: all text after '//' up to the end of the line is treated as a comment.
int;
// hello commentaries
byte field;
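The comment rule is simple enough to imitate with one regular expression. The sketch below is not how the JBBP parser is implemented; it only demonstrates the rule "everything from '//' to the end of the line is ignored". `ScriptComments` is a hypothetical name.

```java
// Hypothetical helper, not part of JBBP.
final class ScriptComments {

    /** Removes every '//' comment up to, but not including, the line break. */
    static String strip(String script) {
        return script.replaceAll("//[^\\n]*", "");
    }

    public static void main(String[] args) {
        String script = "int;\n// hello commentaries\nbyte field;";
        System.out.println(strip(script).equals("int;\n\nbyte field;")); // true
    }
}
```

This also shows why a script written as one concatenated Java string needs an explicit "\n" after a comment: without it the comment would swallow the rest of the script.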
Inside an expression you can use field names and field paths. You can also use the special macro '$$', which represents the current input stream byte counter. All fields whose names start with '$' are recognized by the parser as special user-defined variables, and it requests their values from a special user-defined provider. If the array size contains only the '_' symbol, the field or structure has no defined size and the whole stream is read.
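As a rough model of these two features, the sketch below wraps a stream in a byte counter (the role played by '$$') and then reads everything that is left (the role played by '_'). Both classes are hypothetical and use only the standard library; the sketch assumes Java 9 or later for `readAllBytes`.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a stream that knows how many bytes it has delivered.
final class CountingInputStream extends FilterInputStream {
    private long counter;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int value = super.read();
        if (value >= 0) {
            counter++;
        }
        return value;
    }

    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int count = super.read(buffer, offset, length);
        if (count > 0) {
            counter += count;
        }
        return count;
    }

    long getCounter() {
        return counter;
    }
}

final class CounterDemo {

    /** Returns {counter after a 2-byte header, size of the rest, final counter}. */
    static long[] run(byte[] data) {
        try (CountingInputStream in = new CountingInputStream(new ByteArrayInputStream(data))) {
            in.read();
            in.read();
            long afterHeader = in.getCounter(); // what '$$' would report at this point
            byte[] rest = in.readAllBytes();    // like an array sized with '_'
            return new long[] {afterHeader, rest.length, in.getCounter()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = run(new byte[] {1, 2, 3, 4, 5});
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // 2 3 5
    }
}
```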
The result of parsing is an instance of the com.igormaznitsa.jbbp.model.JBBPFieldStruct class, which represents the invisible root structure of the parsed data; you can use its methods to find the desired fields by name, path or class. All fields are successors of the com.igormaznitsa.jbbp.model.JBBPAbstractField class. For more comfort, it is easier to use mapping to classes, where the mapper automatically places values into the fields of a Java class.
The example below shows how to parse a PNG file with the JBBP parser:
final InputStream pngStream = getResourceAsInputStream("picture.png");
try {
  final JBBPParser pngParser = JBBPParser.prepare(
      "long header;"
          + "// chunks\n"
          + "chunk [_]{"
          + "   int length; "
          + "   int type; "
          + "   byte[length] data; "
          + "   int crc;"
          + "}"
  );
  JBBPFieldStruct result = pngParser.parse(pngStream);
  assertEquals(0x89504E470D0A1A0AL, result.findFieldForNameAndType("header", JBBPFieldLong.class).getAsLong());
  JBBPFieldArrayStruct chunks = result.findFieldForNameAndType("chunk", JBBPFieldArrayStruct.class);
  String[] chunkNames = new String[] {"IHDR", "gAMA", "bKGD", "pHYs", "tIME", "tEXt", "IDAT", "IEND"};
  int[] chunkSizes = new int[] {0x0D, 0x04, 0x06, 0x09, 0x07, 0x19, 0x0E5F, 0x00};
  assertEquals(chunkNames.length, chunks.size());
  for (int i = 0; i < chunks.size(); i++) {
    assertChunk(chunkNames[i], chunkSizes[i], (JBBPFieldStruct) chunks.getElementAt(i));
  }
} finally {
  closeResource(pngStream);
}
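For comparison, the same chunk layout (signature, then repeated length, type, data and CRC) can be walked by hand with the standard DataInputStream. This sketch does not use JBBP; it builds a tiny two-chunk sample in memory with zero placeholder CRC values, so it is not a valid image, only a valid chunk sequence. `PngChunksDemo` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hand-written walk over "long header; chunk [_]{ int length; int type; byte[length] data; int crc; }".
final class PngChunksDemo {
    static final long PNG_SIGNATURE = 0x89504E470D0A1A0AL;

    /** Returns "TYPE:length" for every chunk that follows the 8-byte signature. */
    static List<String> listChunks(byte[] png) {
        List<String> chunks = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(png))) {
            if (in.readLong() != PNG_SIGNATURE) {
                throw new IOException("Not a PNG signature");
            }
            while (in.available() > 0) {
                int length = in.readInt();
                byte[] type = new byte[4];
                in.readFully(type);
                in.readFully(new byte[length]); // chunk data, ignored here
                in.readInt();                   // CRC, not verified in this sketch
                chunks.add(new String(type, StandardCharsets.US_ASCII) + ":" + length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    /** Builds a minimal sample: signature, a 13-byte IHDR chunk and an empty IEND chunk. */
    static byte[] sample() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(PNG_SIGNATURE);
            writeChunk(out, "IHDR", new byte[13]);
            writeChunk(out, "IEND", new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void writeChunk(DataOutputStream out, String type, byte[] data) throws IOException {
        out.writeInt(data.length);
        out.write(type.getBytes(StandardCharsets.US_ASCII));
        out.write(data);
        out.writeInt(0); // placeholder CRC
    }

    public static void main(String[] args) {
        System.out.println(listChunks(sample())); // [IHDR:13, IEND:0]
    }
}
```

The JBBP script expresses the loop, the length-dependent array and the end-of-stream condition in a few declarative lines, which is the main point of the DSL.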
It is also possible to map a parsed packet to class fields:
final JBBPParser pngParser = JBBPParser.prepare(
    "long header;"
        + "chunk [_]{"
        + "   int length; "
        + "   int type; "
        + "   byte[length] data; "
        + "   int crc;"
        + "}"
);
class Chunk {
  @Bin int length;
  @Bin int type;
  @Bin byte[] data;
  @Bin int crc;
}
@Bin
class Png {
  long header;
  Chunk[] chunk;
  public Object newInstance(Class<?> klazz) {
    return klazz == Chunk.class ? new Chunk() : null;
  }
}
final Png png = pngParser.parse(pngStream).mapTo(new Png());
The example shows how to parse a TCP frame:
final JBBPParser tcpParser = JBBPParser.prepare(
    "skip:34; // skip bytes till the frame\n"
        + "ushort SourcePort;"
        + "ushort DestinationPort;"
        + "int SequenceNumber;"
        + "int AcknowledgementNumber;"
        + "bit:1 NONCE;"
        + "bit:3 RESERVED;"
        + "bit:4 HLEN;"
        + "bit:1 FIN;"
        + "bit:1 SYN;"
        + "bit:1 RST;"
        + "bit:1 PSH;"
        + "bit:1 ACK;"
        + "bit:1 URG;"
        + "bit:1 ECNECHO;"
        + "bit:1 CWR;"
        + "ushort WindowSize;"
        + "ushort TCPCheckSum;"
        + "ushort UrgentPointer;"
        + "byte [$$-34-HLEN*4] Option;"
        + "byte [_] Data;"
);
final JBBPFieldStruct result = tcpParser.parse(tcpFrameStream);
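The bit fields in that script can be cross-checked against a manual decode. The plain-Java sketch below reads a few of the same fields with ByteBuffer, taking bits from the lowest one upward within each byte as the script does in its default bit order. It skips the same 34-byte prefix, ignores options and data, and uses an invented sample frame; `TcpHeaderDemo` is a hypothetical name.

```java
import java.nio.ByteBuffer;

// Hand-written decode of a few TCP header fields, independent of JBBP.
final class TcpHeaderDemo {
    final int sourcePort;
    final int destinationPort;
    final int headerLength; // HLEN, counted in 32-bit words
    final boolean fin;
    final boolean syn;
    final boolean ack;

    private TcpHeaderDemo(ByteBuffer buf, int offset) {
        sourcePort = buf.getShort(offset) & 0xFFFF;           // ushort SourcePort
        destinationPort = buf.getShort(offset + 2) & 0xFFFF;  // ushort DestinationPort
        int offsetByte = buf.get(offset + 12) & 0xFF;         // NONCE, RESERVED, HLEN
        int flags = buf.get(offset + 13) & 0xFF;              // FIN .. CWR
        headerLength = offsetByte >>> 4;                      // upper four bits
        fin = (flags & 0x01) != 0;
        syn = (flags & 0x02) != 0;
        ack = (flags & 0x10) != 0;
    }

    static TcpHeaderDemo parse(byte[] frame, int offset) {
        return new TcpHeaderDemo(ByteBuffer.wrap(frame), offset);
    }

    /** Builds an invented frame: 34 zero bytes followed by a 20-byte TCP header with SYN set. */
    static byte[] sampleFrame() {
        ByteBuffer buf = ByteBuffer.allocate(34 + 20);
        buf.position(34);
        buf.putShort((short) 12345);  // source port
        buf.putShort((short) 80);     // destination port
        buf.putInt(1);                // sequence number
        buf.putInt(0);                // acknowledgement number
        buf.put((byte) 0x50);         // HLEN = 5, reserved bits and NONCE are zero
        buf.put((byte) 0x02);         // only SYN is set
        buf.putShort((short) 0xFFFF); // window size
        buf.putShort((short) 0);      // checksum placeholder
        buf.putShort((short) 0);      // urgent pointer
        return buf.array();
    }

    public static void main(String[] args) {
        TcpHeaderDemo header = parse(sampleFrame(), 34);
        System.out.println(header.sourcePort + " -> " + header.destinationPort); // 12345 -> 80
        System.out.println(header.headerLength + " " + header.syn + " " + header.ack); // 5 true false
    }
}
```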
The @Bin annotation is used only for mapping and data writing, but there is a special class, JBBPDslBuilder, which can convert a @Bin-marked class into a JBBP script, for instance:
JBBPDslBuilder.Begin().AnnotatedClass(SomeBinAnnotatetClass.class).End(true);
No problem! The JBBP parser works on top of the com.igormaznitsa.jbbp.io.JBBPBitInputStream class, which can be used directly and allows you to read bits and bytes, count bytes and align data. For writing there is a similar class, JBBPBitOutputStream.
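To give a feeling for what such a bit stream does, here is a deliberately tiny reader written from scratch. It is not JBBPBitInputStream and does not reproduce its API; `TinyBitReader` and its method names are hypothetical, and the lowest-bit-first order is an assumption matching the LSB0 default mentioned earlier.

```java
// Hypothetical minimal bit reader over a byte array, for illustration only.
final class TinyBitReader {
    private final byte[] data;
    private int bytePos; // index of the byte that provides the next bit
    private int bitPos;  // next bit inside that byte, 0 is the lowest bit

    TinyBitReader(byte[] data) {
        this.data = data.clone();
    }

    /** Reads 1..8 bits, lowest bit first; returns -1 if no bit is available at all. */
    int readBits(int count) {
        if (count < 1 || count > 8) {
            throw new IllegalArgumentException("count must be 1..8");
        }
        int result = 0;
        for (int i = 0; i < count; i++) {
            if (bytePos >= data.length) {
                return i == 0 ? -1 : result; // partly read value at end of data
            }
            result |= ((data[bytePos] >> bitPos) & 1) << i;
            if (++bitPos == 8) {
                bitPos = 0;
                bytePos++;
            }
        }
        return result;
    }

    /** Skips the unread bits of the current byte, if any. */
    void alignByte() {
        if (bitPos != 0) {
            bitPos = 0;
            bytePos++;
        }
    }

    /** Number of bytes completely consumed so far. */
    int getCounter() {
        return bytePos;
    }

    public static void main(String[] args) {
        TinyBitReader reader = new TinyBitReader(new byte[] {(byte) 0b10110101, 0x7F});
        System.out.println(reader.readBits(3)); // 5
        System.out.println(reader.readBits(2)); // 2
        reader.alignByte();
        System.out.println(reader.readBits(8)); // 127
        System.out.println(reader.getCounter()); // 2
    }
}
```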
The library provides a special helper, JBBPOut. The helper allows you to generate binary blocks and provides a kind of DSL:
import static com.igormaznitsa.jbbp.io.JBBPOut.*;
...
final byte[] array =
    BeginBin().
        Bit(1, 2, 3, 0).
        Bit(true, false, true).
        Align().
        Byte(5).
        Short(1, 2, 3, 4, 5).
        Bool(true, false, true, true).
        Int(0xABCDEF23, 0xCAFEBABE).
        Long(0x123456789ABCDEF1L, 0x212356239091AB32L).
        End().toByteArray();