24 November 2022
On Aug 3, 2022 efforts began to implement mqtt-v5.0 in Go, resulting in packages gregoryv/mq and gregoryv/tt. This article documents that work and the design decisions taken along the way.
Background §
In a world of connected devices or internet of things (IoT), the MQTT protocol is used for device to cloud communication. Its compact size makes it effective for asynchronous telemetry messaging.
In one project difficulties were encountered and I needed to learn more about the protocol details. While digging into the specification I realized there were too many abstractions on top of the wire format and I needed a way to simplify things. The wire format is concrete, whereas the behavior of clients and servers contains optional capabilities. As such, maybe writing your own client, with only the things you need, is simpler than using a generic one?
The specification is well written with requirements clearly stated. You can divide the specification into two main areas, (1) wire format and (2) behavior of clients and servers. Looking into alternatives I found that some packages, huin/mqtt, surgemq/message or eclipse/paho, do provide the wire format features, though most are for mqtt-v3.1.
I could have opted to use eclipse/paho, which has support for mqtt-v5, though I also wanted to experience and learn more about the process of implementing a package according to the specification, something you rarely get to do these days.
Putting other projects on hold, I decided to do this thing and set my goals accordingly.
Goal §
Provide an mqtt-v5.0 Go package for writing clients and servers.
Its design aims for
- Correctness - difficult to make protocol mistakes
- Performance - save the environment, conserve power
- Simplicity - optional abstractions
With these goals in mind approaching the solution is a tricky thing, though rewarding and fun.
Approach §
The entire specification is 137 pages. This is quite a lot of information to read before actually starting anything. Luckily the first section is terminology. It provides the necessary vocabulary and a general feel for what concepts are large and which are small.
Having poor experience with a top-down development approach, I looked for the smallest concepts to implement first, like constants and error codes. However, these are spread throughout the specification, so I stopped after a while and decided to move on with the first control packet, connect.
Representing the control packet with public fields felt idiomatic. This approach is used in other packages and I was inclined to follow suit. However, I already knew that in some cases the fields were related. Setting them from the outside would make it hard to fulfill the Correctness goal. So I went with the getter/setter approach, hiding all the fields.
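To illustrate why hidden fields help with related values, here is a minimal sketch. The type and method names are hypothetical and not taken from gregoryv/mq. It relies only on the fact that bit 7 of the connect flags signals the presence of a user name; clearing the flag for an empty name is a simplification made for this example.

```go
package main

import "fmt"

// usernameFlag is bit 7 of the connect flags byte.
const usernameFlag byte = 0x80

// Connect is a hypothetical, trimmed-down control packet. The fields
// are unexported so that related values can only change together.
type Connect struct {
	flags    byte   // connect flags as sent on the wire
	username []byte // only meaningful when usernameFlag is set
}

// SetUsername stores the name and keeps the flag in sync with it.
// Assumption for this sketch: an empty name means "no user name".
func (c *Connect) SetUsername(v string) {
	c.username = []byte(v)
	if len(v) > 0 {
		c.flags |= usernameFlag
	} else {
		c.flags &^= usernameFlag
	}
}

// Username returns the stored user name.
func (c *Connect) Username() string { return string(c.username) }

// HasFlag reports whether all bits in f are set.
func (c *Connect) HasFlag(f byte) bool { return c.flags&f == f }

func main() {
	var c Connect
	c.SetUsername("john")
	fmt.Println(c.Username(), c.HasFlag(usernameFlag))
	c.SetUsername("")
	fmt.Println(c.HasFlag(usernameFlag))
}
```

With public fields a caller could set the name and forget the flag, or the other way around; with a setter that mistake cannot be expressed.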
With paho's implementation available, I opted early on to write benchmark tests comparing my approach to theirs. Benchmarks of the initial implementation against paho's showed poor results in several areas, so I needed a redesign.
Design §
At this point my design was limited to the performance of converting control packets to and from the wire format. But I really wanted a design that was at least on par with paho's, performance-wise. The hidden fields with getter and setter methods had no effect on performance, so they stayed. A lot of effort went into designing the wire types described in the specification in an efficient yet readable way.
Reading and writing packets is deterministic, as the length is provided. This trait is used in all the wire types to minimize allocations. Once the performance was adequate, the remaining packets were implemented fairly quickly.
The module started out as mqtt, the obvious choice, which initially worked fine. Once the control packet types were implemented, focus shifted to clients and servers. The amount of code in package mqtt was already quite large, so I went with a subdirectory mqtt/x and later renamed it to mqtt/proto. The more I worked on client behavior, the more the naming felt wrong. Beyond the naming, the day-to-day work in a subdirectory of the package was not optimal, at least not in my setup. I want to quickly select the project I work on and stay in that directory for most of the time. This led to another round of package renaming. Finally I decided it was time to split the packages
- gregoryv/mq
- gregoryv/tt
Short and concise names that are related but do not have to be, i.e. someone else may want to write a generic client of sorts using gregoryv/mq. The packages also reflect the two major areas in the specification: (1) wire format and (2) behavior of clients and servers.
Performance §
Before I go on, let me first say thank you to the eclipse/paho developers for their great work. I hope these results may give ideas for improving their already great package.
Initial comparison was on creation, writing and reading as separate tests.
    goos: linux
    goarch: amd64
    pkg: github.com/gregoryv/mq
    cpu: Intel(R) Xeon(R) E-2288G CPU @ 3.70GHz
    BenchmarkConnect/make/our      15082816     77.58 ns/op     24 B/op    3 allocs/op
    BenchmarkConnect/make/their     3935006    279.30 ns/op    512 B/op    5 allocs/op
    BenchmarkConnect/write/our       483277   2096.00 ns/op     48 B/op    1 allocs/op
    BenchmarkConnect/write/their    2359382    862.40 ns/op    368 B/op   10 allocs/op
    BenchmarkConnect/read/our       1553311    859.40 ns/op    440 B/op    8 allocs/op
    BenchmarkConnect/read/their      549508   2507.00 ns/op   3288 B/op   24 allocs/op
Writing a control packet uses one allocation but is still a lot slower than their version. When reading, the roles are reversed: our version has fewer allocations and is quicker.
Using pprof I found that the slowest part of writing a control packet was writing fields defined as properties. Replacing a loop with direct access yielded quite an improvement:
    BenchmarkConnect/write/our    483277   2096.00 ns/op   48 B/op   1 allocs/op
    ... after ...
    BenchmarkConnect/write/our   7871455    150.6 ns/op    48 B/op   1 allocs/op
For the successful case, BenchmarkAuth comes out in paho's favor, though when including e.g. a reauthenticate with some user properties our implementation is faster. The successful case could probably be optimized even further, though that could affect the reading of other control packets.
At this point I decided to write a more complete benchmark that creates, writes and reads a control packet, which makes for a more reasonable comparison:
    Benchmark/Auth/our        1595908    850 ns/op    296 B/op   18 allocs/op
    Benchmark/Auth/their       396902   5372 ns/op   4208 B/op   43 allocs/op
    Benchmark/Connect/our      675033   1586 ns/op    880 B/op   16 allocs/op
    Benchmark/Connect/their    207224   5237 ns/op   5552 B/op   50 allocs/op
    Benchmark/Publish/our      504354   1990 ns/op    880 B/op   32 allocs/op
    Benchmark/Publish/their    609014   4074 ns/op   4064 B/op   41 allocs/op
The most important packet, Publish, is still slower than paho's. Inlining the creation of packets, as would be done in a real client, should give different results.
    BenchmarkAuth/our        1808374    682 ns/op    264 B/op   17 allocs/op
    BenchmarkAuth/their       513357   4823 ns/op   4208 B/op   43 allocs/op
    BenchmarkConnect/our      785091   1311 ns/op    880 B/op   16 allocs/op
    BenchmarkConnect/their    205426   6685 ns/op   5552 B/op   50 allocs/op
    BenchmarkPublish/our      586962   1974 ns/op    688 B/op   31 allocs/op
    BenchmarkPublish/their    479336   2846 ns/op   4064 B/op   41 allocs/op
Not a huge difference, but still in the right direction.
Finally, let me group the benchmarks related to the publish control packet, which is arguably the control packet that will flow the most between a client and a server.
    BenchmarkPublish/our-16             813364   1667 ns/op    688 B/op   31 allocs/op
    BenchmarkPublish/their-16           459866   6305 ns/op   5792 B/op   43 allocs/op
    BenchmarkPublish/write/our-16      2817781    393 ns/op     80 B/op    1 allocs/op
    BenchmarkPublish/write/their-16    1587978    711 ns/op    472 B/op   10 allocs/op
    BenchmarkPublish/wqos0/our-16      9936145    120 ns/op     24 B/op    1 allocs/op
    BenchmarkPublish/wqos0/their-16    2453695    481 ns/op    408 B/op    9 allocs/op
Conclusion §
Packages gregoryv/mq and gregoryv/tt are still a work in progress though they are already useful.
What did I learn from doing this? Having a specification is great in that there is a source of truth which your implementation should follow. The format of the specification is good, though it could use more fixed anchors, e.g. for referencing directly from the implementation. I opted not to reference sections by what to me look like generated anchor names; those will probably change if the document is regenerated. Such references would be very nice to have both in tests supporting specified requirements and in code documentation explaining concepts.
References §
- mqtt-v5.0 specification
- paho.mqtt.golang implementation
- RFC2119 - Key words for use in RFCs to Indicate Requirement Levels
- huin/mqtt - wire format package for mqtt-v3.1
- surgemq/message - another design