Ah, interesting idea. So if I understand correctly, there will still be
four reads, but the overreads could be directed somewhere safe, making
the interface a bit nicer.
I see two downsides:
1. Those reads are now each dependent on an offset table lookup, which
depends on the length. Currently the reads can be performed even before
the length is known.
2. In theory, on alignment-relaxed architectures like x86, the compiler
could currently coalesce those four reads into a single 4-byte read.
Using dynamic offsets would prevent this. However, neither GCC nor Clang
currently seems to take this approach anyway.
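To make point 2 concrete, here is a rough sketch of what a coalesced read could look like next to the four separate byte reads. The names `load32` and `load32_le` are made up for illustration and are not from the decoder; the sketch assumes all four bytes are readable.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: one 4-byte load. memcpy avoids alignment and aliasing
   problems, and compilers typically lower it to a single load on x86. */
static uint32_t load32(const unsigned char *s)
{
    uint32_t w;
    memcpy(&w, s, sizeof w);
    return w;
}

/* Hypothetical: four fixed-offset byte reads assembled in little-endian
   order. Because the offsets are constants, a compiler is free to merge
   these into one load; table-driven offsets would rule that out. */
static uint32_t load32_le(const unsigned char *s)
{
    return (uint32_t)s[0]
         | (uint32_t)s[1] << 8
         | (uint32_t)s[2] << 16
         | (uint32_t)s[3] << 24;
}
```

On a little-endian target the two functions return the same value.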
At the hardware level, no read touches less than a cache line anyway, so consuming the first byte to calculate the length and then using a table (or some masking technique) would be fine, though it's hard to say how it will perform in practice.
It's documented behavior, so not strictly a bug. However, you could avoid the overreads by computing the offsets from a table; there are only a few.
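A minimal sketch of the table idea, not the decoder's actual code: `utf8_len`, `safe_off`, and `load4_safe` are hypothetical names, and the extra `n` parameter (bytes remaining, at least 1) is an assumption added here so that truncated input is also covered. There are still four reads, but any read past the end of the sequence is redirected to byte 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: sequence length from the top 5 bits of the lead byte.
   0 marks a continuation byte or an invalid lead byte. */
static const unsigned char utf8_len[32] = {
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 3, 3, 4, 0
};

/* safe_off[len][i] is the offset used for the i-th read. Reads that
   would land past the sequence are pointed back at offset 0. */
static const unsigned char safe_off[5][4] = {
    {0, 0, 0, 0},  /* invalid lead byte: only touch s[0] */
    {0, 0, 0, 0},
    {0, 1, 0, 0},
    {0, 1, 2, 0},
    {0, 1, 2, 3},
};

/* Read four bytes for decoding without touching anything past s[n-1]. */
static void load4_safe(const unsigned char *s, size_t n, unsigned char out[4])
{
    assert(n > 0);
    size_t len = utf8_len[s[0] >> 3];
    if (len > n)
        len = n;  /* truncated sequence: stay inside the buffer */
    const unsigned char *off = safe_off[len];
    out[0] = s[off[0]];
    out[1] = s[off[1]];
    out[2] = s[off[2]];
    out[3] = s[off[3]];
}
```

The redirected bytes are garbage from the decoder's point of view, so it would still have to mask them out and report truncated or invalid input itself. Each read now depends on the length lookup, which is the dependency raised in the reply above.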