Previous journal: | Next journal: |
---|---|
0062-2023-04-17.md | 0064-2023-04-21.md |
- Division in Verilog - Expect Q8.8 division to take 24 cycles.
- Start with a simple (even naive) divider, to prove we can get things to work, then see if we can optimise later...
- Iterative division and multiplication could be short-circuited for dividends with leading zeros?
- Is there a fast reciprocal we can use, even if approximated?
- https://github.com/ameetgohil/reciprocal-sv
- https://www.shironekolabs.com/posts/efficient-approximate-square-roots-and-division-in-verilog/
- https://witscad.com/course/computer-architecture/chapter/fixed-point-arithmetic-division
- https://www.coranac.com/tonc/text/fixed.htm
- https://homepage.cs.uiowa.edu/~dwjones/bcd/divide.html
- https://stackoverflow.com/a/25765597
- https://blog.segger.com/algorithms-for-division-part-4-using-newtons-method/
- https://stackoverflow.com/questions/6286450/inverse-sqrt-for-fixed-point
- Also look for optimisations that involve known short-cuts.
- Is there a successive-approximation we could use with a LUT?
- If we use a local memory (i.e. buffer trace data inside the design instead of external RAM) then we get a few benefits:
- Faster
- Can be any topology we like, hence maybe simpler for our design's specific needs
- Fewer external parts (and IO) required
- A local memory could be regular RAM (e.g. OpenRAM), DFF array, or maybe best: We could use a FIFO memory for this design, since individual addressing is not needed and already-used values can be discarded.
- If we use an external memory, this can still work. We just need to make sure:
- We store and retrieve enough bits in parallel for all of the different types of data we need (e.g. height/distance, texture, texel, side).
- We separate reads (visible area) from writes (processing time during VBLANK).
- Access is fast enough. Writing is not so bad, but assuming full 640x resolution, we need to read on every clock at 25MHz, i.e. 40ns.
- Is fixed-point or floating-point actually better for our hardware?
- In SystemVerilog:
- What is
logic
andwire logic
vs. justwire
? always_comb
can (must) do direct assignment (=
), vs.always_ff
requires (<=
).
- What is
- Affine matrix transformation: https://www.coranac.com/tonc/text/affine.htm