Skip to content

Latest commit

 

History

History
81 lines (68 loc) · 3.48 KB

0140-2023-09-02.md

File metadata and controls

81 lines (68 loc) · 3.48 KB

2 Sep 2023

Previous journal: Next journal:
0139-2023-09-01.md 0141-2023-09-03.md

raybox-zero

Quick update for test005 run last night: Shared multiplier is a winner?

test005: Trying shared multiplier for trackInit

Click for details...

Code:

  • tt04-raybox-zero: 70f1a6c: harden_test: Minor update to change parameter order
    • Equivalent to: 7aae611: Wire up SPI for fixed pov
  • src/raybox-zero: 69f4dca: test005: Trying shared multiplier for trackInit

Summary:

  • test005 branches off test002 (Registered visualWallDist avoids post-subtraction) because this one seems to have the best performance so far (even if by a tiny margin).
  • Where we had this:
    wire `F2 trackInitX = stepDistX * partialX;
    wire `F2 trackInitY = stepDistY * partialY;
    ...it is now a single shared multiplier, driven by state, i.e. with mutiplicand and multiplier inputs selected by a mux on state...
    wire `F mul_in_a = (state==TracePrepX) ? stepDistX : stepDistY;
    wire `F mul_in_b = (state==TracePrepX) ?  partialX :  partialY;
    wire `F2 mul_out = mul_in_a * mul_in_b;
    ...and result being written to trackDistX and trackDistY respectively:
    TracePrepX:
        ...
        trackDistX <= `FF(mul_out);
        ...
    TracePrepY:
        ...
        trackDistY <= `FF(mul_out);
        ...

Options used:

  STARTED: 2023-09-01 22:39:33
    STOPT: 0
  OUTFILE: stats-test005.md
   SELECT: :[1245]
    FORCE: 0
      TAG: test005: Trying shared multiplier for trackInit
 FINISHED: 2023-09-01 23:16:12
4x2:1 8x2:1 4x2:2 8x2:2 4x2:3 8x2:3 4x2:4 8x2:4 4x2:5 8x2:5
suggested_mhz 24.39 25.00 47.62 50.00 0.00 0.00 45.52 44.80 25.00 25.00
utilisation_% 60.38 29.97 60.38 29.97 0.00 0.00 49.08 24.36 48.86 24.25
wire_length_um 0 258,488 0 260,097 0 0 181,794 190,478 180,108 174,155
TotalCells 11,037 36,469 11,037 36,491 0 0 20,672 35,751 20,509 35,604
cells_pre_abc 12,756 12,756 12,756 12,756 0 0 12,756 12,756 12,756 12,756
synth_cells 8,717 8,717 8,717 8,717 0 0 7,302 7,302 7,302 7,302
logic_cells 8,717 9,266 8,717 9,286 0 0 7,842 7,874 7,823 7,813
pin_antennas -1 2 -1 4 0 0 4 3 2 3
net_antennas -1 2 -1 3 0 0 3 3 2 3
spef_wns 0.00 0.00 0.00 0.00 0.00 0.00 -1.97 -2.32 0.00 0.00
spef_tns 0.00 0.00 0.00 0.00 0.00 0.00 -107.26 -128.96 0.00 0.00

Findings:

  • 4x2:4 fits, but doesn't reach 50MHz (gets closer though)
  • Speed isn't necessarily better, but area is much better at 48.86% for 4x2:5
  • Overall, shared multiplier seems to be a winner, and makes a notable difference for just reducing from 2 to 1.

Next Steps

  • Try texture mapping (i.e. bring back tex, and set up other addends) using shared multiplier too.
  • See if we can do something with shared multiplier for reciprocal too.
  • Try reducing size of shared multiplier.