VSD-IAT Workshop on Advanced Physical Design using OPENLANE/SKY130


DAY 1 : Inception of Opensource EDA, OpenLANE and SKY130 PDK

Consider an Arduino board. The design of such a microcontroller board is dealt with in embedded systems design. An important block in this microcontroller is the microprocessor. The design of such microprocessors and other chips is done in the VLSI industry. It involves designing the chip from an abstract logical point of view down to fabricating it on a real semiconductor wafer. This flow is generally known as the RTL to GDSII flow, and that is what we will explore in this project.

Few common VLSI terms

The image shown is of a typical chip

  • DIE - The outermost white border is called the DIE. Silicon wafers are divided into dies, and each die can be an independent chip.
  • PADS - The blue segments just within the die border are called PADS. They contain pins which the chip uses to communicate with the external world.
  • CORE - The central black region is called the CORE. Core is the main part of the chip. It contains all the different functional blocks that handle all the processes the chip is designed to perform.
  • IP - Intellectual Property. It refers to functional blocks designed for a specific purpose.
  • Foundry is a semiconductor fabrication plant where devices such as integrated circuits are manufactured. They provide all the necessary files required to design an IC which can be taped out in their plant.
  • PDK - Process Design Kit. It is a collection of files used to model a fabrication process for the EDA tools used to design an IC. It contains process design rules, device models, standard cell libraries, I/O libraries etc.
  • RTL - Register Transfer Level. It is a description of the logical functionality of the design in terms of data transfers between registers, before it is synthesized into a gate-level netlist. It is written in Hardware Description Languages (HDLs) such as Verilog and VHDL.

RTL to GDSII flow

  • SYNTHESIS - Conversion of RTL to a circuit consisting of components from a Standard Cell Library (SCL). A Standard Cell Library is a collection of cells with specific functionality, such as AND gates and OR gates, with a fixed height and a variable width (an integer multiple of a discrete unit called the site width).
  • FLOOR AND POWER PLANNING - Abstract layout of the entire chip is planned.
    • Chip floor planning - Partition of chip die into different blocks and placement of i/o pads.
    • Macro floor planning - Dimensions of the blocks are estimated, locations of pins are decided and rows are defined.
    • Power planning - Locations of power pads and power straps are decided. The power network consists of many upper-layer metal straps arranged in parallel for uniform power distribution across the entire chip.
  • PLACEMENT - Cells are placed on the floorplan constructed in the previous step. Two steps :
    • Global placement - Finding optimal positions for all cells
    • Detailed placement - Positions obtained from global placement are further optimized
  • CLOCK TREE SYNTHESIS - Creation of a clock distribution network that delivers the clock to all the sequential elements with minimum skew and a clean waveform. Usually implemented as an H-tree.
  • ROUTING - Using available metal layers to interconnect the cells and blocks.
    • Global Routing - Routing guides are generated
    • Detailed Routing - Uses routing guides to implement actual wiring
  • SIGN OFF - Various verifications are performed
    • Physical verifications - Design Rule Checks (DRC) and Layout vs Schematic (LVS)
    • Timing verifications - Static Timing Analysis (STA)
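
To make the clock tree synthesis goal in the flow above concrete: skew is the difference between the latest and the earliest clock arrival time at the sequential elements. The following is a minimal Python sketch; the flip-flop names and arrival times are hypothetical and not taken from any real design.

```python
def clock_skew(arrival_times):
    """Return the skew: latest minus earliest clock arrival time."""
    if not arrival_times:
        raise ValueError("no clock arrival times given")
    return max(arrival_times.values()) - min(arrival_times.values())

# Hypothetical clock arrival times (in ns) at three flip-flops.
arrivals_ns = {"ff1": 1.0, "ff2": 1.25, "ff3": 1.5}

skew_ns = clock_skew(arrivals_ns)
print(f"Clock skew: {skew_ns} ns")
```

A balanced structure such as an H-tree aims to make these arrival times nearly equal, which drives the skew towards zero.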

About Openlane

Openlane is an open source flow for a true open source tape-out experience. It combines various open source EDA tools. Its main goal is to produce a clean GDSII without human intervention. Openlane is tuned for the SkyWater 130nm open PDK.

Openlane ASIC flow :

  • RTL Synthesis - Implemented using Yosys and abc
  • Static timing analysis - Implemented using OpenSTA
  • Design for Testability (DFT) - Implemented using Fault
  • Physical implementation - Implemented using OpenROAD. Involves Place and Route(PnR) and Clock Tree Synthesis (CTS)
  • Logical Equivalence Checking (LEC) - Implemented using Yosys. To ensure functional equivalence after netlist is modified during optimizations
  • RC Extraction

LAB 1 : Getting started with OpenLane

Openlane comes with many built-in designs. In this project, we explore the flow with one such design, picorv32a. It is a CPU core, and we will use it to see how each step of the RTL to GDSII flow is implemented in Openlane. A few directory names mentioned in this project might be user specific, but most of them will be the same. For example, the openLANE_flow directory mentioned here is typically named just openlane.

To start openlane, we open the shell in openLANE_flow(openlane) directory and run the command,

./flow.tcl -interactive

Now we import openlane packages specifying its version,

package require openlane 0.9

Next we specify the design that we intend to work on, which is picorv32a in our case,

prep -design picorv32a

This command merges two LEF files and places the result in a new folder inside the directory designs/picorv32a/runs/. The folder is named after the date and time at which the command was run.

Running synthesis

run_synthesis

This runs the synthesis where yosys translates RTL into circuit using generic components and abc maps the circuit to Standard Cells.

Here we define a term Flop Ratio. Flop ratio is the ratio of total number of flip flops to total number of cells present in the design.
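
The flop ratio can be computed from the per-cell counts of a synthesized design. The sketch below is only an illustration: the cell counts are hypothetical (not taken from a real picorv32a run), and identifying flip-flops by the name prefix sky130_fd_sc_hd__df is an assumption made for this example.

```python
def flop_ratio(cell_counts, flop_prefixes=("sky130_fd_sc_hd__df",)):
    """Return (number of flip-flops) / (total number of cells)."""
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("the design contains no cells")
    # Assumption: a cell is a flip-flop if its name starts with one of the prefixes.
    flops = sum(count for name, count in cell_counts.items()
                if name.startswith(flop_prefixes))
    return flops / total

# Hypothetical cell counts, chosen only to keep the arithmetic simple.
example_counts = {
    "sky130_fd_sc_hd__dfxtp_2": 200,
    "sky130_fd_sc_hd__nand2_2": 500,
    "sky130_fd_sc_hd__inv_2": 300,
}

ratio = flop_ratio(example_counts)
print(f"Flop ratio: {ratio:.2f} ({ratio * 100:.0f}%)")
```

With these numbers, 200 flip-flops out of 1000 cells give a flop ratio of 0.2, i.e. 20%.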

DAY 2 : Floorplanning and introduction to Library Cells

FLOORPLANNING

1. Defining width and height of Core and Die

The first step in the floorplan is to define the dimensions of the core and die, which in turn constrain the dimensions of the SoC and the IPs contained in it. We define two terms in this regard - Utilization Factor and Aspect Ratio.

Utilization Factor - The fraction of the core area occupied by the netlist (with cells abutting each other and excluding the wires). It is defined as the ratio of the area occupied by the netlist to the total area of the core.

Aspect Ratio - The ratio of the height to the width of the core. It tells whether the core is square (ratio of 1) or rectangular.
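
As a small worked example of the two definitions, the following Python sketch computes both quantities. The dimensions are hypothetical unit values chosen for easy arithmetic, not real floorplan numbers.

```python
def utilization_factor(netlist_area, core_area):
    """Area occupied by the netlist divided by the total core area."""
    return netlist_area / core_area

def aspect_ratio(core_height, core_width):
    """Core height divided by core width (1.0 means a square core)."""
    return core_height / core_width

# Hypothetical example: four cells of 1 x 1 unit in a core 2 units wide and 4 units high.
netlist_area = 4 * (1.0 * 1.0)
core_width, core_height = 2.0, 4.0

uf = utilization_factor(netlist_area, core_width * core_height)
ar = aspect_ratio(core_height, core_width)
print(f"Utilization factor = {uf}, aspect ratio = {ar}")
```

Here the netlist fills half of the core (utilization factor 0.5), and the core is rectangular, twice as tall as it is wide (aspect ratio 2.0).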

2. Defining locations of pre-placed cells

In the netlist, there will be some portions which repeat many times at different locations. So, we divide the entire netlist into certain blocks so that the repeating blocks can be duplicated easily as and when required. These blocks are placed on the floor before running the automated PnR, hence the name pre-placed cells. Automated tools cannot relocate these pre-placed cells.

3. Surrounding pre-placed cells with de-coupling capacitors

When the power supply wire is long, the parasitics in the wire result in a voltage drop. Thus, not all the blocks get the power they require. So, we place de-coupling capacitors next to the blocks, which can supply the necessary voltage locally. This also reduces crosstalk.

4. Power Planning

All the blocks cannot be provided with de-coupling capacitors as it would increase the chip area. And also, having a single supply and ground line for all the blocks might result in ground bounce and voltage droop effects. To tackle this, we add multiple supply lines.

5. Pin Placement

All the input ports are placed on one side and the output ports on the other side. Ordering might depend on block placement.

6. Logical cell placement blockage

The area for the pins is reserved, thus blocking that area for automated PnR tools.

LAB 2 : Floorplanning, placement and magic

Few useful flags and commands

  • To create a runs folder with a custom name
prep -design picorv32a -tag trial_run1

This creates a new runs folder with the name trial_run1

  • To overwrite a previous run
prep -design picorv32a -tag trial_run1 -overwrite
  • To change variables in current run
set env(CLOCK_PERIOD) 15.000

Sets the clock period to 15 ns

  • To view variables in current run
echo $env(CLOCK_PERIOD)

Running floorplan

run_floorplan

After running the above command, a new file named picorv32a.floorplan.def will be created in the directory runs/trial_run1/results/floorplan/ which looks like this,

The DIEAREA entry contains the coordinates (x1 y1) (x2 y2), where (x1, y1) is the lower left vertex and (x2, y2) is the upper right vertex of the die. This information can be used to calculate the area of the die.
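
The die area calculation can be sketched in Python as follows. DEF files store coordinates in database units, and the UNITS DISTANCE MICRONS statement gives the number of database units per micron. The sample text and its numbers are hypothetical, not the actual contents of picorv32a.floorplan.def.

```python
import re

def die_area_um2(def_text):
    """Return the die area in square microns from the text of a DEF file."""
    units = re.search(r"UNITS\s+DISTANCE\s+MICRONS\s+(\d+)", def_text)
    die = re.search(
        r"DIEAREA\s*\(\s*(-?\d+)\s+(-?\d+)\s*\)\s*\(\s*(-?\d+)\s+(-?\d+)\s*\)",
        def_text,
    )
    if units is None or die is None:
        raise ValueError("UNITS or DIEAREA statement not found")
    dbu_per_micron = int(units.group(1))
    x1, y1, x2, y2 = (int(v) for v in die.groups())
    width_um = (x2 - x1) / dbu_per_micron
    height_um = (y2 - y1) / dbu_per_micron
    return width_um * height_um

# Hypothetical DEF excerpt, with values chosen for easy arithmetic.
sample_def = """
UNITS DISTANCE MICRONS 1000 ;
DIEAREA ( 0 0 ) ( 500000 400000 ) ;
"""

print(die_area_um2(sample_def), "square microns")
```

In this sample, the die is 500 um wide and 400 um high, so the area is 200000 square microns.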

Opening floorplan in MAGIC

To view the floorplan created, we need to open it in magic as follows,

magic -T /home/mayurta/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.floorplan.def &

The above command first reads the tech file sky130A.tech, then the LEF file merged.lef, and finally the DEF file picorv32a.floorplan.def.

In the layout, many i/o pins can be seen at the border of the layout, which are equidistant from each other by default(which can be changed in the /home/mayurta/Desktop/work/tools/openlane_working_dir/openLANE_flow/configuration/README.md file).

Many tap cells can also be seen all over the layout. They connect the n-well to Vdd and the substrate to ground to prevent latch-up. These tap cells are diagonally equidistant from each other.

A few standard cells can also be seen at the lower left corner of the layout.

Running placement

The following command places all the standard cells pertaining to the netlist on the floorplan created in the previous step.

run_placement

All the checks should be passed as follows,

Opening placement in MAGIC

Now open the newly created picorv32a.placement.def in magic using a command similar to the one from the previous step.

magic -T /home/mayurta/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.placement.def &

DAY 3 : Designing library cell using MAGIC layout and ngspice characterization

16 mask CMOS process

A CMOS inverter is fabricated on an actual silicon wafer using 16 masks. The process consists of the following steps :

  • Selecting a substrate
  • Creating active regions for transistors
  • Formation of n-well and p-well
  • Formation of gates
  • Lightly doped drain (LDD) formation
  • Source and drain formation
  • Formation of contacts and interconnects
  • Higher metal level formation

LAB 3 : Simulation and characterization of an inverter and plugging it into picorv32

Setting up the inverter files

Instead of designing the inverter from scratch, we git clone the folder containing a pre-designed inverter and work with it. The link to be cloned from was already given in the workshop. We first go to the openLANE_flow(openlane) directory and clone the inverter there as follows,

git clone https://github.com/nickson-jose/vsdstdcelldesign.git

This command creates a new folder named vsdstdcelldesign inside our openLANE_flow folder.

Now, we copy the tech file into the vsdstdcelldesign directory and open the inverter design with magic. For copying, go to the directory where the tech file is present, i.e. pdks/sky130A/libs.tech/magic, and use the command cp sky130A.tech ABSOLUTE_PATH_TO_VSDSTDCELLDESIGN as follows,

Opening the inverter in MAGIC

Now, we can open the inverter in magic by typing,

magic -T sky130A.tech sky130_inv.mag

To simulate the inverter, we need a .spice file corresponding to the .mag file. We first extract the .mag file, which creates a .ext file in the same directory.

Then we convert the .ext into .spice including all the parasitics.

Then we edit the .spice file to include model files, define power supply nodes and analysis type.

Running the simulation in ngspice

Next, we run the simulation by typing,

ngspice sky130_inv.spice

To plot the simulation results,

plot y vs time a

This plots the output (node y) and the input (node a) against time.

Timing characterization of the cell can be performed in ngspice by calculating delays and transition times.
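
The delay and transition time calculations can be sketched in Python on sampled waveform data. This is only an illustration under stated assumptions: the propagation delay is measured between the 50% points of input and output, the transition time between the 80% and 20% points of the output, and the supply value, time points and voltages are hypothetical rather than read from the ngspice run.

```python
def crossing_time(times, values, level, rising):
    """First time the waveform crosses `level` in the given direction.

    Uses linear interpolation between neighbouring samples.
    """
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")

# Hypothetical samples: input a rises, inverter output y falls (time in ns).
VDD = 1.0  # hypothetical supply value, chosen for easy arithmetic
times = [0.0, 1.0, 2.0, 3.0]
a = [0.0, 1.0, 1.0, 1.0]
y = [1.0, 1.0, 0.0, 0.0]

# Assumed thresholds: 50% for delay, 80% and 20% for the fall transition.
t_in_50 = crossing_time(times, a, 0.5 * VDD, rising=True)
t_out_50 = crossing_time(times, y, 0.5 * VDD, rising=False)
fall_delay = t_out_50 - t_in_50
fall_transition = (crossing_time(times, y, 0.2 * VDD, rising=False)
                   - crossing_time(times, y, 0.8 * VDD, rising=False))

print(f"cell fall delay ~ {fall_delay:.2f} ns, fall transition ~ {fall_transition:.2f} ns")
```

The same function with the directions swapped gives the rise delay and rise transition when the input falls and the output rises.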

DAY 4 : Pre-layout timing analysis and importance of good clock tree

LAB 4 : Clock tree synthesis and tritonCTS

MAGIC layout files contain all the detailed information about a cell. For PnR, such detailed information is not necessary, so a different file format, LEF, is used for the placement and routing stages. LEF (Library Exchange Format) contains only abstract information about the cell and hence is also used for protecting IPs. So, before plugging our inverter into the layout of picorv32a, we need to convert the .mag file of the inverter into .lef.

For routing, certain guidelines are to be strictly followed. Two of such guidelines relevant in our case are,

  1. The input and output ports must lie on the intersection of horizontal and vertical tracks
  2. The width of the standard cell must be an odd multiple of the track pitch and its height must be an odd multiple of the vertical track pitch

Verifying the guidelines and converting to lef file

Tracks are the lines along which the PnR tool places metal wires during routing. The track information can be found in the file tracks.info inside the directory pdks/sky130A/libs.tech/openlane/sky130_fd_sc_hd.

Each line contains the X (horizontal) or Y (vertical) track information for a layer, where the first number is the track offset and the second number is the track pitch.

To check whether the first guideline is followed by our inverter, we identify the input and output ports and check if they lie on the intersection of tracks of the corresponding metal by aligning the grid in the MAGIC layout to the tracks using the grid command in the tkcon window. In our case, the ports lie on the li1 (local interconnect) layer, so we align the grid to the values for that layer,

grid 0.46um 0.34um 0.23um 0.17um

We see that the ports do lie on the intersections of tracks. Next, the second guideline is verified by counting the number of grid boxes covered by the inverter along its length and breadth.
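
The two guideline checks done visually above can also be expressed numerically. The sketch below uses the offsets and pitches from the grid command (0.23 um / 0.46 um for X and 0.17 um / 0.34 um for Y); the port coordinates and the cell width are hypothetical values, not measured from sky130_inv.mag.

```python
def on_track(coord, offset, pitch, tol=1e-6):
    """True if coord equals offset + n * pitch for some integer n."""
    n = round((coord - offset) / pitch)
    return abs(offset + n * pitch - coord) <= tol

def on_track_intersection(x, y, x_track, y_track):
    """True if the point (x, y) lies on both an X track and a Y track."""
    return on_track(x, *x_track) and on_track(y, *y_track)

def is_odd_multiple(length, pitch, tol=1e-6):
    """True if length is an odd integer multiple of pitch."""
    n = round(length / pitch)
    return abs(n * pitch - length) <= tol and n % 2 == 1

# (offset, pitch) in microns, matching the grid command above.
X_TRACK = (0.23, 0.46)
Y_TRACK = (0.17, 0.34)

# Hypothetical port centres in microns.
print(on_track_intersection(0.69, 0.51, X_TRACK, Y_TRACK))  # on a track crossing
print(on_track_intersection(0.50, 0.51, X_TRACK, Y_TRACK))  # x is off-track

# Hypothetical cell width of 3 x 0.46 um.
print(is_odd_multiple(1.38, 0.46))
```

The tolerance argument absorbs floating-point rounding, since values such as 0.23 + 0.46 are not represented exactly.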

Next we rename the inverter mag file(not necessary) and extract the lef file by typing the command in tkcon window,

lef write

This creates a new .lef file in the same directory.

Plugging the inverter lef file into picorv32a

To plug the inverter into picorv32a, we first copy the inverter lef file into the src directory inside picorv32a.

The tool also needs to map the inverter cell into the picorv32a design, so we copy the library files into src as well.

For Openlane to recognise our inverter inside picorv32a, we add the following line to the config.tcl file inside the picorv32a directory,

set ::env(EXTRA_LEFS) [glob $::env(OPENLANE_ROOT)/designs/$::env(DESIGN_NAME)/src/*.lef]

We also add these other lines inside the same config.tcl for openlane to recognise the timing information of our inverter,

Next we open the Openlane flow, require the package and prep the design. Then we run the following commands in the Openlane window so that the lef file of our inverter gets added to the merged lef file.

set lefs [glob $::env(DESIGN_DIR)/src/*.lef]
add_lefs -src $lefs 

Then we run the synthesis.

There are huge timing violations. Here wns is the worst negative slack and tns is the total negative slack. So, we now make some changes to make our flow more timing driven. We check three variables (documented in the README.md file inside the openLANE_flow/configuration directory):
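
As a quick illustration of the two metrics, the following Python sketch derives wns and tns from a list of endpoint slacks. The slack values are hypothetical, and reporting wns as 0.0 when there are no violations is a convention assumed for this example.

```python
def wns_tns(endpoint_slacks):
    """Return (worst negative slack, total negative slack)."""
    negative = [s for s in endpoint_slacks if s < 0]
    wns = min(negative) if negative else 0.0  # assumed convention: 0.0 if no violation
    tns = sum(negative, 0.0)                  # sum over violating endpoints only
    return wns, tns

# Hypothetical endpoint slacks in ns.
slacks = [-2.0, 0.5, -0.5, 1.0]

wns, tns = wns_tns(slacks)
print(f"wns = {wns}, tns = {tns}")
```

Here two endpoints violate timing, so wns is -2.0 (the worst one) and tns is -2.5 (their sum). Endpoints with positive slack do not contribute to either number.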

  • SYNTH_STRATEGY - We try to strike a balance between area and delay by using an appropriate strategy. The default strategy turns out to be 2, which is more area driven. So, we set the strategy to 1, which is more delay oriented. This might increase the area a bit, but the delay will be reduced
  • SYNTH_BUFFERING - This adds buffers to high fanout lines. It is better to keep it ON
  • SYNTH_SIZING - This varies the size of the cells in the flow. It is also better to keep it ON
set ::env(SYNTH_STRATEGY) 1
set ::env(SYNTH_SIZING) 1

Running synthesis again, we find that the area has increased and timing has improved.

We then confirm that the inverter did get added into picorv32a by checking the merged.lef in runs/finalrun/tmp.

Yes! Inverter is found in the picorv32a merged.lef. So, next we run floorplan and placement.

Timing analysis in OpenSTA

Next we try to improve the timing further by using OpenSTA. Before that, we need to set it up. We need two files, one with the .sdc format and one with the .conf format, in our case my_base.sdc and sta.conf. These files were already available in the extras directory of the cloned vsdstdcelldesign folder. We copy the .sdc file into the src directory of picorv32a. Then we modify the contents of the .conf file as follows, specifying the paths to the respective .lib files and the .sdc file.

The my_base.sdc file in our case looks like this,

And we copy the .conf file into openLANE_flow directory. There we open terminal and type sta sta.conf. This opens and runs our timing files in OpenSTA. The results are as follows,

By scrolling up, we can see that the fanout of the nets is high. We now go back to the openlane window, set SYNTH_MAX_FANOUT to 4 and run OpenSTA again.
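
To illustrate what a fanout limit of 4 means, the sketch below lists the nets that drive more pins than the limit. The net names and connectivity are hypothetical; in practice this information comes from the OpenSTA report.

```python
def nets_over_fanout(net_sinks, max_fanout):
    """Return {net: fanout} for every net driving more than max_fanout pins."""
    return {net: len(sinks)
            for net, sinks in net_sinks.items()
            if len(sinks) > max_fanout}

# Hypothetical connectivity: net name -> list of driven input pins.
example_nets = {
    "n1": ["u1/A", "u2/A"],
    "n2": ["u3/A", "u4/B", "u5/A", "u6/A", "u7/B", "u8/A"],
}

MAX_FANOUT = 4  # same limit as the SYNTH_MAX_FANOUT value used in the text
violations = nets_over_fanout(example_nets, MAX_FANOUT)
print(violations)
```

Only n2, which drives six pins, exceeds the limit. Limiting the fanout reduces the load seen by each driver, which is why it helps timing.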

The timing has clearly improved, but it is better to bring the slack closer to zero than -1. The next optimization is to scroll up and look for nets driven by version1 buffers that have a large capacitance and drive many fanouts. We upsize such buffers by replacing them with version4 buffers. Here is one such buffer,

We run the following commands to get more information about it, replace it, and run the analysis again,

Timing has improved again as expected.

Replacing cells modifies only the local copy of the netlist. So we now write the modified netlist back to the original file in picorv32a/runs/finalrun/results/synthesis/, using the command write_verilog location-of-the-verilog-file. Keep in mind that the modification has been made to the .v file from the synthesis stage: if we run synthesis again in Openlane, it will undo all the changes made in the OpenSTA stage.

Then we run floorplan and placement again as we have modified the netlist.

Clock tree synthesis

Clock tree synthesis is performed by TritonCTS. It is run by the following command,

run_cts

After clock tree synthesis we perform timing analysis again. Instead of running OpenSTA from outside Openlane, we can run it within the flow itself, inside OpenROAD. The following set of commands describes the steps,

Running OpenSTA using OpenRoad

openroad
read_lef location-of-lef                  (i.e. runs/finalrun/tmp/merged.lef)
read_def location-of-def                  (i.e runs/finalrun/results/cts/picorv32a.cts.def)

After the .lef and .def have been read, we need to create a db

write_db db-name                          (i.e. pico_cts.db)
read_db db-name

Then we read other required files

read_verilog verilog-file-location        (i.e. finalrun/results/synthesis/picorv32a.synthesis_cts.v)
read_liberty $::env(LIB_SYNTH_COMPLETE)
link_design design-name                   (i.e. picorv32a)
read_sdc location-of-sdc                  (i.e. /openLANE_flow/designs/picorv32a/src/my_base.sdc)
set_propagated_clock [all_clocks]
report_checks -path_delay min_max -format full_clock_expanded -digits 4

DAY 5 : Final steps for RTL2GDS using tritonRoute and openSTA

LAB 5 : Introduction to routing using tritonRoute

Routing

To check which stage was last executed in Openlane, we can use the following command,

echo $::env(CURRENT_DEF)

The command basically tells us which .def file was updated last time which also corresponds to the previously executed stage.

NOTE : Power-ground distribution is generally done during floorplan. But in Openlane, it is done after placement.

Before routing, we first need to generate the power distribution network.

gen_pdn

We can check the environment variables available for routing for more control over the process (documented in the README.md in openLANE_flow/configuration/). One such variable is ROUTING_STRATEGY. A value between 0 and 3 makes the routing run faster (around half an hour) at the cost of less optimization. A value of 14 results in a highly optimized routing but takes around 4-5 hours to complete.

run_routing

Routing is performed in two stages:

  • Fast route - Implemented using FastRoute. It generates routing guides.
  • Detailed route - Implemented using TritonRoute. It uses the routing guides generated in fast route to find the best route and makes connections

SPEF extraction

SPEF - Standard Parasitic Exchange Format gives information about all the parasitics present in the circuit. A SPEF extractor is not yet included in Openlane, but in this lab we have an external SPEF extractor. Go to the directory work/tools/SPEF_EXTRACTOR and type,

python3 main.py location_of_lef location_of_def

where location_of_lef was runs/finalrun/tmp/merged.lef and location_of_def was runs/finalrun/results/routing/picorv32a.def.

After extraction, a new spef file gets created in the same directory as the def file. Now, in runs/finalrun/results/synthesis/, we find the following .v files,

The picorv32a.synthesis_diodes.v gets created just before routing, when antenna diode insertion takes place. To perform STA again, we need to use the last file.

ACKNOWLEDGEMENTS