hw: Updates to assignment 2
powerjg committed Dec 19, 2024
1 parent c6fa640 commit b29dbbd
Showing 6 changed files with 105 additions and 26 deletions.
40 changes: 18 additions & 22 deletions assignment-2/assignment.md
@@ -105,7 +105,7 @@ Models for the board, cache hierarchy, and memory will remain a constant in your
See the code in `components/processors.py` for more information.
- Cache models: You will only use `MESITwoLevelCache` in this assignment.
- Memory models: You will only use `DDR4` in this assignment.
- Clock frequency: You can use a clock frequency of `4 GHz` for all of your simulations.
- Clock frequency: Use a clock frequency of `1 GHz` for all of your simulations.

### Region of Interest (ROI)

@@ -157,7 +157,7 @@ Complete the following steps and answer the questions for your report.
Collect data from your simulation runs and use simulator statistics to answer the questions.
Use clear reasoning and visualization to drive your conclusions.

### Step I: Write down your hypotheses and experimental setup
### Step I: Write down your hypotheses

Before starting simulation and analysis, you should be able to identify the ROI of a program.

@@ -187,7 +187,7 @@ This statistic represents a distribution of different operation classes executed

Now, that we have the instruction mix, let's answer the following questions (the same as above).

1. For the ROI of this workload, what percentage of instructions do you think will be integer, floating point, and memory operations? Explain your reasoning.
1. For the ROI of this workload, what percentage of instructions are integer, floating point, and memory operations? Explain your reasoning.
2. Estimate how the performance will change under the following conditions:
a. If the latency of integer operations are increased from 1 to 6 cycles, but the system is pipelined.
b. If the latency of floating point operations are increased from 6 to 12 cycles, but the system is pipelined.
@@ -222,7 +222,7 @@ Design three experiments to test your hypothesis about the performance impact of

### Research question:

Use the data from the experiements that you ran in [Step II](#step-ii-developing-and-running-the-experiments) to answer the following questions.
Use the data from the experiments that you ran in [Step II](#step-ii-developing-and-running-the-experiments) to answer the following questions.

1. Does changing the latency of the integer operations, floating point operations, or the issue latency have a bigger impact on the performance of the system?
2. Are these changes the main factor in the performance of the system for the DAXPY workload? If not, what other factors might be affecting the performance of the system?
@@ -231,23 +231,19 @@ Use the data from the experiements that you ran in [Step II](#step-ii-developing

- Take a look at the assembly code for the `DAXPY` loop below (you can generate the complete assembly for it under `workloads/daxpy` with the makefile).
Can you find some dependencies between the instructions?
Do you think only looking at the instruction mix gathered from [Step I](#step-i-write-down-your-hypotheses-and-experimental-setup) provided enough information to apply instruction mix and the Iron Law?
Do you think only looking at the instruction mix gathered from [Step I](#step-i-write-down-your-hypotheses) provided enough information to apply instruction mix and the Iron Law?
- Think about the other stages of the pipeline, in this question we have only focused on **decode** and **execute**.

```asm
.L35:
# daxpy.cpp:27: Y[i] = alpha * X[i] + Y[i];
fld fa4,0(a5) # MEM[(double *)_56], MEM[(double *)_56]
fld fa5,0(s2) # MEM[(double *)_49], MEM[(double *)_49]
# daxpy.cpp:25: for (int i = 0; i < N; ++i)
addi a5,a5,8 #, ivtmp.133, ivtmp.133
addi s2,s2,8 #, ivtmp.132, ivtmp.132
# daxpy.cpp:27: Y[i] = alpha * X[i] + Y[i];
fmadd.d fa5,fa5,fa3,fa4 # _5, MEM[(double *)_49], tmp181, MEM[(double *)_56]
# daxpy.cpp:27: Y[i] = alpha * X[i] + Y[i];
fsd fa5,-8(a5) # _5, MEM[(double *)_56]
# daxpy.cpp:25: for (int i = 0; i < N; ++i)
bne s1,a5,.L35 #, _14, ivtmp.133,
call m5_work_begin@plt #
# daxpy.cpp:27: Y[i%N] = alpha * X[i%N] + Y[i%N];
li a3,3153920 # tmp338,
fld fa3,.LC4,a5 # tmp291,, tmp304
li a1,-3145728 # tmp257,
addi a5,a3,1824 #, tmp337, tmp338
add a5,a5,a1 # tmp257, tmp337, tmp337
...
# daxpy.cpp:32: m5_work_end(0,0);
```
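
The hint above refers to the Iron Law. As a rough sketch, execution time is the instruction count times the cycles per instruction (CPI), divided by the clock frequency. The snippet below applies that formula to made-up numbers: the instruction count and both CPI values are assumptions for illustration only, not gem5 measurements.

```python
def execution_time(instructions: int, cpi: float, clock_hz: float) -> float:
    """Iron Law: time = instructions * cycles-per-instruction / clock frequency."""
    return instructions * cpi / clock_hz


# Hypothetical values, for illustration only (not gem5 output).
INSTRUCTIONS = 1_000_000
CLOCK_HZ = 1e9  # 1 GHz, the clock frequency the assignment specifies

baseline = execution_time(INSTRUCTIONS, cpi=1.0, clock_hz=CLOCK_HZ)
changed = execution_time(INSTRUCTIONS, cpi=1.5, clock_hz=CLOCK_HZ)
slowdown = changed / baseline

print(f"baseline: {baseline * 1e3:.3f} ms")
print(f"changed:  {changed * 1e3:.3f} ms")
print(f"slowdown: {slowdown:.2f}x")
```

Whether a longer operation latency actually raises CPI depends on the dependencies between instructions, which the instruction mix alone does not capture.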

**NOTE**: Make sure to keep the simulation output for all of your simulation runs for your later analyses.
@@ -257,7 +253,7 @@ Do you think only looking at the instruction mix gathered from [Step I](#step-i-
Answer the following questions in your report.
Include information on how you designed the experiment, what you measured, and how you analyzed the data.

1. Change the clock to be lower (e.g., 1 GHz). How does a lower clock affect the answers to the two research questions?
1. Change the clock to be higher (e.g., 4 GHz). How does a higher clock affect the answers to the two research questions?
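
One way to start reasoning about this step: the cycle time shrinks as the clock rises, so any latency that is fixed in nanoseconds costs more core cycles. The sketch below uses a hypothetical 50 ns latency, an assumed value rather than a DDR4 or gem5 figure.

```python
def cycles_for_latency(latency_ns: float, clock_ghz: float) -> float:
    """Number of core cycles that elapse during a fixed wall-clock latency."""
    cycle_time_ns = 1.0 / clock_ghz
    return latency_ns / cycle_time_ns


MEMORY_LATENCY_NS = 50.0  # hypothetical value, for illustration only

for clock_ghz in (1.0, 4.0):
    cycles = cycles_for_latency(MEMORY_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz:.0f} GHz: {cycles:.0f} cycles")
```

Latencies expressed in cycles, such as the operation and issue latencies from the earlier steps, do not scale this way, so their relative weight can shift when the clock changes.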

## Submission

@@ -286,9 +282,9 @@ The code included in the "Example command to run the script" section should be a
## Grading

- **25 points** gem5 runscript and explanation of how to use your script
- **50 points** for the questions in the report
- **25 points** for the research question
- **10 points** for the next steps
- **45 points** for the questions in the report
- **30 points** for the research question
- **25 points** for the next steps

## Academic misconduct reminder

83 changes: 83 additions & 0 deletions assignment-2/questions.md
@@ -0,0 +1,83 @@
# Assignment 2 Questions

**IMPORTANT** Do not reformat this file!
Put your answers below each question.
Use markdown formatting.

## [25 points] How to reproduce the results

### Explanation of the script

### Script to run

### Parameters to script (if any)

### Commands used to gather data

#### Commands used for Step II

```shell

```

#### Commands used for Step III

```shell

```

```shell

```

```shell

```

## [75 points] Questions

### [10 points] Step I: Write down your hypotheses

1. For the DAXPY's assembly code, identify the ROI. In your report, copy the assembly code segment corresponding to the code between `m5_work_begin` and `m5_work_end`.

2. For the ROI of this workload, what percentage of instructions do you think will be integer, floating point, and memory operations? Explain your reasoning.

3a.Estimate how the performance will change under the following condition: If the latency of integer operations are increased from 1 to 6 cycles, but the system is pipelined.

3b.Estimate how the performance will change under the following condition: If the latency of floating point operations are increased from 6 to 12 cycles, but the system is pipelined.

3c.Estimate how the performance will change under the following condition: If the issue latency is increased from 1 to 2 cycles, but the operation latency is unchanged (1 cycle for integer and 6 cycles for floating point operations).

### [20 points] Step II: Get preliminary data on the instruction mix

1. For the ROI of this workload, what percentage of instructions are integer, floating point, and memory operations? Explain your reasoning.

2a.Estimate how the performance will change under the following condition: If the latency of integer operations are increased from 1 to 6 cycles, but the system is pipelined.

2b.Estimate how the performance will change under the following condition: If the latency of floating point operations are increased from 6 to 12 cycles, but the system is pipelined.

2c.Estimate how the performance will change under the following condition: If the issue latency is increased from 1 to 2 cycles, but the operation latency is unchanged (1 cycle for integer and 6 cycles for floating point operations).

### [15 points] Step III: Developing and running the experiments

1a. What is the baseline and what is the change to the system under test when changing the latency of integer operations.

1b. What is the baseline and what is the change to the system under test when changing the latency of floating point operations.

1c. What is the baseline and what is the change to the system under test when changing the issue latency.

2a. What is the performance change for changing the latency of integer operations.

2b. What is the performance change for changing the latency of floating point operations.

2c. What is the performance change for changing the issue latency.

### [30 points] Research questions:

1. Does changing the latency of the integer operations, floating point operations, or the issue latency have a bigger impact on the performance of the system?

2. Are these changes the main factor in the performance of the system for the DAXPY workload? If not, what other factors might be affecting the performance of the system?

### [25 points] Next steps

1. Change the clock to be higher (e.g., 4 GHz). How does a higher clock affect the answers to the two research questions?
Binary file modified workloads/daxpy/daxpy
Binary file not shown.
Binary file modified workloads/daxpy/daxpy-gem5
Binary file not shown.
4 changes: 2 additions & 2 deletions workloads/daxpy/daxpy.cpp
@@ -22,9 +22,9 @@ int main()
#endif

// Start of daxpy loop
for (int i = 0; i < N; ++i)
for (int i = 0; i < N*10; ++i)
{
Y[i] = alpha * X[i] + Y[i];
Y[i%N] = alpha * X[i%N] + Y[i%N];
}
// End of daxpy loop
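
The loop change above runs `N*10` iterations and wraps every index with `i%N`. A minimal Python model of the two loops (not the C++ workload itself; the array contents and `alpha` are arbitrary values chosen for illustration) suggests the new loop computes the same result as running the original loop ten times:

```python
def daxpy_once(alpha, x, y):
    """Original loop: one pass, Y[i] = alpha * X[i] + Y[i]."""
    for i in range(len(y)):
        y[i] = alpha * x[i] + y[i]


def daxpy_wrapped(alpha, x, y, repeats=10):
    """Updated loop: N * repeats iterations, indices wrapped with i % N."""
    n = len(y)
    for i in range(n * repeats):
        y[i % n] = alpha * x[i % n] + y[i % n]


# Arbitrary illustrative inputs.
alpha = 0.5
x = [1.0, 2.0, 3.0, 4.0]
y_repeated = [10.0, 20.0, 30.0, 40.0]
y_wrapped = list(y_repeated)

for _ in range(10):
    daxpy_once(alpha, x, y_repeated)
daxpy_wrapped(alpha, x, y_wrapped)

print(y_repeated == y_wrapped)
```

Each element receives the same sequence of updates in both versions, so the results match exactly; what changes is the number of loop iterations and the extra index arithmetic per iteration.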

4 changes: 2 additions & 2 deletions workloads/resources.json
@@ -114,7 +114,7 @@
"24.1"
],
"url": "file:///workspaces/gem5-assignment-template/workloads/daxpy/daxpy",
"md5sum": "dcffea8806175a3ea40c521eabd4661c"
"md5sum": "929f9f86b1d84ed009f96d54f8f3e8a0"
},
{
"category": "binary",
@@ -125,7 +125,7 @@
"24.1"
],
"url": "file:///workspaces/gem5-assignment-template/workloads/daxpy/daxpy-gem5",
"md5sum": "4515e0c99e62c67f29c969ad71375fcf"
"md5sum": "223d6b38ef1ece75beaecb825ea2c319"
},
{
"category": "workload",
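
The `md5sum` entries change because the binaries were rebuilt. A minimal sketch for recomputing such a checksum after a rebuild, assuming the field holds the plain MD5 digest of the file contents (the helper name `md5_of_file` is hypothetical, and the demo hashes a temporary file rather than the real binary):

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file; point the helper at workloads/daxpy/daxpy instead
# to compare against the value stored in resources.json.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_digest = md5_of_file(demo_path)
os.remove(demo_path)
print(demo_digest)
```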