Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

saving measurements during compile and run time #108

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

quic-morteza
Copy link
Contributor

For the user to keep track of their measurements, code is added to log the necessary arguments and the measurements into a csv file. E.g. running the following command two times:

python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 4 --prompt_len 64 --ctx_len 1024 --generation_len 512 --mxfp6 --num_cores 16 --device_group [0] --prompt "My name is|My name is|My name is|My name is" --benchmark

generates gpt2_benchmarking.csv
image

@ochougul
Copy link
Contributor

@quic-mamta Please review.

QEfficient/cloud/infer.py Outdated Show resolved Hide resolved
Comment on lines 87 to 93

compile_time = "NA"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove this, it's not required. as when benchmark=True, this variable will always exist

Copy link
Contributor Author

@quic-morteza quic-morteza Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assignment is needed for if the model was pre-compiled, otherwise compile_time will be unassigned. I changed the assignment as compile_time = "pre-compiled" for this case.
Please see https://github.com/quic-morteza/efficient-transformers/blob/8f6abc5a57bd369c4deb1cea8d18725ddc57e2a2/QEfficient/cloud/infer.py#L92

QEfficient/cloud/infer.py Outdated Show resolved Hide resolved
Comment on lines 68 to 70
with open(file, "a", newline="") as csvfile:
csvwriter = csv.writer(csvfile)
csvwriter.writerow(list(fields.values()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will keep on writing to the same CSV, as long as nobody deletes it.
Should we always create new CSV with timestamp and let user decide how to combine those CSVs.

If single CSV keeps updating, it might have too much data and it might be hard to distinguish which data is relevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is needed to accumulate the measurements for the same model in the same CSV file via various runs. Later it is easy for the user to sort through this file and compare against the previous measurements.

@quic-morteza
Copy link
Contributor Author

Thanks for the reviews. Here is how the gpt2_benchmarking.csv look like after running the following command two times:

python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 4 --prompt_len 64 --ctx_len 1024 --mxfp6 --num_cores 16 --prompt "My name is|My name is|My name is|My name is" --benchmark

image

@@ -35,6 +42,7 @@ def main(
local_model_dir: Optional[str] = None,
cache_dir: Optional[str] = None,
hf_token: Optional[str] = None,
benchmark: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the doc string with this flag usage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't find. Can you point me to the right doc string path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can add it on line number 71

QEfficient/utils/_utils.py Show resolved Hide resolved
@quic-rishinr
Copy link
Contributor

@quic-morteza please resolve the formatting issue and DCO error.

quic-morteza and others added 3 commits October 22, 2024 13:06
Signed-off-by: quic-morteza <quic_morteza@quicinc.com>
Signed-off-by: quic-morteza <quic_morteza@quicinc.com>
Signed-off-by: quic-morteza <quic_morteza@quicinc.com>
):
input_len = max([len(x) for x in tokenizer(prompt, return_tensors="np").input_ids])

fields = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more fields mos and aic_enable_depth_first can be added here.

@@ -106,10 +120,13 @@ def main(
full_batch_size=full_batch_size,
)

compile_time = (time.perf_counter() - compile_start_time) // 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be moved under the if condition of benchmark flag, also why keeping it integer?

@quic-rishinr
Copy link
Contributor

@quic-morteza Can you address the review comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants