-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
saving measurements during compile and run time #108
base: main
Are you sure you want to change the base?
Conversation
@quic-mamta Please review. |
QEfficient/cloud/infer.py
Outdated
|
||
compile_time = "NA" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remove this, it's not required. as when benchmark=True, this variable will always exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assignment is needed for if the model was pre-compiled, otherwise compile_time will be unassigned. I changed the assignment as compile_time = "pre-compiled" for this case.
Please see https://github.com/quic-morteza/efficient-transformers/blob/8f6abc5a57bd369c4deb1cea8d18725ddc57e2a2/QEfficient/cloud/infer.py#L92
QEfficient/utils/logging_utils.py
Outdated
with open(file, "a", newline="") as csvfile: | ||
csvwriter = csv.writer(csvfile) | ||
csvwriter.writerow(list(fields.values())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will keep on writing to the same CSV, as long as nobody deletes it.
Should we always create new CSV with timestamp and let user decide how to combine those CSVs.
If single CSV keeps updating, it might have too much data and it might be hard to distinguish which data is relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it is needed to accumulate the measurements for the same model in the same CSV file via various runs. Later it is easy for the user to sort through this file and compare against the previous measurements.
Thanks for the reviews. Here is how the gpt2_benchmarking.csv look like after running the following command two times: python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 4 --prompt_len 64 --ctx_len 1024 --mxfp6 --num_cores 16 --prompt "My name is|My name is|My name is|My name is" --benchmark |
@@ -35,6 +42,7 @@ def main( | |||
local_model_dir: Optional[str] = None, | |||
cache_dir: Optional[str] = None, | |||
hf_token: Optional[str] = None, | |||
benchmark: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the doc string with this flag usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't find. Can you point me to the right doc string path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add it on line number 71
@quic-morteza please resolve the formatting issue and DCO error. |
fd8427f
to
961ee25
Compare
Signed-off-by: quic-morteza <quic_morteza@quicinc.com>
Signed-off-by: quic-morteza <quic_morteza@quicinc.com>
Signed-off-by: quic-morteza <quic_morteza@quicinc.com>
961ee25
to
26570ce
Compare
): | ||
input_len = max([len(x) for x in tokenizer(prompt, return_tensors="np").input_ids]) | ||
|
||
fields = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two more fields mos and aic_enable_depth_first can be added here.
@@ -106,10 +120,13 @@ def main( | |||
full_batch_size=full_batch_size, | |||
) | |||
|
|||
compile_time = (time.perf_counter() - compile_start_time) // 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also be moved under the if condition of benchmark flag, also why keeping it integer?
@quic-morteza Can you address the review comments? |
For the user to keep track of their measurements, code is added to log the necessary arguments and the measurements into a csv file. E.g. running the following command two times:
python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 4 --prompt_len 64 --ctx_len 1024 --generation_len 512 --mxfp6 --num_cores 16 --device_group [0] --prompt "My name is|My name is|My name is|My name is" --benchmark
generates gpt2_benchmarking.csv