Commit

Fix divide by zero exception for responses of length 1 (#902)
first commit
Bslabe123 authored Dec 5, 2024
1 parent b1befbe commit 35397cd
Showing 1 changed file with 2 additions and 1 deletion.
@@ -228,7 +228,8 @@ async def send_stream_request(
   request_latency = (prompt_len, output_len, (request_end_time - request_start_time))
 
   # Exclude first token for tpot calculation
-  tpot_metric.observe((request_end_time - ttft - request_start_time) / (output_len - 1))
+  if output_len > 1:
+    tpot_metric.observe((request_end_time - ttft - request_start_time) / (output_len - 1))
   request_latency_per_output_token_metric.observe((request_end_time - request_start_time) / output_len)
   if ttft is not None:
     ttft_metric.observe(ttft)
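
The guard in this change can be illustrated in isolation. The sketch below is not code from the repository: `time_per_output_token` is a hypothetical helper that mirrors the arithmetic in the diff and returns `None` when the metric is undefined. Treating a missing `ttft` as undefined is an extra assumption that goes beyond what the commit itself does.

```python
from typing import Optional


def time_per_output_token(
    request_start_time: float,
    request_end_time: float,
    ttft: Optional[float],
    output_len: int,
) -> Optional[float]:
    """Mean time per output token, excluding the first token.

    Hypothetical helper for illustration only. Returns None when the
    value is undefined: a response of length 1 has no tokens after the
    first, so dividing by (output_len - 1) would raise ZeroDivisionError.
    """
    # Assumption beyond the commit: also skip when ttft was never recorded.
    if output_len <= 1 or ttft is None:
        return None
    decode_time = request_end_time - ttft - request_start_time
    return decode_time / (output_len - 1)


# Five tokens, 0.5 s to first token, 2.5 s total: 2.0 s over 4 tokens.
print(time_per_output_token(0.0, 2.5, 0.5, 5))  # 0.5
# Single-token response: no value to observe, and no exception.
print(time_per_output_token(0.0, 1.0, 0.5, 1))  # None
```

A caller would then observe the metric only when the result is not `None`, which matches the intent of wrapping `tpot_metric.observe(...)` in `if output_len > 1:`.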
