Add conv fp16 kernel in xnnpack EP #22301

mszhanyi · 2024-10-03T11:54:42Z

Description

Add FP16 kernels for Conv and ConvTranspose to the XNNPACK EP.

Motivation and Context

onnxruntime/core/providers/xnnpack/detail/utils.cc

onnxruntime/core/providers/xnnpack/nn/conv.cc

skottmckay · 2024-10-04T06:05:01Z

onnxruntime/core/providers/xnnpack/nn/conv_base.cc

+    const float output_min = -65504.0;
+    const float output_max = 65504.0;


Where are these values coming from? I would have expected we use something based on foutput_min/foutput_max so any clip parameters (from a fusion of two nodes) are honoured.

mszhanyi · 2024-10-07

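It's derived from the FP16 format: 1 sign bit, 5 exponent bits and 10 stored mantissa bits (11 bits of precision counting the implicit leading 1). The largest finite value is (2 − 2^-10) × 2^15 = 65504.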
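The 65504 bound can be checked directly from the IEEE 754 binary16 layout. The following is a minimal sketch, not code from this PR: `Fp16MaxFromFormat` and `DecodeNormalFp16` are hypothetical helper names introduced only for illustration, and the decoder deliberately handles only normal (finite, non-subnormal) bit patterns.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// IEEE 754 binary16: 1 sign bit, 5 exponent bits (bias 15), 10 stored fraction bits.
constexpr int kFractionBits = 10;
constexpr int kExponentBias = 15;
constexpr int kMaxBiasedExponent = 30;  // biased exponent 31 is reserved for inf/NaN

// Hypothetical helper: largest finite FP16 value, computed from the format parameters.
inline float Fp16MaxFromFormat() {
  // Largest significand is 1.1111111111b = 2 - 2^-10.
  const float significand = 2.0f - std::ldexp(1.0f, -kFractionBits);
  // Scale by 2^(30 - 15) = 2^15.
  return std::ldexp(significand, kMaxBiasedExponent - kExponentBias);
}

// Hypothetical helper: decode a normal FP16 bit pattern into a float.
inline float DecodeNormalFp16(uint16_t bits) {
  const int sign = (bits >> 15) & 0x1;
  const int exponent = (bits >> 10) & 0x1F;
  const int fraction = bits & 0x3FF;
  assert(exponent != 0 && exponent != 31);  // subnormals, inf and NaN are out of scope here
  const float magnitude = std::ldexp(1.0f + fraction / 1024.0f, exponent - kExponentBias);
  return sign ? -magnitude : magnitude;
}
```

The bit pattern `0x7BFF` (sign 0, exponent 30, fraction all ones) is the largest finite FP16 value, and `0xFBFF` is its negation, so both helpers should agree on ±65504.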

I just checked, and TensorFlow uses 65504 directly:

    # Note 65504. is the max float16 value.
    if scores.dtype is dtypes.float16:
      scores -= 65504. * math_ops.cast(padding_mask, dtype=scores.dtype)

https://github.com/tensorflow/tensorflow/blob/47dc9d146e99f5180906d8bd1b0c0291fa947d23/tensorflow/python/keras/layers/dense_attention.py#L126

It also looks like we can't get the FP16 max/min values from std::numeric_limits the way we do for u8/s8.
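To illustrate the contrast with the 8-bit types, here is a hedged sketch of one way to keep a single lookup for the widest output range. Everything in it is an assumption made for illustration: `OutputRange`, `Half` and `CheckOutputRanges` are hypothetical names, and `Half` is only a stand-in for whatever 16-bit float storage type the project uses. The built-in integer types get their range from `std::numeric_limits`, while the half stand-in needs explicit constants.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical trait: widest finite output range for an element type, expressed as float.
template <typename T>
struct OutputRange {
  static constexpr float Min() { return static_cast<float>(std::numeric_limits<T>::lowest()); }
  static constexpr float Max() { return static_cast<float>(std::numeric_limits<T>::max()); }
};

// Hypothetical stand-in for a 16-bit float storage type (raw bits only).
struct Half {
  uint16_t bits;
};

// No built-in arithmetic type backs Half, so the range is spelled out explicitly.
template <>
struct OutputRange<Half> {
  static constexpr float Min() { return -65504.0f; }  // lowest finite FP16 value
  static constexpr float Max() { return 65504.0f; }   // largest finite FP16 value
};

// Usage example: the same lookup works for u8, s8 and the half stand-in.
inline void CheckOutputRanges() {
  assert(OutputRange<uint8_t>::Min() == 0.0f);
  assert(OutputRange<uint8_t>::Max() == 255.0f);
  assert(OutputRange<int8_t>::Min() == -128.0f);
  assert(OutputRange<Half>::Max() == 65504.0f);
}
```

Whether the project's own half type already provides a `std::numeric_limits` specialization is not established in this thread; if it does, the explicit specialization above would be unnecessary.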

Updated to:

    const auto output_min = clip_min_max ? onnxruntime::math::floatToHalf(clip_min_max->first) : -65504.0;
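The intent of that change, honouring fused Clip bounds and falling back to the full FP16 range otherwise, can be sketched as follows. This is an illustration under stated assumptions rather than the PR's code: `ResolveFp16OutputRange` is a hypothetical helper, `clip_min_max` is modelled as a `std::optional<std::pair<float, float>>`, and the result stays in `float` so the conversion to half-precision bits is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <utility>

constexpr float kFp16Lowest = -65504.0f;  // lowest finite FP16 value
constexpr float kFp16Max = 65504.0f;      // largest finite FP16 value

// Hypothetical helper: choose the {min, max} output bounds for an FP16 kernel.
// With no fused Clip, the full finite FP16 range is used. With a fused Clip,
// its bounds are honoured but limited to what FP16 can represent.
inline std::pair<float, float> ResolveFp16OutputRange(
    const std::optional<std::pair<float, float>>& clip_min_max) {
  if (!clip_min_max) {
    return {kFp16Lowest, kFp16Max};
  }
  // Precondition assumed by this sketch: the Clip bounds are ordered.
  assert(clip_min_max->first <= clip_min_max->second);
  const float lo = std::clamp(clip_min_max->first, kFp16Lowest, kFp16Max);
  const float hi = std::clamp(clip_min_max->second, kFp16Lowest, kFp16Max);
  return {lo, hi};
}
```

For example, a fused Relu6-style clip of `{0.0f, 6.0f}` passes through unchanged, while an upper bound of `1e6f` is reduced to 65504. The sketch requires C++17 for `std::optional` and `std::clamp`.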

onnxruntime/core/providers/xnnpack/nn/conv_base.cc

onnxruntime/core/providers/xnnpack/nn/conv_transpose.cc

onnxruntime/test/providers/checkers.cc

Ubuntu added 3 commits October 3, 2024 11:53

add conv fp16 kernel in xnnpack

7841444

fix lint

d4a863d

add missing changes

abbacdb

mszhanyi marked this pull request as draft October 3, 2024 13:13

Ubuntu added 2 commits October 3, 2024 13:37

add convtranspose

e84f9eb

fix lint

e33b780

mszhanyi marked this pull request as ready for review October 3, 2024 14:28

Ubuntu added 3 commits October 4, 2024 03:01

update torlerance

582c3c3

add one comment

72f69a9

update

85f8b9a

mszhanyi requested review from skottmckay and wejoncy October 4, 2024 05:40

skottmckay reviewed Oct 4, 2024

Ubuntu added 7 commits October 4, 2024 09:02

fix lint

d0b507a

improve compute type helper function

c824655

temp change

9c97d4a

Revert "temp change"

d2c7c2f

This reverts commit 9c97d4a.

add tolerance

4dcb746

lint

862daf6

F16 max value comment

56d0031

mszhanyi requested a review from skottmckay October 7, 2024 02:17

mszhanyi marked this pull request as draft October 7, 2024 03:50

lint

ebf5bf7

mszhanyi marked this pull request as ready for review October 7, 2024 04:02
