[Fix] in Xnnpack EP, the conversion for fused activation param isn't correct #23115

Open. mszhanyi wants to merge 17 commits into main.
Conversation

@mszhanyi (Contributor) commented Dec 16, 2024:

Description

In the Xnnpack EP, the conversion of the fused activation param is incorrect for FP16 models. This can cause the exception "lower bound must be below upper bound".
Because the CPU EP doesn't support FP16 activation fusion yet, the newly added test skips the comparison of the test results.
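
For illustration only (this is not the PR's actual diff): reinterpreting the two raw bytes of an FP16 scalar initializer as a 32-bit float yields a garbage clip bound, while an explicit half-to-float decode recovers the value. `HalfToFloat` below is a self-contained stand-in for ORT's `math::halfToFloat`, and the byte-level setup is a contrived sketch of the failure mode.

```cpp
#include <cstdint>
#include <cstring>
#include <iostream>

// Stand-in for ORT's math::halfToFloat: decode IEEE 754 half-precision bits.
float HalfToFloat(uint16_t h) {
  uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t frac = h & 0x3FFu;
  uint32_t bits;
  if (exp == 0x1Fu) {  // inf / NaN
    bits = sign | 0x7F800000u | (frac << 13);
  } else if (exp != 0) {  // normal number: rebias the exponent
    bits = sign | ((exp - 15 + 127) << 23) | (frac << 13);
  } else if (frac != 0) {  // subnormal half: renormalize into a normal float
    exp = 127 - 15 + 1;
    while ((frac & 0x400u) == 0) { frac <<= 1; --exp; }
    bits = sign | (exp << 23) | ((frac & 0x3FFu) << 13);
  } else {  // signed zero
    bits = sign;
  }
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

int main() {
  // An FP16 Clip bound of 6.0 is stored as just two raw bytes (0x4600).
  const uint16_t half_bits = 0x4600;
  unsigned char raw[2];
  std::memcpy(raw, &half_bits, sizeof(half_bits));

  // Buggy pattern: *reinterpret_cast<const float*>(raw) misreads the FP16
  // encoding (and reads past the 2-byte buffer), producing a garbage bound
  // that can end up below the lower bound.

  // Correct pattern: decode the half explicitly after checking the arg type.
  uint16_t h;
  std::memcpy(&h, raw, sizeof(h));
  std::cout << HalfToFloat(h) << "\n";  // prints 6
  return 0;
}
```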

Motivation and Context

Test Cases

```
2024-12-23T09:17:39.5038317Z [ RUN      ] XnnpackEP.TestNhwcConvReluFusion_FP16
2024-12-23T09:17:39.5079188Z 2024-12-23 09:17:39.505334389 [W:onnxruntime:TestNhwcConvReluFusion_FP16, session_state.cc:1263 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-12-23T09:17:39.5080635Z 2024-12-23 09:17:39.505405629 [W:onnxruntime:TestNhwcConvReluFusion_FP16, session_state.cc:1265 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-12-23T09:17:39.5099292Z [       OK ] XnnpackEP.TestNhwcConvReluFusion_FP16 (5 ms)
2024-12-23T09:17:39.5100453Z [ RUN      ] XnnpackEP.TestNhwcConvReluClipFusion_FP16
2024-12-23T09:17:39.5145494Z [       OK ] XnnpackEP.TestNhwcConvReluClipFusion_FP16 (5 ms)
```

Linux XnnPack on ARM64
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1576767&view=logs&j=317a4805-d006-5aac-2a82-238793a22a22&t=5481769e-95ca-5a84-acab-996d49e28b59

@mszhanyi mszhanyi marked this pull request as draft December 16, 2024 13:06
@mszhanyi mszhanyi marked this pull request as ready for review December 17, 2024 02:13
```cpp
        ? *reinterpret_cast<const float*>(value.raw_data().data())
        : value.float_data()[0];
int32_t arg_type;
if (GetType(arg, arg_type) && arg_type == ONNX_NAMESPACE::TensorProto_DataType_FLOAT16) {
```
Contributor:
What if GetType(arg, arg_type) failed here?

Contributor:

Generally type info is always available, so I think this is ok. Shape info may be missing depending on the model.

The Conv op looks to be set up to allow fp32, u8, s8 and optionally fp16. Should this also handle u8 and s8, or should ClipReluChecker limit fusion to fp32 and fp16?

@mszhanyi (Author) commented Dec 23, 2024:

So far, the core runtime Clip fusion only supports float too:

```cpp
if (initializer) {
  Initializer i(*initializer, graph.ModelPath());
  switch (initializer->data_type()) {
    case ONNX_NAMESPACE::TensorProto_DataType_FLOAT:
      value = *i.data<float>();
      break;
    // double isn't currently supported
    // case ONNX_NAMESPACE::TensorProto_DataType_DOUBLE:
    //   value = static_cast<float>(*i.data<double>());
    //   break;
    case ONNX_NAMESPACE::TensorProto_DataType_FLOAT16:
      value = math::halfToFloat(i.data<MLFloat16>()->val);
      break;
    default:
      ORT_THROW("Unexpected data type for Clip input of ", initializer->data_type());
  }
}
```
Shall we update them together?

@mszhanyi (Author):

cc @snnn

Contributor:

I'd leave the core Clip fusion as-is for now. Can be a separate PR if we think there's a use-case that would benefit.

Are you planning on updating ClipReluChecker to limit the types?

@mszhanyi (Author):

I may need more time to understand ClipQuantFusion
(https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/optimizer/qdq_transformer/clip_quantizelinear.cc),
so for that reason I don't have a plan for the next step yet.

Contributor:

I think ClipQuantFusion is a separate topic as that's about ignoring a Clip or Relu when the Q zp and scale make it redundant.

I was asking if the XNNPACK EP ClipReluChecker needs to be updated to either limit the types it allows, or whether FuseActivation needs to handle u8 or s8 input for the Clip min/max.

This has no checks on types:

```cpp
const NodeUnit* ClipReluChecker(const NodeUnit& node_unit,
                                const GraphViewer& graph,
                                const std::unordered_map<const Node*, const NodeUnit*>& supported_node_unit_map) {
```

But FuseActivation always uses a float in the activation params and with this PR is explicitly only checking for fp32 and fp16.

e.g. if there's a Conv node with u8 or s8 input it looks like ClipReluChecker will allow the activation, but FuseActivation won't do the right thing as the Clip min/max would be u8 or s8.
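
For concreteness, a minimal sketch of the guard being discussed, assuming ClipReluChecker can query the first input's element type via the `GetType` helper seen in the diff above; the exact integration point and variable names are assumptions, not this PR's final code:

```cpp
// Sketch: restrict Clip/Relu fusion to the element types FuseActivation
// can actually convert bounds for; bail out for u8/s8 (and anything else).
int32_t input_type = 0;
if (!GetType(node_unit.Inputs()[0].node_arg, input_type) ||
    (input_type != ONNX_NAMESPACE::TensorProto_DataType_FLOAT &&
     input_type != ONNX_NAMESPACE::TensorProto_DataType_FLOAT16)) {
  return nullptr;  // leave the activation unfused
}
```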

@mszhanyi (Author):

I checked https://onnx.ai/onnx/operators/onnx__Conv.html#type-constraints; an ONNX Conv node shouldn't have u8 or s8 inputs.

Comment on lines 134 to 135
```cpp
// So far, CPU EP doensn't support Fp16 Conv fusion, so verify_outputs is skipped.
RunAndVerifyOutputsWithEP(ort_model_path, "TestNhwcConvReluClipFusion_FP16", std::move(ep), feeds, params, {}, false);
```
Contributor:

Not quite following. There should still be valid output from the CPU EP even if it doesn't fuse, so why can't we use verify_outputs?

Suggested change:

```diff
-// So far, CPU EP doensn't support Fp16 Conv fusion, so verify_outputs is skipped.
+// So far, CPU EP doesn't support Fp16 Conv fusion, so verify_outputs is skipped.
 RunAndVerifyOutputsWithEP(ort_model_path, "TestNhwcConvReluClipFusion_FP16", std::move(ep), feeds, params, {}, false);
```

@mszhanyi (Author):

thx, fixed

@mszhanyi (Author) commented Dec 23, 2024:

So far, the CPU EP doesn't implement FP16 Clip fusion. The output verification fails because it looks like the CPU EP falls back to an FP32 Clip:

```cpp
// TODO Add the following activations:
// MlasTanhActivation,
// MlasLogisticActivation,
// MlasClipActivation,
```

To verify the correctness of the Xnnpack FP16 Conv fusion, I added a new test with a new FP16 model (Conv+Relu only).
The current test (Conv+Clip+Relu) is kept because I want to make sure that the Conv+Clip fusion can run, i.e. that the activation parameters are added correctly.
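
Roughly, the resulting pair of tests looks like this. This is a sketch: `conv_relu_fp16_path`, `feeds`, `params`, and `ep` are placeholders, the call shape follows the snippet quoted earlier, and enabling verification for the Relu-only test is an assumption based on the comment above:

```cpp
// New FP16 Conv+Relu test: the CPU EP can match the fused result
// (assumed), so output comparison stays on.
RunAndVerifyOutputsWithEP(conv_relu_fp16_path, "TestNhwcConvReluFusion_FP16",
                          std::move(ep), feeds, params, {}, /*verify_outputs=*/true);

// Existing FP16 Conv+Clip+Relu test: the CPU EP falls back to an FP32 Clip,
// so outputs can diverge; only verify that the fused graph builds and runs.
RunAndVerifyOutputsWithEP(ort_model_path, "TestNhwcConvReluClipFusion_FP16",
                          std::move(ep), feeds, params, {}, /*verify_outputs=*/false);
```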

@mszhanyi mszhanyi marked this pull request as draft December 20, 2024 02:53
@github-actions bot left a comment: You can commit the suggested changes from lintrunner.

onnxruntime/core/providers/cpu/fp16/fp16_activations.h (outdated; resolved)
@mszhanyi mszhanyi marked this pull request as ready for review December 23, 2024 10:02
@github-actions bot left a comment: You can commit the suggested changes from lintrunner.

onnxruntime/core/providers/xnnpack/detail/utils.cc (outdated; resolved)