Xp/decompse matmul split or matmul gather #25196

Open
wants to merge 53 commits into
base: master

Conversation

xipingyan
Contributor

@xipingyan xipingyan commented Jun 25, 2024

Details:

  • Decompose MatMul + some nodes + 3 Gather into 3 × (MatMul + nodes)
  • It can speed up the ViT int8 model by about 10% in throughput mode.
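
As a rough illustration of the decomposition described above, the sketch below checks on plain row-major matrices that one wide MatMul followed by picking one of three output blocks gives the same result as a narrower MatMul on the matching block of the weights. This is not the transformation code from this PR: `Matrix`, `matmul`, `slice_cols`, `fused_then_gather` and `decomposed` are hypothetical helpers, the Reshape/Transpose nodes and int8 details are ignored, and the weights are assumed to be laid out as three contiguous column blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix helper (hypothetical, for illustration only).
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Plain triple-loop MatMul: [n, k] x [k, m] -> [n, m].
Matrix matmul(const Matrix& a, const Matrix& b) {
    Matrix out(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t p = 0; p < a.cols; ++p)
            for (std::size_t j = 0; j < b.cols; ++j)
                out.at(i, j) += a.at(i, p) * b.at(p, j);
    return out;
}

// Copy columns [begin, begin + width) of m into a new matrix.
Matrix slice_cols(const Matrix& m, std::size_t begin, std::size_t width) {
    Matrix out(m.rows, width);
    for (std::size_t r = 0; r < m.rows; ++r)
        for (std::size_t c = 0; c < width; ++c)
            out.at(r, c) = m.at(r, begin + c);
    return out;
}

// Fused path: one wide MatMul, then select block `idx` (stands in for Gather).
Matrix fused_then_gather(const Matrix& x, const Matrix& w, std::size_t idx, std::size_t parts) {
    const std::size_t width = w.cols / parts;
    return slice_cols(matmul(x, w), idx * width, width);
}

// Decomposed path: slice the weights first, then run a narrower MatMul.
Matrix decomposed(const Matrix& x, const Matrix& w, std::size_t idx, std::size_t parts) {
    const std::size_t width = w.cols / parts;
    return matmul(x, slice_cols(w, idx * width, width));
}
```

Calling `fused_then_gather(x, w, idx, 3)` and `decomposed(x, w, idx, 3)` for `idx` in 0..2 should produce identical buffers, because each output element accumulates the same products in the same order.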

Tickets:

  • 133080

ceciliapeng2011 and others added 16 commits June 12, 2024 15:33
Signed-off-by: xipingya <xiping.yan@intel.com>
Fix "TransposeToReshape" trigger's new problem.

Signed-off-by: xipingya <xiping.yan@intel.com>
@github-actions github-actions bot added category: IE Tests OpenVINO Test: plugins and common category: CPU OpenVINO CPU plugin category: transformations OpenVINO Runtime library - Transformations labels Jun 25, 2024
@xipingyan xipingyan marked this pull request as ready for review July 4, 2024 06:29
@yuxu42
Contributor

yuxu42 commented Aug 19, 2024

Hi @dmitry-gorokhov could you please review the PR? Thanks!

@xipingyan xipingyan force-pushed the xp/decompse_matmul_split_or_matmul_gather branch from 185b4ce to c8a16f0 Compare August 26, 2024 07:26
…test_utils

Signed-off-by: xipingya <xiping.yan@intel.com>
@wenjiew wenjiew modified the milestones: 2024.4, 2024.5 Aug 30, 2024
return std::string("{") + ov::util::join(s) + "}";
}

void CheckNumberOfNodesWithTypeImpl(std::shared_ptr<const ov::Model> function,
Contributor

Just wondering, why was it required to move those functions to a separate file?

Contributor Author

Thanks @EgorDuplensky for reviewing this PR.
1: The paths src/tests/functional/plugin/shared/include/subgraph_tests/ and src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/ share this infrastructure. To avoid duplicating code, I moved it here, as @iefode suggested.
2: I have a special reason to split CheckNumberOfNodesWithTypeImpl, just copy original implementation.

return true;
}

pass::MatmulGatherDecomposition::MatmulGatherDecomposition() {
Contributor

@EgorDuplensky EgorDuplensky Sep 19, 2024

I understand that this transformation is trying to match a very specific pattern from LLM models, but shouldn't we have some heuristic for the weights size or something similar?
I mean, do we expect any model with any weights sizes to benefit from this transformation?
Also, please describe in the commit message / PR description the motivation for this transformation and why we expect it to speed up LLMs in the first place.

Contributor Author

Yes, I want to match models with a ViT-like structure, and I also added some heuristics: checks on rank, decompose_num, and a specific transpose order. Do you think these are not enough?

Contributor

This will probably be enough most of the time.
But, again, this is mostly about the reason we are getting the speed-ups.
I assume we observe speed-ups not because of the ranks, decompose_num, and transpose order, but because we become less memory-bound. But maybe I am wrong.
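
One rough way to reason about the memory-bound assumption above is arithmetic intensity: FLOPs divided by bytes moved for a single MatMul. The sketch below is only a back-of-the-envelope estimate and is not part of this PR. `arithmetic_intensity` and `looks_memory_bound` are hypothetical helpers, the byte count assumes each input is read once and the output is written once, and `machine_balance` (FLOPs per byte the hardware can sustain) is a value the caller would have to supply.

```cpp
#include <cassert>
#include <cstddef>

// FLOPs per byte for a [n, k] x [k, m] MatMul, assuming each input is read
// once and the output is written once (caches and blocking are ignored).
double arithmetic_intensity(std::size_t n, std::size_t k, std::size_t m, std::size_t bytes_per_elem) {
    const double dn = static_cast<double>(n);
    const double dk = static_cast<double>(k);
    const double dm = static_cast<double>(m);
    const double flops = 2.0 * dn * dk * dm;  // one multiply and one add per term
    const double bytes = static_cast<double>(bytes_per_elem) * (dn * dk + dk * dm + dn * dm);
    return flops / bytes;
}

// Hypothetical gate: treat the MatMul as memory-bound when its intensity is
// below the machine balance supplied by the caller.
bool looks_memory_bound(std::size_t n, std::size_t k, std::size_t m,
                        std::size_t bytes_per_elem, double machine_balance) {
    return arithmetic_intensity(n, k, m, bytes_per_elem) < machine_balance;
}
```

With a small number of rows `n`, the weight term `k * m` dominates the byte count and the intensity stays low. That is the regime where a size-based heuristic like the one discussed here would matter most.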

Contributor Author

Yes, I think it is related to the input data size. Because the shape is dynamic, it is hard to describe a custom heuristic for it; I just tried my best.

auto transpose_pattern =
wrap_type<opset1::Transpose>({reshape_pattern, ov::pass::pattern::wrap_type<ov::opset1::Constant>()},
ov::pass::pattern::consumers_count(decompose_num));
auto reshape2_pattern =
Contributor

Probably we can use pattern::optional (https://github.com/openvinotoolkit/openvino/blob/master/src/core/include/openvino/pass/pattern/op/optional.hpp) here to simplify the pattern logic:
auto reshape_pattern = wrap_type<opset1::Reshape>(...)
auto optional_transpose = pattern::optional<opset1::Transpose>(...)
auto reshape2_pattern = wrap_type<opset1::Reshape>({optional_transpose, ...}...)

example:

// complex pattern matching with `optional` and `wrap_type`
TEST(pattern, optional_complex_pattern_matching) {
    auto model_param = make_shared<op::v0::Parameter>(element::f32, ov::Shape{2, 3, 4});
    auto model_constant = make_shared<op::v0::Constant>(element::i32, ov::Shape{3}, std::vector<int>{2, 0, 1});

    auto model_abs = make_shared<op::v0::Abs>(model_param);
    auto model_transpose_negative = std::make_shared<op::v1::Transpose>(model_abs, model_constant);
    auto model_negative = std::make_shared<op::v0::Relu>(model_transpose_negative);

    auto model_relu = make_shared<op::v0::Relu>(model_param);
    auto model_transpose_positive = std::make_shared<op::v1::Transpose>(model_relu, model_constant);
    auto model_positive = std::make_shared<op::v0::Relu>(model_transpose_positive);

    auto pattern_param = ov::pass::pattern::any_input();
    auto pattern_constant = ov::pass::pattern::wrap_type<ov::op::v0::Constant>();
    auto pattern_relu = ov::pass::pattern::wrap_type<ov::op::v0::Relu>({pattern_param});
    auto pattern_transpose = ov::pass::pattern::optional<op::v1::Transpose>({pattern_relu, pattern_constant});
    auto pattern = ov::pass::pattern::wrap_type<op::v0::Relu>({pattern_transpose});

    TestMatcher matcher;
    ASSERT_FALSE(matcher.match(pattern, model_negative));
    ASSERT_TRUE(matcher.match(pattern, model_positive));
}

TEST(pattern, optional_full_match) {
    Shape shape{};
    auto model_input = std::make_shared<op::v0::Parameter>(element::i32, shape);
    auto model_relu = std::make_shared<op::v0::Relu>(model_input);
    auto model_relu1 = std::make_shared<op::v0::Relu>(model_relu->output(0));

    auto pattern_relu = ov::pass::pattern::optional<op::v0::Relu>();
    auto pattern_relu1 = std::make_shared<op::v0::Relu>(pattern_relu->output(0));

    TestMatcher tm;
    ASSERT_TRUE(tm.match(pattern_relu1, model_relu1));

Contributor Author

@xipingyan xipingyan Sep 24, 2024

That is a good option, but my case is a little special. It is as follows:

Reshape->Transpose->Others
Reshape->Reshape2->Others

Pattern::OR seems to be better. @itikhono
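
To make the difference concrete, here is a toy sketch of "either/or" matching on a linear chain of node type names. It does not use the OpenVINO pattern API. `Step`, `step_matches`, `match_chain` and `reshape_then_transpose_or_reshape` are hypothetical names for illustration only. The assumption is that the node after the first Reshape must be exactly one of Transpose or Reshape, which is the case an Or-style pattern expresses directly, whereas an optional node would also allow it to be skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A toy pattern step: matches a node whose type is any of `alternatives`.
struct Step {
    std::vector<std::string> alternatives;
};

bool step_matches(const Step& step, const std::string& type) {
    for (const auto& alt : step.alternatives)
        if (alt == type)
            return true;
    return false;
}

// Match the first pattern.size() nodes of a linear chain against the steps.
bool match_chain(const std::vector<Step>& pattern, const std::vector<std::string>& chain) {
    if (chain.size() < pattern.size())
        return false;
    for (std::size_t i = 0; i < pattern.size(); ++i)
        if (!step_matches(pattern[i], chain[i]))
            return false;
    return true;
}

// Reshape followed by either Transpose or a second Reshape.
std::vector<Step> reshape_then_transpose_or_reshape() {
    return {Step{std::vector<std::string>{"Reshape"}},
            Step{std::vector<std::string>{"Transpose", "Reshape"}}};
}
```

With this pattern, the chains Reshape, Transpose, Others and Reshape, Reshape, Others both match, while Reshape followed directly by Others does not.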

2:Remove opset1.hpp, replace with op::v1 format.

Signed-off-by: xipingya <xiping.yan@intel.com>
src/plugins/intel_cpu/src/transformations/cpu_opset/common/pass/matmul_split_decomposition.hpp
src/plugins/intel_cpu/src/transformations/cpu_opset/common/pass/matmul_split_decomposition.cpp

Signed-off-by: xipingya <xiping.yan@intel.com>
@github-actions github-actions bot removed the category: transformations OpenVINO Runtime library - Transformations label Sep 24, 2024
Labels
category: CPU OpenVINO CPU plugin category: IE Tests OpenVINO Test: plugins and common