Xp/decompse matmul split or matmul gather #25196
base: master
Conversation
Signed-off-by: xipingya <xiping.yan@intel.com>
Fix "TransposeToReshape" trigger's new problem. Signed-off-by: xipingya <xiping.yan@intel.com>
Hi @dmitry-gorokhov, could you please review the PR? Thanks!
Force-pushed from 185b4ce to c8a16f0
…test_utils Signed-off-by: xipingya <xiping.yan@intel.com>
    return std::string("{") + ov::util::join(s) + "}";
}

void CheckNumberOfNodesWithTypeImpl(std::shared_ptr<const ov::Model> function,
Just wondering, why was it required to move those functions to a separate file?
Thanks @EgorDuplensky for reviewing this PR.
1: Both src/tests/functional/plugin/shared/include/subgraph_tests/ and src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/ share this infrastructure, so I moved the functions here to avoid duplicating code, following @iefode's suggestion.
2: I have no special reason for splitting out CheckNumberOfNodesWithTypeImpl; I just copied the original implementation.
    return true;
}

pass::MatmulGatherDecomposition::MatmulGatherDecomposition() {
I understand that this transformation is trying to match a very specific pattern from LLM models, but shouldn't we have some heuristic for the weights size or something? I mean, do we expect any model with any weights size to benefit from this transformation?
Also, please describe in the commit message / PR description the motivation for having this transformation, i.e. why we expect it to speed up LLMs in the first place.
Yes, I want to match models with a ViT-like structure, and I have also added some heuristics: checking the rank, decompose_num, and a specific transpose order. Do you think these are not enough?
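For illustration, here is a minimal self-contained sketch of how such heuristic gates could be expressed. This is plain C++, not the actual PR code; the concrete values (rank 4, decompose_num 3, and the permutation {2, 0, 3, 1}) are assumptions for the example, not the values used in the transformation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical heuristic gates in the spirit of the checks mentioned above.
// The exact rank, branch count, and permutation are assumptions for this
// sketch, not the values used in the PR.
bool matches_decomposition_heuristics(std::size_t rank,
                                      std::size_t decompose_num,
                                      const std::vector<std::int64_t>& transpose_order) {
    if (rank != 4)           // expected rank of the Reshape output
        return false;
    if (decompose_num != 3)  // e.g. one branch each for Q, K and V
        return false;
    // Only one specific permutation is accepted; any other graph is left untouched.
    const std::vector<std::int64_t> expected_order{2, 0, 3, 1};
    return transpose_order == expected_order;
}
```

The idea is simply that the matcher callback bails out early on any graph that does not look exactly like the targeted attention subgraph.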
Probably this will be enough most of the time.
But, again, this is mostly about the reason we are getting the speed-ups.
I assume we observe speed-ups not because of the ranks, decompose_num and transpose order, but because we become less memory bound. But maybe I am wrong.
Yes, I think it is related to the input data size. Because the shape is dynamic, it is hard to describe a precise condition for it, so I did my best.
.../transformations/include/transformations/common_optimizations/matmul_split_decomposition.hpp (resolved)
auto transpose_pattern =
    wrap_type<opset1::Transpose>({reshape_pattern, ov::pass::pattern::wrap_type<ov::opset1::Constant>()},
                                 ov::pass::pattern::consumers_count(decompose_num));
auto reshape2_pattern =
probably we can use pattern::optional (https://github.com/openvinotoolkit/openvino/blob/master/src/core/include/openvino/pass/pattern/op/optional.hpp) here to simplify the pattern logic:
auto reshape_pattern = wrap_type<opset1::Reshape>(...)
auto optional_transpose = pattern::optional<opset1::Transpose>(...)
auto reshape2_pattern = wrap_type<opset1::Reshape>({optional_transpose, ...}...)
example:
openvino/src/core/tests/pattern.cpp, lines 686 to 720 @ 7cf0564:

// complex pattern matching with `optional` and `wrap_type`
TEST(pattern, optional_complex_pattern_matching) {
    auto model_param = make_shared<op::v0::Parameter>(element::f32, ov::Shape{2, 3, 4});
    auto model_constant = make_shared<op::v0::Constant>(element::i32, ov::Shape{3}, std::vector<int>{2, 0, 1});
    auto model_abs = make_shared<op::v0::Abs>(model_param);
    auto model_transpose_negative = std::make_shared<op::v1::Transpose>(model_abs, model_constant);
    auto model_negative = std::make_shared<op::v0::Relu>(model_transpose_negative);
    auto model_relu = make_shared<op::v0::Relu>(model_param);
    auto model_transpose_positive = std::make_shared<op::v1::Transpose>(model_relu, model_constant);
    auto model_positive = std::make_shared<op::v0::Relu>(model_transpose_positive);
    auto pattern_param = ov::pass::pattern::any_input();
    auto pattern_constant = ov::pass::pattern::wrap_type<ov::op::v0::Constant>();
    auto pattern_relu = ov::pass::pattern::wrap_type<ov::op::v0::Relu>({pattern_param});
    auto pattern_transpose = ov::pass::pattern::optional<op::v1::Transpose>({pattern_relu, pattern_constant});
    auto pattern = ov::pass::pattern::wrap_type<op::v0::Relu>({pattern_transpose});
    TestMatcher matcher;
    ASSERT_FALSE(matcher.match(pattern, model_negative));
    ASSERT_TRUE(matcher.match(pattern, model_positive));
}

TEST(pattern, optional_full_match) {
    Shape shape{};
    auto model_input = std::make_shared<op::v0::Parameter>(element::i32, shape);
    auto model_relu = std::make_shared<op::v0::Relu>(model_input);
    auto model_relu1 = std::make_shared<op::v0::Relu>(model_relu->output(0));
    auto pattern_relu = ov::pass::pattern::optional<op::v0::Relu>();
    auto pattern_relu1 = std::make_shared<op::v0::Relu>(pattern_relu->output(0));
    TestMatcher tm;
    ASSERT_TRUE(tm.match(pattern_relu1, model_relu1));
It is a good option, but my case is a little special. It is as follows:
Reshape->Transpose->Others
Reshape->Reshape2->Others
Pattern::Or seems to be a better fit here. @itikhono
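As a toy illustration of why Or-semantics (rather than optional-semantics) fit these two chains, here is a self-contained sketch in plain C++, not the OpenVINO pattern API: an Or pattern requires exactly one of the two middle nodes to be present, whereas pattern::optional would also accept a chain with no middle node at all.

```cpp
#include <string>
#include <vector>

// Toy matcher over node-type names, illustrating Or-semantics for the two
// alternative chains above: Reshape->Transpose->... or Reshape->Reshape2->...
// (Names are illustrative; this is not the OpenVINO pattern API.)
bool chain_matches(const std::vector<std::string>& ops) {
    // Both alternatives share the same prefix and have exactly one middle node.
    if (ops.size() != 3 || ops[0] != "Reshape")
        return false;
    // Or-semantics: exactly one of the two alternatives must be present.
    // (optional-semantics would additionally accept the 2-node chain
    // Reshape->Others, which is not one of the cases here.)
    return ops[1] == "Transpose" || ops[1] == "Reshape2";
}
```

In other words, Or constrains the match to the two concrete shapes of the subgraph, while optional would widen the match to a third shape that the transformation is not designed to handle.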
...mmon/transformations/src/transformations/common_optimizations/matmul_split_decomposition.cpp (resolved)
2: Remove opset1.hpp, replace with the op::v1 format. Signed-off-by: xipingya <xiping.yan@intel.com>
src/plugins/intel_cpu/src/transformations/cpu_opset/common/pass/matmul_split_decomposition.hpp src/plugins/intel_cpu/src/transformations/cpu_opset/common/pass/matmul_split_decomposition.cpp Signed-off-by: xipingya <xiping.yan@intel.com>
Details:
Tickets: