[Snippets] Support Fully Connected tokenization #26498
Conversation
@@ -249,14 +218,10 @@ bool TokenizeSnippets::AppropriateForSubgraph(const std::shared_ptr<const Node>

TokenizeSnippets::TokenizeSnippets(const SnippetsTokenization::Config& config) {
    MATCHER_SCOPE(TokenizeSnippets);
    enum continuation_strategy {
Note: the tokenization code is moved without changes to the tokenization_utils.cpp file, with one exception: the enum continuation_strategy is removed, since it was used with only one value: continuation_strategy::reset
if (transformation_callback(matmul)) {
    return false;
}
return ov::snippets::utils::tokenize_node(matmul, config);
Do we really need to extract the common tokenization logic to utilities just to tokenize one op?
Can we use the following code here?
auto subgraph = op::Subgraph::wrap_node_as_subgraph(matmul);
subgraph->get_rt_info()["originalLayersNames"] = matmul->get_friendly_name();
ov::replace_node(matmul, subgraph);
op::update_out_tensor_name(subgraph);
Probably I missed something 🤔 Could you please elaborate on your decision to use ov::snippets::utils::tokenize_node?
The only thought I have is that it connects to the existing Subgraphs on the first input 🤔
But why not?
It promotes code reuse and thus simplifies maintenance. We would also be able to extend this pass to cover MLP patterns more easily.
I agree that tokenize_node itself is quite complex, but that's because we have a bunch of checks there (like the ones related to Converts or the number of registers) that shouldn't really be in this function.
The only thought I have is that it connects to the existing Subgraphs on the first input

You are right, that is one of the reasons why I reused the helper for FC tokenization. Thanks to that, we cover e.g. "Transposes on inputs + FC" cases with tests.
Also, moving the tokenization logic into a separate helper is a good prerequisite for the tokenization refactoring. Currently, the tokenization looks as follows (I cut the code a bit):
auto tokenization_passes = manager.register_pass<ov::pass::GraphRewrite>();
tokenization_passes->add_matcher<TokenizeGNSnippets>();
tokenization_passes->add_matcher<TokenizeFCSnippets>(m_config);
tokenization_passes->add_matcher<TokenizeSnippets>(m_config);
TokenizeSnippets here is responsible for the tokenization of ops that were not tokenized by the previous passes. The pass contains a lot of checks related to different ops in one place (the is_supported_op lambda); as a result, it looks quite cumbersome.
Ideally, I would want to separate this large pass into a sequence of small matcher passes, each matching only specific ops and containing only the checks needed for the matched ops:
auto tokenization_passes = manager.register_pass<ov::pass::GraphRewrite>();
tokenization_passes->add_matcher<TokenizeGNSnippets>();
tokenization_passes->add_matcher<TokenizeFCSnippets>(m_config);
tokenization_passes->add_matcher<TokenizeUnaryEltwise>(m_config);
tokenization_passes->add_matcher<TokenizeBinaryEltwise>(m_config);
tokenization_passes->add_matcher<TokenizeTranspose>(m_config);
tokenization_passes->add_matcher<TokenizeMatmul>(m_config);
...
I believe this will improve the readability of the tokenization code, and could allow us to configure the pipeline more precisely (e.g. TokenizeTranspose and TokenizeMatmul are used only in tests, so they can be disabled by the plugin in production code).
And each of these new matchers can reuse ov::snippets::utils::tokenize_node -- that's another reason why I moved the tokenization code into a separate helper.
You are right, that is one of the reasons why I reused the helper for FC tokenization. Thanks to that, we cover e.g. "Transposes on inputs + FC" cases with tests

Cool! Thank you for the explanation! I just wanted to check that I fully got the goal of this PR 😊
Yeah, I saw some tests like FC->Eltwise->FC.

Ideally, I would want to separate this large pass into a sequence of small matcher passes, each matching only specific ops and containing only the checks needed for the matched ops

This is a really good idea! 🤔
* @param config tokenization config which regulates
* @return whether the node was tokenized or not
*/
bool tokenize_node(const std::shared_ptr<ov::Node>& node, const ov::snippets::pass::SnippetsTokenization::Config& config);
Probably we should create a static public method in the TokenizeSnippets pass instead of creating a new file with one function. What do you think?
I decided to move this into a separate file because:
- This is a common helper which can be reused in numerous tokenization passes in the future
- The helper is large, so it's a bit easier to have it in a separate file from a readability perspective (at least that's how it seems to me 😄 )
- This file may be extended with a bool tokenize_nodes(const ov::NodeVector& nodes) helper which tokenizes a node sequence. Now this is a part of the TokenizeMHASnippets pass, but this code can be easily extracted and reused (e.g. for MLP tokenization)
We can also move AppropriateForSubgraph here 😁
In general, whether we want it or not, we are moving from one universal tokenization pass towards several transformations responsible for particular pattern families (eltwise/MHA/MLP).
From this perspective, this attempt to derive and reuse some common logic seems like a step in the right direction.
I think we should go even further and create a dedicated class for tokenization (SnippetsNodeTokenizer or something similar). This class should have 3 methods: private bool check_necessary(node), public virtual bool check_sufficient(node), and public bool fuse(node).
- check_necessary will perform all the fundamental plugin-independent checks (like cyclic dependencies, num_results_children etc) and can potentially also gather some aggregated info (internal/external inputs, body parameters etc). If possible, we should separate the check from aggregated-info gathering (another method analyze).
- check_sufficient will be overridden by the plugin and will contain backend-specific checks, possibly partially based on the gathered aggregated info (hidden_data_count, unique_buffer_count) etc
- Finally, fuse will call if (check_necessary() && check_sufficient()) and create the subgraph based on the aggregated info.

So the idea is that the plugin creates its own instance of this SnippetsNodeTokenizer and passes it to all the tokenization transformations. The transformations can use this class to do all the dirty work, plus they can impose their own limitations of course. For example, the eltwise tokenization pass will additionally check op types and precisions.
@a-sidorova, @v-Golubev what do you think? I'm not saying we should refactor it now, but let's try to outline a more convenient and scalable design.
@IvanNovoselov your idea looks interesting, but I need some time to think about whether the proposed architecture is sufficient. Can we allocate a time slot in the next team meeting to discuss it?
Seems like we really need to discuss how the current logic of the tokenization can be refactored and improved.
Yes, sure, let's discuss it at the sync.
// We cannot add new node, that is not Convert, after Convert (that is start node) to avoid arithmetic problems with conversion
// We can add any new node in Subgraph after Convert (because after Input)
//    Parameter
//       |
//    Convert
//
// We cannot add new node, that isn't Convert, in Subgraph after existing Convert
//    Parameter
//      Relu
//    Convert
//
// But we can add new Convert in Subgraph after existing Convert
//    Parameter
//      Relu
//    Convert
//    Convert
//
It looks like this logic is obsolete, because we should be able to support Converts everywhere. Do you think we can remove it?
CC @a-sidorova, we discussed it offline
I think that we can indeed remove this logic. However, since this code is sensitive and affects the tokenization of all models, I deliberately tried to avoid any changes in this helper.
I suggest merging this PR without the proposed changes and doing them in a separate PR which will contain only this change: that would save us time on root-cause investigation in case of unexpected degradations.
Ok, fair enough
The PR has been created: #26843
👍🏼