[Hackathon 7th PPSCI No.12] Add amsgrad support to the Adam and AdamW optimizers -part #68079
Open
megemini wants to merge 45 commits into PaddlePaddle:develop from megemini:hack7_amsgrad

Commits
b45f2c4  [init] amsgrad  (megemini)
640be9b  [update] refer.h  (megemini)
2028825  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
caf919a  [Add] amsgrad gpu  (megemini)
aa289ad  [Add] amsgrad for adamw and fused  (megemini)
106f817  [Fix] adamw gpu kernel  (megemini)
fddb46a  [Update] fused adam kernel for gpu  (megemini)
d206442  [Update] xpu adam/adamw param list  (megemini)
8cc9b5b  [Update] tests for amsgrad  (megemini)
eb5de54  [Fix] moment2 max out settting values without amsgrad  (megemini)
7aa9d60  [Update] unittest passed for adam and adamw  (megemini)
96216e4  [Update] unittest passed for merged and fused amda  (megemini)
7398a2f  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
98abe71  [Update] make moment2_max optional  (megemini)
e159b70  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
4564d32  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
e2d2f9b  [Update] test_adamw_op.py with new test cast  (megemini)
7d7ddb1  [Update] adam adamw with amsgrad formula  (megemini)
d8d97ed  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
fc6204f  [Update] adam/adamw for test.cc  (megemini)
0144890  [Fix] xpu param name  (megemini)
c6942c0  [Fix] xpu param name & unittest  (megemini)
f9cb32e  [Fix] xpu param type  (megemini)
92ad89d  [Fix] xpu unittest  (megemini)
8e026cd  [Fix] xpu unittest  (megemini)
56d26df  [Fix] xpu unittest  (megemini)
26c7e63  [Fix] merged_adam_ op_compat.yaml  (megemini)
5aa6c40  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
ddb2035  [Fix] remove UNUSED  (megemini)
e41b66b  [Fix] remove UNUSED  (megemini)
a751804  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
1f2831a  [Update] unittest adam op  (megemini)
cfbd173  [Fix] op_compat.yaml  (megemini)
d371c41  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
9f977ac  [Update] assembly for adam adamw  (megemini)
1f74eb8  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
d6e2652  [Fix] adamw.cc for assembly jit gen  (megemini)
da2e743  [Update] adam with old ir test  (megemini)
1b9a6bf  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
d157301  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
1c05064  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
6544a48  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
f17d737  Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…  (megemini)
af27337  [Update] codestyle  (megemini)
d7bb19a  [Update] npu test rtol adamw  (megemini)
@@ -57,6 +57,9 @@ class FusedAdamOpMaker : public framework::OpProtoAndCheckerMaker {
     AddInput("LearningRate", "(Tensor, default Tensor<float>) Learning rate");
     AddInput("Moments1", "(Tensor) Input first moments").AsDuplicable();
     AddInput("Moments2", "(Tensor) Input second moments").AsDuplicable();
+    AddInput("Moments2Max", "(Tensor) Input second moments max for amsgrad")
+        .AsDispensable()
+        .AsDuplicable();
     AddInput("Beta1Pows",
              "(Tensor, default Tensor<float>) Input beta1 power accumulator")
         .AsDuplicable();
@@ -72,6 +75,10 @@ class FusedAdamOpMaker : public framework::OpProtoAndCheckerMaker {
     AddOutput("ParamsOut", "(Tensor) Output parameters").AsDuplicable();
     AddOutput("Moments1Out", "(Tensor) Output first moments").AsDuplicable();
     AddOutput("Moments2Out", "(Tensor) Output second moments").AsDuplicable();
+    AddOutput("Moments2MaxOut",
+              "(Tensor) Output second moments max for amsgrad")
+        .AsDispensable()
+        .AsDuplicable();
     AddOutput("Beta1PowsOut", "(Tensor) Output beta1 power accumulator")
         .AsDuplicable();
     AddOutput("Beta2PowsOut", "(Tensor) Output beta2 power accumulator")

Review comment on the .AsDispensable() line of the Moments2MaxOut output: Same as above.

@@ -122,6 +129,10 @@ class FusedAdamOpMaker : public framework::OpProtoAndCheckerMaker {
                   "Whether to use global beta_pow for whole model instead of "
                   "creating beta_pow for each parameter.")
         .SetDefault(false);
+    AddAttr<bool>("amsgrad",
+                  "(bool, default false) "
+                  "Whether to use the AMSGrad of this algorithm.")
+        .SetDefault(false);
 
     AddComment(R"DOC(
 Adam Optimizer.
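For readers unfamiliar with the variant, the role of the new second-moment-max state can be sketched as below. This is a minimal, self-contained illustration of a common AMSGrad formulation, not the kernel code from this PR: `AdamState` and `AdamStep` are hypothetical names, and the placement of bias correction and epsilon is an assumption that may differ from Paddle's implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter optimizer state. The field names mirror the
// operator inputs in the diff, but this is not one of Paddle's real types.
struct AdamState {
  explicit AdamState(std::size_t n)
      : moment1(n, 0.0f), moment2(n, 0.0f), moment2_max(n, 0.0f) {}

  std::vector<float> moment1;      // first moment estimate (m)
  std::vector<float> moment2;      // second moment estimate (v)
  std::vector<float> moment2_max;  // running max of v, used only with amsgrad
  float beta1_pow = 1.0f;          // accumulates beta1^t
  float beta2_pow = 1.0f;          // accumulates beta2^t
};

// One Adam step. With amsgrad == true, the denominator uses the running
// maximum of the second moment instead of its current value (assumed
// formulation; bias correction is applied after taking the max).
void AdamStep(std::vector<float>& param, const std::vector<float>& grad,
              AdamState& state, float lr, float beta1, float beta2,
              float epsilon, bool amsgrad) {
  state.beta1_pow *= beta1;
  state.beta2_pow *= beta2;
  for (std::size_t i = 0; i < param.size(); ++i) {
    state.moment1[i] = beta1 * state.moment1[i] + (1.0f - beta1) * grad[i];
    state.moment2[i] =
        beta2 * state.moment2[i] + (1.0f - beta2) * grad[i] * grad[i];

    float v = state.moment2[i];
    if (amsgrad) {
      // Keep the largest second moment seen so far; never let it shrink.
      state.moment2_max[i] = std::max(state.moment2_max[i], v);
      v = state.moment2_max[i];
    }

    const float m_hat = state.moment1[i] / (1.0f - state.beta1_pow);
    const float v_hat = v / (1.0f - state.beta2_pow);
    param[i] -= lr * m_hat / (std::sqrt(v_hat) + epsilon);
  }
}
```

With `amsgrad` set to false the sketch never reads or writes `moment2_max`, which is the behavior one would expect from an input declared as dispensable.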
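This changes the signature of an existing operator, which makes it a backward-incompatible upgrade. I am not sure whether that is what we want.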
Some models may already have been saved under the original protocol. If this is changed, those previously saved models may fail to load.
Hmm, as I read it, https://github.com/PaddlePaddle/Paddle/wiki/OP-Input-Output-Attribute-Compatibility-Modification requires adding AsDispensable. Is that still not enough?
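The intent behind marking the new input as dispensable can be sketched as follows. This is a simplified illustration under stated assumptions, not Paddle's loading logic: `InputSpec`, `SavedInputs`, and `MissingRequiredInputs` are hypothetical names, and whether this mechanism alone is sufficient for previously saved models is exactly the question raised in the thread.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for an operator definition and a saved
// operator description. These are NOT Paddle's real classes.
struct InputSpec {
  std::string name;
  bool dispensable;  // true: the input may be absent from a saved program
};

// Maps an input slot name to the variable names bound to it.
using SavedInputs = std::map<std::string, std::vector<std::string>>;

// Returns the names of required (non-dispensable) inputs that the saved
// description does not provide. An empty result means the description is
// still acceptable under the given operator definition.
std::vector<std::string> MissingRequiredInputs(
    const std::vector<InputSpec>& specs, const SavedInputs& saved) {
  std::vector<std::string> missing;
  for (const InputSpec& spec : specs) {
    if (!spec.dispensable && saved.find(spec.name) == saved.end()) {
      missing.push_back(spec.name);
    }
  }
  return missing;
}
```

In this sketch, a description saved before the change (with no `Moments2Max` entry) passes the check when `Moments2Max` is declared dispensable and fails it when the input is declared required.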