
【Hackathon 7th PPSCI No.12】Support amsgrad in the Adam and AdamW optimizers #949

Merged (2 commits) on Sep 4, 2024

Conversation

megemini (Contributor) commented Aug 29, 2024

PR types

Others

PR changes

Docs

Description

Design document for 【Hackathon 7th No.12】: support amsgrad in the Adam and AdamW optimizers.

Please review ~

paddle-bot bot commented Aug 29, 2024

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

megemini (Contributor, Author) commented Sep 3, 2024

@sunzhongkai588 @luotao1 Could you please take a look? Did I submit the RFC in the wrong place? PaddlePaddle/Paddle#67603 still shows the "signed up" status ~ 🫠

sunzhongkai588 (Contributor) commented Sep 3, 2024

> @sunzhongkai588 @luotao1 Could you please take a look? Did I submit the RFC in the wrong place? PaddlePaddle/Paddle#67603 still shows the "signed up" status ~ 🫠

It seems the script forgot to monitor the community repository.

HydrogenSulfate (Contributor) left a comment

LGTM

- Add an `amsgrad` option to Paddle's `Adam, AdamW` interfaces so that they support the `AMSGrad` algorithm.
- Add an `amsgrad` option to PaddleScience's `Adam, AdamW` interfaces so that they support the `AMSGrad` algorithm.

> **Note** PaddleScience's `Adam, AdamW` optimizers are implemented by calling the corresponding Paddle optimizers, and Paddle's `Adam, AdamW` algorithms in turn call back-end C++ operators. The `historical maximum of the squared gradients` required by `AMSGrad` must also be maintained inside those C++ operators. Therefore, `AMSGrad` can only be supported by modifying Paddle's `Adam, AdamW` optimizer interfaces; it cannot be supported in PaddleScience alone. (Implementing a standalone `AMSGrad` optimizer only in PaddleScience is outside the scope of this document.)
Contributor

Yes, this feature mainly needs support from the framework's C++ operators.
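
As a side note for readers of this thread: the only extra state AMSGrad needs is a running element-wise maximum of the second moment. The sketch below is a minimal NumPy illustration of the update rule; the variable names are chosen here for clarity and are not Paddle's actual kernel or its signature.

```python
import numpy as np

def adam_update(param, grad, m1, m2, m2_max, t,
                lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, amsgrad=False):
    """One illustrative Adam / AMSGrad step (not Paddle's actual kernel)."""
    m1 = beta1 * m1 + (1 - beta1) * grad           # first moment estimate
    m2 = beta2 * m2 + (1 - beta2) * grad * grad    # second moment estimate
    m1_hat = m1 / (1 - beta1 ** t)                 # bias correction
    if amsgrad:
        m2_max = np.maximum(m2_max, m2)            # historical max of the second moment
        m2_hat = m2_max / (1 - beta2 ** t)
    else:
        m2_hat = m2 / (1 - beta2 ** t)
    param = param - lr * m1_hat / (np.sqrt(m2_hat) + eps)
    return param, m1, m2, m2_max
```

In the proposed design, `moment2_max` plays the role of `m2_max` above and is maintained inside the C++ `adam_` operator.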

1000,
find_master,
False,
self._amsgrad,  # flag
Contributor

Does amsgrad need to be stored as a member variable on self here? Maybe passing amsgrad directly would be enough?

Contributor Author

That is because I quoted too little of the code here; it still can't be avoided ... 😅

class Adam(Optimizer):
    def __init__(
        self,
        learning_rate: float | LRScheduler = 0.001,
        beta1: float | Tensor = 0.9,
        beta2: float | Tensor = 0.999,
        epsilon: float | Tensor = 1e-8,
        parameters: (
            Sequence[Tensor] | Sequence[_AdamParameterConfig] | None
        ) = None,
        weight_decay: float | WeightDecayRegularizer | None = None,
        grad_clip: GradientClipBase | None = None,
        lazy_mode: bool = False,
        multi_precision: bool = False,
        use_multi_tensor: bool = False,
        amsgrad: bool = False,      # new flag
        name: str | None = None,
    ) -> None:
        ...
        self._amsgrad = amsgrad     # new flag, kept on self for later use

    ...

    def _append_optimize_op(self, block, param_and_grad):
        ...

        _ = _C_ops.adam_(           # call the underlying operator
            param_and_grad[0],
            param_and_grad[1],
            lr,
            moment1,
            moment2,
            moment2_max,            # new input: running max of moment2
            beta1_pow_acc,
            beta2_pow_acc,
            master_weight,
            found_inf,
            _beta1,
            _beta2,
            self._epsilon,
            self._lazy_mode,
            1000,
            find_master,
            False,
            self._amsgrad,          # new flag
        )

The underlying operator is called in a different method, while amsgrad is passed in when Adam is constructed, so a private attribute (self._amsgrad) is needed to carry it over ~

I have added this code to the document ~
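
For illustration only: assuming the `amsgrad` argument is added to `paddle.optimizer.Adam` exactly as proposed above (the flag is not available in Paddle releases predating this proposal), enabling the variant in a training step would look like this:

```python
import paddle

model = paddle.nn.Linear(10, 1)
opt = paddle.optimizer.Adam(
    learning_rate=1e-3,
    parameters=model.parameters(),
    amsgrad=True,   # proposed flag: switch Adam to the AMSGrad variant
)

loss = paddle.mean(model(paddle.randn([4, 10])))
loss.backward()
opt.step()        # the flag is forwarded to the C++ adam_ operator via self._amsgrad
opt.clear_grad()
```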

Contributor

Do the concrete code changes involve any distributed logic?

Contributor Author

Yes, they involve corresponding changes to the spmd rule code, which has been added to the document ~

HydrogenSulfate (Contributor) left a comment

LGTM

@HydrogenSulfate HydrogenSulfate merged commit 48cbc61 into PaddlePaddle:master Sep 4, 2024
1 check passed
@luotao1 luotao1 changed the title from "【Hackathon 7th No.12】Support amsgrad in the Adam and AdamW optimizers" to "【Hackathon 7th PPSCI No.12】Support amsgrad in the Adam and AdamW optimizers" on Sep 18, 2024