# chore: renew README forms. #792

Merged 1 commit on Jul 22, 2024.

**configs/pvt/README.md** (36 additions, 15 deletions)

# PVT

> [Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions](https://arxiv.org/abs/2102.12122)

## Introduction

PVT is a general backbone network for dense prediction without convolution operations. PVT introduces a pyramid structure into the Transformer to generate multi-scale feature maps for dense prediction tasks. PVT uses a gradual reduction strategy to control the size of the feature maps through the patch embedding layer, and proposes a spatial-reduction attention (SRA) layer to replace the traditional multi-head attention layer in the encoder, which greatly reduces the computation/memory overhead.[[1](#References)]

![PVT](https://user-images.githubusercontent.com/74176172/210046926-2322161b-a963-4603-b3cb-86ecdca41262.png)
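
At the tensor level, SRA keeps the full token length for queries but downsamples keys and values by the reduction ratio before computing attention. Below is a minimal single-head NumPy sketch of that idea, not MindCV's actual implementation: average pooling over `r x r` patches stands in for the paper's strided-convolution reduction, and the weight matrices are random placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sra(x, h, w, r, wq, wk, wv):
    """Single-head spatial-reduction attention (shape-level sketch).

    x: (n, h*w, c) token sequence; r: spatial reduction ratio;
    wq/wk/wv: (c, c) projection weights. Average pooling replaces
    the paper's strided-convolution reduction for brevity.
    """
    n, _, c = x.shape
    q = x @ wq  # queries keep the full h*w length
    # pool r x r neighborhoods so keys/values have (h*w)/r^2 tokens
    xr = x.reshape(n, h // r, r, w // r, r, c).mean(axis=(2, 4))
    k, v = xr.reshape(n, -1, c) @ wk, xr.reshape(n, -1, c) @ wv
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(c))  # (n, h*w, (h*w)/r^2)
    return attn @ v  # (n, h*w, c)

rng = np.random.default_rng(0)
h = w = c = 16
x = rng.standard_normal((1, h * w, c))
wq, wk, wv = (rng.standard_normal((c, c)) for _ in range(3))
print(sra(x, h, w, 4, wq, wk, wv).shape)  # (1, 256, 16)
```

With `h = w = 16` and `r = 4`, the attention map is 256 x 16 instead of 256 x 256, which is exactly where SRA saves compute and memory.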

## Results

Our reproduced model performance on ImageNet-1K is reported as follows.

Performance tested on Ascend 910* (8p) with graph mode.

<div align="center">

| Model | Top-1 (%) | Top-5 (%) | Params (M) | Batch Size | Recipe | Download |
|:--------:|:---------:|:---------:|:----------:|------------|--------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| pvt_tiny | 74.88 | 92.12 | 13.23 | 128 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/pvt/pvt_tiny_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/pvt/pvt_tiny-6676051f-910v2.ckpt) |

</div>

Performance tested on Ascend 910 (8p) with graph mode.

<div align="center">

| Model | Top-1 (%) | Top-5 (%) | Params (M) | Batch Size | Recipe | Download |
|:--------:|:---------:|:---------:|:----------:|------------|--------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|
| pvt_tiny | 74.81 | 92.18 | 13.23 | 128 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/pvt/pvt_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/pvt/pvt_tiny-6abb953d.ckpt) |

</div>

#### Notes

- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.

## Quick Start

### Preparation

#### Installation

Please refer to the [installation instruction](https://github.com/mindspore-lab/mindcv#installation) in MindCV.

#### Dataset Preparation

Please download the [ImageNet-1K](https://www.image-net.org/challenges/LSVRC/2012/index.php) dataset for model training and validation.

### Training

- Distributed Training

It is easy to reproduce the reported results with the pre-defined training recipe. For distributed training on multiple Ascend 910 devices, please run

```shell
# distributed training on multiple GPU/Ascend devices
mpirun -n 8 python train.py --config configs/pvt/pvt_tiny_ascend.yaml --data_dir /path/to/imagenet
```

> If the script is executed by the root user, the `--allow-run-as-root` parameter must be added to `mpirun`.
> If running on Ascend 910 devices, enable SATURATION_MODE via `export MS_ASCEND_CHECK_OVERFLOW_MODE="SATURATION_MODE"`.

Similarly, you can train the model on multiple GPU devices with the above `mpirun` command.

For detailed illustration of all hyper-parameters, please refer to [config.py](https://github.com/mindspore-lab/mindcv/blob/main/config.py).

**Note:** As the global batch size (batch_size x num_devices) is an important hyper-parameter, it is recommended to keep the global batch size unchanged for reproduction or adjust the learning rate linearly to a new global batch size.
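
As a concrete, hypothetical illustration of the linear scaling rule (the base learning rate below is a placeholder, not the value from `pvt_tiny_ascend.yaml` — read the real value from the recipe):

```python
def scale_lr(base_lr: float, base_global_batch: int, new_global_batch: int) -> float:
    """Linearly rescale the learning rate to a new global batch size."""
    return base_lr * new_global_batch / base_global_batch

# reference recipe: 8 devices x batch_size 128 = 1024 global batch;
# dropping to 4 devices halves the global batch, so the lr halves too
# (base_lr=0.001 is a placeholder -- check the yaml before relying on it)
print(scale_lr(0.001, base_global_batch=8 * 128, new_global_batch=4 * 128))  # 0.0005
```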

- Standalone Training

If you want to train or finetune the model on a smaller dataset without distributed training, please run:
# standalone training on a CPU/GPU/Ascend device
python train.py --config configs/pvt/pvt_tiny_ascend.yaml --data_dir /path/to/imagenet --distribute False
```

> If running on Ascend 910 devices, enable SATURATION_MODE via `export MS_ASCEND_CHECK_OVERFLOW_MODE="SATURATION_MODE"`.

### Validation

To validate the accuracy of the trained model, you can use `validate.py` and pass the checkpoint path with `--ckpt_path`.

```shell
python validate.py --model=pvt_tiny --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
```
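
For ad-hoc checks outside `validate.py`, the checkpoint can also be loaded through MindCV's model factory. This is a sketch, not an official recipe: `create_model` and its `checkpoint_path` argument are assumed to follow MindCV's timm-style factory API, so verify the signature against your installed version, and note that the input here is random rather than a properly preprocessed ImageNet image.

```python
import numpy as np
import mindspore as ms
from mindcv.models import create_model

ms.set_context(mode=ms.GRAPH_MODE)

# build pvt_tiny and load the trained weights (the path is a placeholder)
net = create_model("pvt_tiny", num_classes=1000, checkpoint_path="/path/to/ckpt")
net.set_train(False)

# a random 224x224 input; real evaluation must use the ImageNet eval transforms
x = ms.Tensor(np.random.rand(1, 3, 224, 224), ms.float32)
logits = net(x)
print(logits.shape)  # (1, 1000)
```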

### Deployment

To deploy online inference services with the trained model efficiently, please refer to the [deployment tutorial](https://mindspore-lab.github.io/mindcv/tutorials/deployment/).

## References

[1] Wang W, Xie E, Li X, et al. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 568-578.

**configs/pvtv2/README.md** (30 additions, 14 deletions)

# PVTV2

> [PVT v2: Improved Baselines with Pyramid Vision Transformer](https://arxiv.org/abs/2106.13797)

## Introduction

PVTv2 improves on the original PVT with three designs: linear-complexity spatial-reduction attention, overlapping patch embedding, and a convolutional feed-forward network, which make it a stronger backbone and bring significant improvements on fundamental vision tasks such as classification, detection, and
segmentation.[[1](#references)]
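
Where PVT's SRA shrinks keys and values by a ratio of the input size, PVTv2's linear SRA pools them to a fixed grid (7 x 7 in the paper), so attention cost grows linearly with the token count. A minimal NumPy sketch under stated assumptions: plain average pooling stands in for the paper's adaptive pooling, `h` and `w` are assumed divisible by the pool size, and the projections are random placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linear_sra(x, h, w, wq, wk, wv, p=7):
    """Linear SRA sketch: keys/values are pooled to a fixed p x p grid,
    so the attention map is (h*w) x p^2 at any input resolution."""
    n, _, c = x.shape
    q = x @ wq  # (n, h*w, c)
    # average-pool the token map from (h, w) down to (p, p)
    xr = x.reshape(n, p, h // p, p, w // p, c).mean(axis=(2, 4)).reshape(n, p * p, c)
    k, v = xr @ wk, xr @ wv
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(c))  # (n, h*w, p*p)
    return attn @ v  # (n, h*w, c)

rng = np.random.default_rng(0)
h = w = 14; c = 32  # 14 is divisible by p=7
x = rng.standard_normal((1, h * w, c))
wq, wk, wv = (rng.standard_normal((c, c)) for _ in range(3))
print(linear_sra(x, h, w, wq, wk, wv).shape)  # (1, 196, 32)
```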

## Results

Our reproduced model performance on ImageNet-1K is reported as follows.

Performance tested on Ascend 910* (8p) with graph mode.

<div align="center">

| Model | Top-1 (%) | Top-5 (%) | Params (M) | Batch Size | Recipe | Download |
|:---------:|:---------:|:---------:|:----------:|------------|-----------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------|
| pvt_v2_b0 | 71.25 | 90.50 | 3.67 | 128 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/pvtv2/pvt_v2_b0_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/pvt_v2/pvt_v2_b0-d9cd9d6a-910v2.ckpt) |

</div>

Performance tested on Ascend 910 (8p) with graph mode.

<div align="center">

| Model | Top-1 (%) | Top-5 (%) | Params (M) | Batch Size | Recipe | Download |
|:---------:|:---------:|:---------:|:----------:|------------|-----------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|
| pvt_v2_b0 | 71.50 | 90.60 | 3.67 | 128 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/pvtv2/pvt_v2_b0_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/pvt_v2/pvt_v2_b0-1c4f6683.ckpt) |

</div>

#### Notes

- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.

## Quick Start

### Preparation

#### Installation

Please refer to the [installation instruction](https://github.com/mindspore-ecosystem/mindcv#installation) in MindCV.

#### Dataset Preparation

Please download the [ImageNet-1K](https://www.image-net.org/challenges/LSVRC/2012/index.php) dataset for model training and validation.

### Training

* Distributed Training

It is easy to reproduce the reported results with the pre-defined training recipe. For distributed training on multiple Ascend 910 devices, please run

```shell
# distributed training on multiple GPU/Ascend devices
mpirun -n 8 python train.py --config configs/pvtv2/pvt_v2_b0_ascend.yaml --data_dir /path/to/imagenet
```

Similarly, you can train the model on multiple GPU devices with the above `mpirun` command.

For detailed illustration of all hyper-parameters, please refer to [config.py](https://github.com/mindspore-lab/mindcv/blob/main/config.py).

**Note:** As the global batch size (batch_size x num_devices) is an important hyper-parameter, it is recommended to keep the global batch size unchanged for reproduction or adjust the learning rate linearly to a new global batch size.

* Standalone Training

If you want to train or finetune the model on a smaller dataset without distributed training, please run:

```shell
# standalone training on a CPU/GPU/Ascend device
python train.py --config configs/pvtv2/pvt_v2_b0_ascend.yaml --data_dir /path/to/imagenet --distribute False
```

### Validation

To validate the accuracy of the trained model, you can use `validate.py` and pass the checkpoint path with `--ckpt_path`.

```shell
python validate.py -c configs/pvtv2/pvt_v2_b0_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
```

### Deployment

Please refer to the [deployment tutorial](https://mindspore-lab.github.io/mindcv/tutorials/deployment/) in MindCV.

## References

[1] Wang W, Xie E, Li X, et al. Pvt v2: Improved baselines with pyramid vision transformer[J]. Computational Visual Media, 2022, 8(3): 415-424.