@article{weightstandardization,
author = {Siyuan Qiao and Huiyu Wang and Chenxi Liu and Wei Shen and Alan Yuille},
title = {Weight Standardization},
journal = {arXiv preprint arXiv:1903.10520},
year = {2019},
}
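
For readers unfamiliar with the WS in GN+WS, the following is a minimal sketch of the arithmetic behind Weight Standardization: each output channel's convolution filter is shifted to zero mean and scaled to unit variance before it is used. The function names, the flat-list weight layout, and the placement of `eps` inside the square root are assumptions made for illustration; the reference implementation may differ in these details, and a real layer would recompute this on every forward pass so that gradients flow through it.

```python
import math


def standardize_filter(weights, eps=1e-5):
    """Return a zero-mean, (near) unit-variance copy of one flattened filter.

    `eps` guards against division by zero for constant filters; where it is
    added (inside the square root here) is an assumption of this sketch.
    """
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n
    std = math.sqrt(var + eps)
    return [(w - mean) / std for w in weights]


def weight_standardize(conv_weight, eps=1e-5):
    """Standardize every output channel's filter independently.

    `conv_weight` is a list with one entry per output channel; each entry is
    that channel's weights flattened over (in_channels, kH, kW).
    """
    return [standardize_filter(f, eps) for f in conv_weight]


# Hypothetical toy weights: 2 output channels, 4 weights each.
toy_weight = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 14.0]]
standardized = weight_standardize(toy_weight)
```

The standardization acts on the weights only, so it can be combined with an activation normalizer such as GN, which is the GN+WS setting reported in the tables.
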
Faster R-CNN
Backbone | Style | Normalization | Lr schd | box AP | mask AP | Download |
---|---|---|---|---|---|---|
R-50-FPN | pytorch | GN | 1x | 37.8 | - | - |
R-50-FPN | pytorch | GN+WS | 1x | 38.9 | - | model |
R-101-FPN | pytorch | GN | 1x | 39.8 | - | - |
R-101-FPN | pytorch | GN+WS | 1x | 41.4 | - | model |
X-50-32x4d-FPN | pytorch | GN | 1x | 36.5 | - | - |
X-50-32x4d-FPN | pytorch | GN+WS | 1x | 39.9 | - | model |
X-101-32x4d-FPN | pytorch | GN | 1x | 33.2 | - | - |
X-101-32x4d-FPN | pytorch | GN+WS | 1x | 41.8 | - | model |
Mask R-CNN
Backbone | Style | Normalization | Lr schd | box AP | mask AP | Download |
---|---|---|---|---|---|---|
R-50-FPN | pytorch | GN | 2x | 39.9 | 36.0 | - |
R-50-FPN | pytorch | GN+WS | 2x | 40.3 | 36.2 | model |
R-101-FPN | pytorch | GN | 2x | 41.6 | 37.3 | - |
R-101-FPN | pytorch | GN+WS | 2x | 42.0 | 37.3 | model |
X-50-32x4d-FPN | pytorch | GN | 2x | 39.2 | 35.5 | - |
X-50-32x4d-FPN | pytorch | GN+WS | 2x | 40.7 | 36.7 | model |
X-101-32x4d-FPN | pytorch | GN | 2x | 36.4 | 33.1 | - |
X-101-32x4d-FPN | pytorch | GN+WS | 2x | 42.1 | 37.7 | model |
R-50-FPN | pytorch | GN | 20-23-24e | 40.6 | 36.6 | - |
R-50-FPN | pytorch | GN+WS | 20-23-24e | 41.1 | 37.0 | model |
R-101-FPN | pytorch | GN | 20-23-24e | 42.3 | 38.1 | - |
R-101-FPN | pytorch | GN+WS | 20-23-24e | 43.0 | 38.4 | model |
X-50-32x4d-FPN | pytorch | GN | 20-23-24e | 39.6 | 35.9 | - |
X-50-32x4d-FPN | pytorch | GN+WS | 20-23-24e | 41.9 | 37.7 | model |
X-101-32x4d-FPN | pytorch | GN | 20-23-24e | 36.6 | 33.4 | - |
X-101-32x4d-FPN | pytorch | GN+WS | 20-23-24e | 43.4 | 38.7 | model |
Notes:
- GN+WS requires about 5% more memory than GN and is only about 5% slower.
- The paper uses a 20-23-24e lr schedule instead of 2x, so results for both schedules are listed.
- The authors also share the X-50-GN and X-101-GN pretrained models.
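
To make the difference between the two schedules concrete, here is a small sketch of step-decay learning rates. It assumes that 2x means 24 epochs with the learning rate dropped after epochs 16 and 22, and reads 20-23-24e as drops after epochs 20 and 23 with 24 epochs in total. The milestone values, the base learning rate of 0.02, the decay factor of 0.1, and all names below are illustrative assumptions, not values taken from the configs.

```python
def step_lr(base_lr, epoch, milestones, gamma=0.1):
    """Learning rate at a 0-indexed epoch under a step-decay schedule.

    The rate is multiplied by `gamma` once for every milestone that the
    epoch has reached or passed.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


def epochs_at_base_lr(milestones, total_epochs):
    """Number of epochs trained before the first decay step."""
    return sum(1 for e in range(total_epochs) if e < min(milestones))


# Assumed milestones for each schedule name (not read from any config file).
SCHEDULES = {
    "2x": (16, 22),
    "20-23-24e": (20, 23),
}
TOTAL_EPOCHS = 24  # assumed identical for both schedules
BASE_LR = 0.02     # assumed base learning rate

per_epoch_lr = {
    name: [step_lr(BASE_LR, e, ms) for e in range(TOTAL_EPOCHS)]
    for name, ms in SCHEDULES.items()
}
```

Under these assumptions both schedules train for the same number of epochs; 20-23-24e simply keeps the base learning rate for 20 epochs instead of 16 before the first decay.
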