By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.
We provide config files to reproduce the results in the paper for "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" on COCO object detection.
GCNet was first described in an arXiv preprint (arXiv:1904.11492). By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast, and effective approach to global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
@article{cao2019GCNet,
title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
journal={arXiv preprint arXiv:1904.11492},
year={2019}
}
The results on COCO 2017val are shown in the two tables below.
Backbone | Model | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|---|
R-50-FPN | Mask | GC(c3-c5, r16) | 1x | 4.5 | 0.533 | 10.1 | 38.5 | 35.1 | model |
R-50-FPN | Mask | GC(c3-c5, r4) | 1x | 4.6 | 0.533 | 9.9 | 38.9 | 35.5 | model |
R-101-FPN | Mask | GC(c3-c5, r16) | 1x | 7.0 | 0.731 | 8.6 | 40.8 | 37.0 | model |
R-101-FPN | Mask | GC(c3-c5, r4) | 1x | 7.1 | 0.747 | 8.6 | 40.8 | 36.9 | model |

Backbone | Model | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|---|
R-50-FPN | Mask | - | 1x | 3.9 | 0.543 | 10.2 | 37.2 | 33.8 | model |
R-50-FPN | Mask | GC(c3-c5, r16) | 1x | 4.5 | 0.547 | 9.9 | 39.4 | 35.7 | model |
R-50-FPN | Mask | GC(c3-c5, r4) | 1x | 4.6 | 0.603 | 9.4 | 39.9 | 36.2 | model |
R-101-FPN | Mask | - | 1x | 5.8 | 0.665 | 9.2 | 39.8 | 36.0 | model |
R-101-FPN | Mask | GC(c3-c5, r16) | 1x | 7.0 | 0.778 | 9.0 | 41.1 | 37.4 | model |
R-101-FPN | Mask | GC(c3-c5, r4) | 1x | 7.1 | 0.786 | 8.9 | 41.7 | 37.6 | model |
X-101-FPN | Mask | - | 1x | 7.1 | 0.912 | 8.5 | 41.2 | 37.3 | model |
X-101-FPN | Mask | GC(c3-c5, r16) | 1x | 8.2 | 1.055 | 7.7 | 42.4 | 38.0 | model |
X-101-FPN | Mask | GC(c3-c5, r4) | 1x | 8.3 | 1.037 | 7.6 | 42.9 | 38.5 | model |
X-101-FPN | Cascade Mask | - | 1x | - | - | - | 44.7 | 38.3 | model |
X-101-FPN | Cascade Mask | GC(c3-c5, r16) | 1x | - | - | - | 45.9 | 39.3 | model |
X-101-FPN | Cascade Mask | GC(c3-c5, r4) | 1x | - | - | - | 46.5 | 39.7 | model |
X-101-FPN | DCN Cascade Mask | - | 1x | - | - | - | 47.1 | 40.4 | model |
X-101-FPN | DCN Cascade Mask | GC(c3-c5, r16) | 1x | - | - | - | 47.9 | 40.9 | model |
X-101-FPN | DCN Cascade Mask | GC(c3-c5, r4) | 1x | - | - | - | 47.9 | 40.8 | model |
Notes:
- `SyncBN` is added in the backbone for all models in Table 2.
- `GC` denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
- `DCN` denotes replacing the 3x3 conv with a 3x3 Deformable Convolution in the `c3-c5` stages of the backbone.
- `r4` and `r16` denote ratio 4 and ratio 16 in the GC block, respectively.
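As a rough illustration of what a `GC(c3-c5, r16)` or `GC(c3-c5, r4)` entry might look like in a config, the sketch below builds a plugin-style backbone override and works out the reduced channel width the ratio implies per stage. The key names (`plugins`, `ContextBlock`, `stages`, `position`) follow the MMDetection 2.x plugin convention and are an assumption here; the config files shipped with these results may use different keys. The helper names, the choice of `after_conv3` as "the 1x1 conv", and the stage channel counts (standard bottleneck ResNet widths) are likewise assumptions made for illustration.

```python
# Hypothetical sketch, not the shipped config: key names assume the
# MMDetection 2.x plugin convention and may differ in other versions.


def gc_backbone_cfg(ratio_denominator):
    """Return a backbone override adding a GC block to stages c3-c5."""
    return dict(
        plugins=[
            dict(
                # r16 -> ratio 1/16, r4 -> ratio 1/4
                cfg=dict(type='ContextBlock', ratio=1.0 / ratio_denominator),
                # One flag per stage (c2, c3, c4, c5): c2 is skipped.
                stages=(False, True, True, True),
                # Assumed: "after 1x1 conv" means the last 1x1 conv
                # of each bottleneck block.
                position='after_conv3',
            )
        ]
    )


# Assumed output channels of c3, c4, c5 in a bottleneck ResNet backbone.
STAGE_CHANNELS = {'c3': 512, 'c4': 1024, 'c5': 2048}


def gc_bottleneck_channels(ratio_denominator):
    """Reduced channel width inside the GC block for each stage."""
    return {s: c // ratio_denominator for s, c in STAGE_CHANNELS.items()}


r16_cfg = gc_backbone_cfg(16)
r4_cfg = gc_backbone_cfg(4)

print(gc_bottleneck_channels(16))  # {'c3': 32, 'c4': 64, 'c5': 128}
print(gc_bottleneck_channels(4))   # {'c3': 128, 'c4': 256, 'c5': 512}
```

Under these assumptions, `r4` gives a four times wider bottleneck than `r16` at every stage, which is consistent with the slightly higher memory use reported for the `r4` rows in the tables.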