Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [PDF]

MIT licensed

Wuyang Chen*, Wei Huang*, Xinyu Gong, Boris Hanin, Zhangyang Wang

In NeurIPS 2022.

[code under development]

Overview

We link the convergence rate of a network to its architecture topology (connectivity patterns), and use this link to guide efficient neural architecture design.

Highlights:

  • We theoretically analyze the convergence of gradient descent for diverse neural network architectures, and find that their connectivity patterns largely determine the bound on their convergence rate.
  • From this theoretical analysis, we abstract two practical principles for designing a network’s connectivity pattern: "effective depth" $\bar{d}$ and "effective width" $\bar{m}$ (see the illustrative sketch after this list).
  • Both our convergence analysis and the effective depth/width principles are verified by experiments on diverse architectures and datasets. Our method further significantly accelerates neural architecture search without introducing any extra cost.
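For intuition only, the sketch below computes two simple graph statistics of a candidate DAG, the average source-to-sink path length and the average number of nodes per topological level, as rough stand-ins for the effective depth $\bar{d}$ and effective width $\bar{m}$. The paper's exact definitions differ; the function name and these proxies are assumptions made purely for illustration.

import networkx as nx

def depth_width_proxies(edges):
    """Rough proxies for the depth/width of a DAG given as (src, dst) edges.
    Illustrative statistics only, not the paper's exact definitions."""
    g = nx.DiGraph(edges)
    assert nx.is_directed_acyclic_graph(g)
    sources = [n for n in g if g.in_degree(n) == 0]
    sinks = [n for n in g if g.out_degree(n) == 0]
    # Depth proxy: average length of all source-to-sink paths.
    paths = [p for s in sources for t in sinks for p in nx.all_simple_paths(g, s, t)]
    depth_proxy = sum(len(p) - 1 for p in paths) / len(paths)
    # Width proxy: average number of nodes per topological level.
    level = {}
    for n in nx.topological_sort(g):
        level[n] = 1 + max((level[p] for p in g.predecessors(n)), default=0)
    width_proxy = len(level) / len(set(level.values()))
    return depth_proxy, width_proxy

# A 3-node chain with one skip connection: depth proxy 1.5, width proxy 1.0.
print(depth_width_proxies([(0, 1), (1, 2), (0, 2)]))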


Prerequisites

  • Ubuntu 16.04
  • Python 3.6.9
  • CUDA 10.1 (lower versions may work but were not tested)
  • NVIDIA GPU + cuDNN v7.3

This repository has been tested on a GTX 1080Ti. Configurations may need to be changed on different platforms.

Usage

1. mlp_code

This code base trains an MLP network defined by an arbitrary DAG (directed acyclic graph), e.g., the three examples in our Figure 3.
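The exact model lives in mlp_code; the snippet below is only a minimal sketch of the idea, assuming each non-input DAG node sums its predecessors' features and applies Linear + ReLU, with node 0 receiving the input and the last node feeding a classifier head. The class name DAGMLP and these conventions are hypothetical, not the repository's API.

import torch
import torch.nn as nn

class DAGMLP(nn.Module):
    """Minimal sketch of an MLP whose layer connectivity follows a DAG.

    `edges` is a list of (src, dst) pairs over nodes 0..n-1; node 0 receives
    the input and node n-1 produces the output. Each non-input node sums the
    features of its predecessors and applies Linear + ReLU."""
    def __init__(self, edges, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.edges = edges
        self.n_nodes = max(max(e) for e in edges) + 1
        self.input_proj = nn.Linear(in_dim, hidden_dim)
        self.node_ops = nn.ModuleList(
            [nn.Linear(hidden_dim, hidden_dim) for _ in range(self.n_nodes)]
        )
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        feats = [None] * self.n_nodes
        feats[0] = self.input_proj(x)
        for node in range(1, self.n_nodes):  # assumes nodes are topologically ordered
            preds = [feats[s] for (s, d) in self.edges if d == node]
            feats[node] = torch.relu(self.node_ops[node](sum(preds)))
        return self.head(feats[-1])

# Example: a 4-node DAG with a skip connection from node 0 to node 3.
model = DAGMLP(edges=[(0, 1), (1, 2), (2, 3), (0, 3)], in_dim=784, hidden_dim=64, out_dim=10)
print(model(torch.randn(2, 784)).shape)  # torch.Size([2, 10])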

2. TENAS + DAG

Modified from TENAS. On top of TENAS, we further reduce the search cost by avoiding evaluating supernets of moderate depth and width.

3. WOT + DAG

Modified from WOT. On top of WOT (Neural Architecture Search Without Training), we further reduce the search cost by avoiding evaluating bad architectures of extreme depth or extreme width.
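As a rough sketch of how such pruning can cut search cost, the snippet below filters out candidate DAGs whose depth/width proxies (from the earlier sketch) fall outside a moderate range before spending any training-free evaluations. The score_without_training callable, the candidate's edges attribute, and the thresholds are hypothetical placeholders, not the repository's interface.

def search_with_pruning(candidates, score_without_training,
                        depth_range=(3.0, 12.0), width_range=(2.0, 8.0)):
    """Skip candidates of extreme depth or width, then rank the rest with a
    training-free score (e.g., a WOT-style metric). Thresholds are illustrative."""
    kept = []
    for arch in candidates:
        d, m = depth_width_proxies(arch.edges)  # proxies from the earlier sketch
        if not (depth_range[0] <= d <= depth_range[1]):
            continue  # prune overly shallow or overly deep DAGs
        if not (width_range[0] <= m <= width_range[1]):
            continue  # prune overly narrow or overly wide DAGs
        kept.append(arch)
    # Only the surviving candidates pay for a training-free evaluation.
    return max(kept, key=score_without_training)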

Citation

@article{chen2022deep,
  title={Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis},
  author={Chen, Wuyang and Huang, Wei and Gong, Xinyu and Hanin, Boris and Wang, Zhangyang},
  journal={Advances in Neural Information Processing Systems},
  year={2022}
}
