https://github.com/vita-group/architecture_convergence
[NeurIPS 2022] "Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis" by Wuyang Chen*, Wei Huang*, Xinyu Gong, Boris Hanin, Zhangyang Wang
convergence-analysis neural-architectures ntk
- Host: GitHub
- URL: https://github.com/vita-group/architecture_convergence
- Owner: VITA-Group
- License: mit
- Created: 2022-05-11T16:49:43.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-11T18:31:18.000Z (almost 3 years ago)
- Last Synced: 2025-03-29T09:42:09.626Z (6 months ago)
- Topics: convergence-analysis, neural-architectures, ntk
- Language: Python
- Size: 543 KB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
# Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [[PDF](https://arxiv.org/pdf/2205.05662.pdf)]
Wuyang Chen*, Wei Huang*, Xinyu Gong, Boris Hanin, Zhangyang Wang
In NeurIPS 2022.
[code under development]
## Overview
We link the convergence rate of a neural network to its architecture topology (its connectivity pattern), and use this link to guide efficient neural architecture design.
Highlights:
* We first theoretically analyze the convergence of gradient descent for diverse neural network architectures, and find that the connectivity pattern largely determines the bound on the convergence rate.
* From this analysis, we distill two practical principles for designing a network's connectivity pattern: "effective depth" $\bar{d}$ and "effective width" $\bar{m}$ (see the illustrative sketch after this list).
* Both our convergence analysis and the effective depth/width principles are verified by experiments on diverse architectures and datasets. Our method can further significantly accelerate neural architecture search without introducing any extra cost.
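The paper defines $\bar{d}$ and $\bar{m}$ precisely from the connectivity pattern. As a rough, purely illustrative proxy (an assumption of this sketch, not the paper's formulas), one can enumerate the input-to-output paths of an architecture's DAG and look at how many there are and how long they are; skip connections shorten the typical path while multiplying the number of parallel paths.

```python
from collections import Counter

def path_length_stats(adj, source, sink):
    """Count source->sink paths in a small DAG (adjacency dict {node: [successors]}),
    grouped by path length. Purely illustrative: the paper's effective depth/width
    are derived analytically from the connectivity pattern, not by enumeration."""
    lengths = Counter()

    def dfs(node, depth):
        if node == sink:
            lengths[depth] += 1
            return
        for nxt in adj.get(node, []):
            dfs(nxt, depth + 1)

    dfs(source, 0)
    return lengths

# A 4-node chain 0 -> 1 -> 2 -> 3 with two skip connections, 0 -> 2 and 1 -> 3.
stats = path_length_stats({0: [1, 2], 1: [2, 3], 2: [3]}, source=0, sink=3)
print(stats)  # Counter({2: 2, 3: 1}): two length-2 paths, one length-3 path
n_paths = sum(stats.values())
print(n_paths, sum(l * c for l, c in stats.items()) / n_paths)  # 3 paths, average length 2.33
```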
## Prerequisites
- Ubuntu 16.04
- Python 3.6.9
- CUDA 10.1 (lower versions may work but were not tested)
- NVIDIA GPU + cuDNN v7.3

This repository has been tested on a GTX 1080 Ti. Configurations may need to be changed on different platforms.
## Usage
### 1. `mlp_code`
This code base is for training an MLP network defined by an arbitrary DAG (directed acyclic graph), e.g., the three examples in our Figure 3.
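Below is a minimal sketch of what "an MLP defined by a DAG" can look like, in plain PyTorch; the class and argument names are hypothetical and are not the interface of `mlp_code`. Each non-input node applies a linear layer and ReLU to the sum of its predecessors' outputs.

```python
import torch
import torch.nn as nn

class DAGMLP(nn.Module):
    """Illustrative sketch (not the repo's implementation) of an MLP whose hidden
    nodes are wired by an arbitrary DAG: every non-input node applies Linear + ReLU
    to the sum of its predecessors' outputs."""

    def __init__(self, in_dim, hidden_dim, out_dim, edges, num_nodes):
        super().__init__()
        # edges: list of (src, dst) with src < dst; node 0 is the input node,
        # node num_nodes - 1 feeds the classifier head. Every non-input node is
        # assumed to have at least one incoming edge.
        self.edges = edges
        self.num_nodes = num_nodes
        self.input_proj = nn.Linear(in_dim, hidden_dim)
        self.node_layers = nn.ModuleList(
            [nn.Linear(hidden_dim, hidden_dim) for _ in range(num_nodes - 1)]
        )
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        feats = [torch.relu(self.input_proj(x))]          # output of node 0
        for j in range(1, self.num_nodes):
            preds = [feats[src] for src, dst in self.edges if dst == j]
            agg = torch.stack(preds).sum(dim=0)           # aggregate predecessor outputs
            feats.append(torch.relu(self.node_layers[j - 1](agg)))
        return self.head(feats[-1])

# A 4-node chain 0 -> 1 -> 2 -> 3 with one extra skip edge 0 -> 3.
net = DAGMLP(in_dim=32, hidden_dim=64, out_dim=10,
             edges=[(0, 1), (1, 2), (2, 3), (0, 3)], num_nodes=4)
print(net(torch.randn(8, 32)).shape)  # torch.Size([8, 10])
```

In this sketch, changing only `edges` changes the connectivity pattern (and hence the effective depth/width) while the parameter count, which depends only on `num_nodes`, stays the same.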
### 2. TENAS + DAG

Modified from [TENAS](https://github.com/VITA-Group/TENAS).
On top of TENAS, we further reduce the search cost by avoiding evaluating the supernet of moderate depth and width.
### 3. WOT + DAG

Modified from [WOT](https://github.com/BayesWatch/nas-without-training).
On top of WOT (Neural Architecture Search Without Training), we further reduce the search cost by avoiding evaluating bad architectures of extreme depth or extreme width.
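A minimal sketch of such a pre-screening step is shown below. The depth/width measures (longest input-to-output path and maximum fan-in), the thresholds, and the candidate encoding are all assumptions of this illustration, not the repo's actual code; surviving candidates would then be scored by WOT's training-free metric as usual.

```python
def dag_depth(edges, num_nodes):
    """Longest input-to-output path in a DAG with nodes 0..num_nodes-1 and edges (src, dst), src < dst."""
    longest = [0] * num_nodes
    for src, dst in sorted(edges, key=lambda e: e[1]):  # process edges in topological order of dst
        longest[dst] = max(longest[dst], longest[src] + 1)
    return longest[-1]

def dag_width(edges, num_nodes):
    """A simple width proxy: the largest fan-in over all non-input nodes."""
    fan_in = [0] * num_nodes
    for _, dst in edges:
        fan_in[dst] += 1
    return max(fan_in[1:])

def prescreen(candidates, depth_range=(2, 8), width_range=(1, 4)):
    """Drop candidates whose depth or width proxy is extreme (outside the given ranges),
    so only 'moderate' architectures are scored. Thresholds are placeholders."""
    return [
        (edges, n) for edges, n in candidates
        if depth_range[0] <= dag_depth(edges, n) <= depth_range[1]
        and width_range[0] <= dag_width(edges, n) <= width_range[1]
    ]

# Example: a plain 4-node chain vs. a densely connected 4-node DAG.
candidates = [
    ([(0, 1), (1, 2), (2, 3)], 4),
    ([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)], 4),
]
print([dag_depth(e, n) for e, n in candidates])          # [3, 3]
print([dag_width(e, n) for e, n in candidates])          # [1, 3]
print(len(prescreen(candidates, width_range=(1, 2))))    # 1: the dense DAG is screened out
```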
## Citation

```
@article{chen2022deep,
  title={Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis},
  author={Chen, Wuyang and Huang, Wei and Gong, Xinyu and Hanin, Boris and Wang, Zhangyang},
  journal={Advances in Neural Information Processing Systems},
  year={2022}
}
```