## ENAS-Tensorflow

I will explain the code of Efficient Neural Architecture Search (ENAS), especially the micro search case.

Unlike the authors' code, this code works in a Windows 10 environment, and you can use PNG files as datasets.

You can also apply data augmentation using "n_aug_img", which is explained below.

## Environment
- OS: Windows 10 (Ubuntu 16.04 also works)

- GPU / RAM: 1080 Ti / 32 GB

- Python 3.5

- TensorFlow-GPU version: 1.4.0rc2

- OpenCV 3.4.1

## How to run

**First, unpack the attached data as shown below.**

![Picture 1](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/unpack.PNG)

**Next, you should change the flags below to suit your situation.**

```python
DEFINE_string("output_dir", "./output", "")
DEFINE_string("train_data_dir", "./data/train", "")
DEFINE_string("val_data_dir", "./data/valid", "")
DEFINE_string("test_data_dir", "./data/test", "")
DEFINE_integer("channel", 1, "MNIST: 1, Cifar10: 3")
DEFINE_integer("img_size", 32, "enlarge image size")
DEFINE_integer("n_aug_img", 1, "if 2: num_img: 55000 -> aug_img: 110000, elif 1: False")
```
It is recommended to set "n_aug_img" to 1 when searching for the child network, and to 2~4 when training the found child network.
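These DEFINE_* helpers are not TensorFlow built-ins; here is a minimal sketch of how they are presumably defined, assuming thin wrappers around tf.app.flags (an assumption; check the repo's flag utilities for the real definitions):

```python
# Presumed definitions of the DEFINE_* helpers used above: thin wrappers
# around tf.app.flags (TF 1.x). This is an assumption, not repo code.
import tensorflow as tf

def DEFINE_string(name, default_value, doc_string):
    tf.app.flags.DEFINE_string(name, default_value, doc_string)

def DEFINE_integer(name, default_value, doc_string):
    tf.app.flags.DEFINE_integer(name, default_value, doc_string)

FLAGS = tf.app.flags.FLAGS  # e.g. FLAGS.n_aug_img is available after parsing
```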

**Then, you can train the ENAS controller with the following short command:**
```
python main_controller_child_trainer.py
```
**After it finishes, you can train the child network with the following commands:**

```
# Case of MNIST
python main_child_trainer.py --child_fixed_arc "1 2 1 3 0 1 0 4 1 1 1 1 0 1 0 1 1 0 0 1 0 1 0 4 1 0 2 0 0 3 1 1 0 0 0 0 4 1 1 0"
```

```
# Case of CIFAR 10
python main_child_trainer.py --child_fixed_arc "1 0 1 1 1 1 0 0 1 1 0 0 0 3 0 3 1 3 1 1 1 1 0 3 0 3 0 3 1 3 0 1 1 3 0 2 0 3 1 0"
```

```
# Case of Welding Defects
python main_child_trainer.py --child_fixed_arc "1 0 0 1 0 0 1 1 2 2 1 1 1 1 1 2 1 0 0 0 0 0 0 3 2 2 1 0 2 0 2 3 0 3 4 0 1 0 3 2"
```

A string like "1 2 1 3 0 1 ~" in the commands above is the output of main_controller_child_trainer.py.

The first 20 numbers encode the convolution cell, and the remaining 20 encode the reduction cell, as sketched below.
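For concreteness, here is a hypothetical decoder (not part of the repo) for such a string, assuming each node contributes four numbers (input index, op, input index, op) and the five candidate operations listed in _enas_layers below:

```python
# Hypothetical decoder for a child_fixed_arc string. Assumes each of the 5
# nodes per cell contributes 4 numbers (input index, op, input index, op),
# and ops 0-4 map to the candidates listed in _enas_layers below.
OPS = ["conv 3x3", "conv 5x5", "avg pool", "max pool", "identity"]

def decode_arc(arc_str):
    nums = [int(n) for n in arc_str.split()]
    for name, cell in [("Convolution cell", nums[:20]),
                       ("Reduction cell", nums[20:])]:
        print(name)
        for node, i in enumerate(range(0, len(cell), 4)):
            idx_x, op_x, idx_y, op_y = cell[i:i + 4]
            print("  node %d: (input %d, %s) + (input %d, %s)"
                  % (node + 2, idx_x, OPS[op_x], idx_y, OPS[op_y]))

decode_arc("1 2 1 3 0 1 0 4 1 1 1 1 0 1 0 1 1 0 0 1 "
           "0 1 0 4 1 0 2 0 0 3 1 1 0 0 0 0 4 1 1 0")
```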

## Result

### 1. ENAS cells discovered in the micro search space

After training, we obtained the following child_arc_seq for each dataset and visualized it as shown below.

#### MNIST

```
"1 2 1 3 0 1 0 4 1 1 1 1 0 1 0 1 1 0 0 1 0 1 0 4 1 0 2 0 0 3 1 1 0 0 0 0 4 1 1 0"
```


![Picture 2](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/MNIST_convCell.png)


![Picture 3](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/MNIST_Reduction_cell.png)

#### CIFAR 10

```
"1 0 1 1 1 1 0 0 1 1 0 0 0 3 0 3 1 3 1 1 1 1 0 3 0 3 0 3 1 3 0 1 1 3 0 2 0 3 1 0"
```


![Picture 2](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/CIFAR10_Convolution_cell.png)


![Picture 3](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/CIFAR10_Reduction_cell.png)

#### Welding Defects

```
"1 0 0 1 0 0 1 1 2 2 1 1 1 1 1 2 1 0 0 0 0 0 0 3 2 2 1 0 2 0 2 3 0 3 4 0 1 0 3 2"
```


![Picture 2](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Welding_Defects_Convolutional_Cell.png)


![Picture 3](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Welding_Defects_Reduction_Cell.png)

### 2. Final structure of the child network

#### MNIST

![Picture 4](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/MNIST_final_network.png)

#### CIFAR 10

![Picture 4](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/CIFAR10_final_network.png)

#### Welding Defects

![Picture 4](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Welding_final_network.png)

### 3. Test Accuracy

```
MNIST
Test Accuracy: 99.77%
```

```
CIFAR 10
Test Accuracy:
```

```
Welding Defects
Test Accuracy: 100.00%
```

### 4. Graphs

- Controller validation accuracy (reward)
- Child network loss & test accuracy for the MNIST dataset
- Child network loss & test accuracy for the Welding Defects dataset

## Explanation

### 1. Controller

First, we will build the sampler as shown in the picture below.


![Picture 5](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Controller_init.png)


Then we will build the controller using the sampler's outputs "next_c_1, next_h_1".


![Picture 6](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Controller.PNG)


After obtaining "next_c_5, next_h_5", you must do the following to update "Anchors" and "Anchors_w_1".


![Picture 7](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Anchors_appen.PNG)

### 2. Controller_Loss

To enable the Controller to make better networks, ENAS uses REINFORCE with a moving average baseline to reduce variance.

```python
# Pseudocode: accumulate log-probabilities and entropies over all sampled
# input indices and operations ("for all index" / "for all op_id" are schematic).
for all index:
    curr_log_prob = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=index)
    log_prob += curr_log_prob
    curr_ent = tf.stop_gradient(tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=tf.nn.softmax(logits)))
    entropy += curr_ent

for all op_id:
    curr_log_prob = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=op_id)
    log_prob += curr_log_prob
    curr_ent = tf.stop_gradient(tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=tf.nn.softmax(logits)))
    entropy += curr_ent

# One sampler pass per cell type; the second pass reuses the LSTM state.
arc_seq_1, entropy_1, log_prob_1, c, h = self._build_sampler(use_bias=True)  # convolution cell
arc_seq_2, entropy_2, log_prob_2, _, _ = self._build_sampler(prev_c=c, prev_h=h)  # reduction cell
self.sample_entropy = entropy_1 + entropy_2
self.sample_log_prob = log_prob_1 + log_prob_2
```

```python
# Reward = shuffled-validation accuracy, optionally plus an entropy bonus;
# the baseline is an exponential moving average of past rewards.
self.valid_acc = (tf.to_float(child_model.valid_shuffle_acc) /
                  tf.to_float(child_model.batch_size))
self.reward = self.valid_acc

if self.entropy_weight is not None:
    self.reward += self.entropy_weight * self.sample_entropy

self.sample_log_prob = tf.reduce_sum(self.sample_log_prob)
self.baseline = tf.Variable(0.0, dtype=tf.float32, trainable=False)
baseline_update = tf.assign_sub(
    self.baseline, (1 - self.bl_dec) * (self.baseline - self.reward))

with tf.control_dependencies([baseline_update]):
    self.reward = tf.identity(self.reward)

# REINFORCE loss: log-probability of the sampled architecture,
# scaled by the advantage (reward minus baseline).
self.loss = self.sample_log_prob * (self.reward - self.baseline)
```
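The tf.assign_sub update above makes the baseline an exponential moving average of past rewards. A toy sketch in plain Python, with hypothetical reward values, shows how the baseline tracks the reward and shrinks the advantage:

```python
# Toy illustration of the moving-average baseline above:
# baseline -= (1 - bl_dec) * (baseline - reward)  is the same as
# baseline  = bl_dec * baseline + (1 - bl_dec) * reward.
def update_baseline(baseline, reward, bl_dec=0.99):
    return bl_dec * baseline + (1 - bl_dec) * reward

baseline = 0.0
for reward in [0.50, 0.55, 0.60]:  # hypothetical validation accuracies
    baseline = update_baseline(baseline, reward)
    print("baseline=%.4f  advantage=%.4f" % (baseline, reward - baseline))
```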

### 3. Child Network

(1) Schematic of Child Network


![Picture 8](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Schematic_child_network.png)

(2) _enas_layers

```python
def _enas_layers(self, layer_id, prev_layers, arc, out_filters):
    '''
    prev_layers: the previous two layers, each of shape [None, H, W, C].
    arc: e.g. "0 1 0 1 0 3 0 0 2 2 0 2 1 0 0 1 1 3 0 1 1 1 0 1 0 1 2 1 0 0 0 0 0 0 1 3 1 1 0 1"

    Each node picks two inputs and applies one of five candidate operations to each:
    out = [self._enas_conv(x, curr_cell, prev_cell, 3, out_filters),
           self._enas_conv(x, curr_cell, prev_cell, 5, out_filters),
           avg_pool,
           max_pool,
           x]
    '''
    return output  # computed from arc; np.shape(output) = [None, H, W, out_filters]
    # If child_fixed_arc is not None, np.shape(output) = [None, H, W, n*out_filters],
    # where n is the number of unused nodes in the convolution or reduction cell.
```

(3) factorized_reduction

```python
def factorized_reduction(self, x, out_filters, stride=2, is_training=True):
    '''
    x: the last previous layer's output.
    out_filters: 2 * (previous layer's channels)
    '''
    stride_spec = self._get_strides(stride)  # [1, 2, 2, 1]

    # Skip path 1: strided 1x1 average pooling, then a 1x1 conv to
    # half of out_filters.
    path1 = tf.nn.avg_pool(x, [1, 1, 1, 1], stride_spec, "VALID",
                           data_format=self.data_format)
    with tf.variable_scope("path1_conv"):
        inp_c = self._get_C(path1)
        w = create_weight("w", [1, 1, inp_c, out_filters // 2])
        path1 = tf.nn.conv2d(path1, w, [1, 1, 1, 1], "VALID",
                             data_format=self.data_format)

    # Skip path 2: first pad with 0s on the right and bottom, then shift
    # the input by one pixel so the strided pooling samples the
    # complementary positions.
    if self.data_format == "NHWC":
        pad_arr = [[0, 0], [0, 1], [0, 1], [0, 0]]
        path2 = tf.pad(x, pad_arr)[:, 1:, 1:, :]
        concat_axis = 3
    else:
        pad_arr = [[0, 0], [0, 0], [0, 1], [0, 1]]
        path2 = tf.pad(x, pad_arr)[:, :, 1:, 1:]
        concat_axis = 1

    path2 = tf.nn.avg_pool(path2, [1, 1, 1, 1], stride_spec, "VALID",
                           data_format=self.data_format)
    with tf.variable_scope("path2_conv"):
        inp_c = self._get_C(path2)
        w = create_weight("w", [1, 1, inp_c, out_filters // 2])
        path2 = tf.nn.conv2d(path2, w, [1, 1, 1, 1], "VALID",
                             data_format=self.data_format)

    # Concatenate the two paths along the channel axis and apply BN.
    final_path = tf.concat(values=[path1, path2], axis=concat_axis)
    final_path = batch_norm(final_path, is_training,
                            data_format=self.data_format)

    return final_path
```
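Why two paths? The one-pixel shift in path 2 lets the stride-2 pooling sample exactly the positions that path 1 skips, so the reduction discards no input pixels. A toy illustration in plain Python:

```python
# Toy illustration of the two-path trick in factorized_reduction: a stride-2
# sampler over the original input plus the same sampler over the one-pixel-
# shifted input together cover every position.
row = list(range(8))                 # pixel positions 0..7
path1 = row[::2]                     # stride 2 on the original: [0, 2, 4, 6]
path2 = row[1:][::2]                 # shift by one, then stride 2: [1, 3, 5, 7]
print(sorted(path1 + path2) == row)  # True: all positions covered
```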

(4) _maybe_calibrate_size

```python
def _maybe_calibrate_size(self, layers, out_filters, is_training):
    """Makes sure layers[0] and layers[1] have the same shapes."""
    hw = [self._get_HW(layer) for layer in layers]
    c = [self._get_C(layer) for layer in layers]

    with tf.variable_scope("calibrate"):
        x = layers[0]
        if hw[0] != hw[1]:
            # layers[0] is one reduction behind: halve its spatial size.
            assert hw[0] == 2 * hw[1]
            with tf.variable_scope("pool_x"):
                x = tf.nn.relu(x)
                x = self._factorized_reduction(x, out_filters, 2, is_training)
        elif c[0] != out_filters:
            # Only the channel count differs: fix it with a 1x1 conv.
            with tf.variable_scope("pool_x"):
                w = create_weight("w", [1, 1, c[0], out_filters])
                x = tf.nn.relu(x)
                x = tf.nn.conv2d(x, w, [1, 1, 1, 1], "SAME",
                                 data_format=self.data_format)
                x = batch_norm(x, is_training, data_format=self.data_format)

        y = layers[1]
        if c[1] != out_filters:
            with tf.variable_scope("pool_y"):
                w = create_weight("w", [1, 1, c[1], out_filters])
                y = tf.nn.relu(y)
                y = tf.nn.conv2d(y, w, [1, 1, 1, 1], "SAME",
                                 data_format=self.data_format)
                y = batch_norm(y, is_training, data_format=self.data_format)
    return [x, y]
```
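A toy trace of the branching above, with hypothetical shapes (hw = spatial size, c = channels):

```python
# Toy trace of _maybe_calibrate_size's branching, given the previous two
# layers' spatial sizes (hw) and channel counts (c). Hypothetical numbers.
def calibrate_plan(hw, c, out_filters):
    if hw[0] != hw[1]:
        plan_x = "factorized_reduction"   # halve layers[0]'s spatial size
    elif c[0] != out_filters:
        plan_x = "1x1 conv"               # fix the channel count only
    else:
        plan_x = "identity"
    plan_y = "1x1 conv" if c[1] != out_filters else "identity"
    return plan_x, plan_y

print(calibrate_plan(hw=[32, 16], c=[16, 32], out_filters=32))
# ('factorized_reduction', 'identity')
```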

(5) Others

You can see more details of the child network in

### 4. Summary of learning mechanism

```
1. Train the child network for 1 epoch (Momentum optimizer).
※ 1 epoch = (total data size / batch size) parameter updates.

2. Train the controller 'FLAGS.controller_train_steps x FLAGS.controller_num_aggregate' times (Adam optimizer).

3. Repeat "1" and "2" as many times as we want (160 epochs).

4. Choose the child network architecture with the highest validation accuracy.
```

```
1. Train the child network selected above for as long as we want (Momentum optimizer, 660 epochs).
```
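A quick sanity check of the update counts above, with hypothetical MNIST-style numbers and assumed controller settings:

```python
# Hypothetical numbers: 55,000 training images, batch size 100, and assumed
# controller flag values. Only the arithmetic is the point.
total_data, batch_size = 55000, 100
updates_per_child_epoch = total_data // batch_size
print(updates_per_child_epoch)  # 550 parameter updates per child epoch

controller_train_steps, controller_num_aggregate = 50, 20  # assumed values
print(controller_train_steps * controller_num_aggregate)   # 1000 controller updates per round
```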

## Augmentation

### 1. Code

```python
def aug(image, idx):
    # Map idx to an augmentation; lambdas keep the operations lazy, so only
    # the selected augmentation is computed.
    augmentation_dic = {0: lambda img: enlarge(img, 1.2),
                        1: rotation,
                        2: random_bright_contrast,
                        3: gaussian_noise,
                        4: Flip}
    image = augmentation_dic[idx](image)
    return image
```

The functions enlarge, rotation, random_bright_contrast, gaussian_noise, and Flip are written using cv2.
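For illustration, here are hedged sketches of two of these augmentations; the repo's actual parameters and implementations may differ:

```python
# Hedged sketches of two augmentations using OpenCV. Assumes images are
# HxWxC numpy arrays; the repo's exact implementations may differ.
import cv2
import numpy as np

def Flip(image):
    # Horizontal flip; flipCode=1 flips around the vertical axis.
    return cv2.flip(image, 1)

def rotation(image, max_deg=15):
    # Rotate by a random angle about the image center (max_deg is a guess).
    h, w = image.shape[:2]
    angle = np.random.uniform(-max_deg, max_deg)
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, 1.0)
    return cv2.warpAffine(image, M, (w, h))
```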

In the case of the MNIST data, I do not apply Flip, since a mirrored digit is no longer a valid digit! You can check more details in

### 2. Images


#### MNIST
![Picture 9](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/MNIST_AUG.png)

#### CIFAR10
![Picture 9](https://github.com/MINGUKKANG/ENAS-Tensorflow/blob/master/images/Cifar10_AUG.png)

#### Welding Defects

- Welding OK
- Welding NG


## References
**Paper: https://arxiv.org/abs/1802.03268**

**Authors' implementation: https://github.com/melodyguan/enas**

**Data Pipeline: https://github.com/MINGUKKANG/MNIST-Tensorflow-Code**

## License
All rights related to this code are reserved to the authors of ENAS

(Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean).