TY - JOUR
T1 - RATs-NAS
T2 - Redirection of Adjacent Trails on Graph Convolutional Networks for Predictor-Based Neural Architecture Search
AU - Zhang, Yu Ming
AU - Hsieh, Jun Wei
AU - Lee, Chun Chieh
AU - Fan, Kuo Chin
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2024
Y1 - 2024
N2 - Manually designed convolutional neural network (CNN) architectures such as visual geometry group network (VGG), ResNet, DenseNet, and MobileNet have achieved high performance across various tasks, but designing them is time-consuming and costly. Neural architecture search (NAS) automates the discovery of effective CNN architectures, reducing the need for experts. However, evaluating candidate architectures requires significant graphics processing unit (GPU) resources, which motivates predictor-based NAS; graph convolutional networks (GCNs) are a popular choice for constructing such predictors. We find that, although a GCN can mimic the propagation of features in real architectures, the binary nature of the adjacency matrix limits its effectiveness. To address this, we propose redirection of adjacent trails (RATs), which adaptively learns trail weights within the adjacency matrix. Our RATs-GCN outperforms other predictors by dynamically adjusting trail weights after each graph convolution layer. Additionally, the proposed divide search sampling (DSS) strategy, based on the observation in cell-based NAS that architectures with similar floating-point operations (FLOPs) perform similarly, enhances search efficiency. Our RATs-NAS, which combines RATs-GCN and DSS, shows significant improvements over other predictor-based NAS methods on NASBench-101, NASBench-201, and NASBench-301.
AB - Manually designed convolutional neural network (CNN) architectures such as visual geometry group network (VGG), ResNet, DenseNet, and MobileNet have achieved high performance across various tasks, but designing them is time-consuming and costly. Neural architecture search (NAS) automates the discovery of effective CNN architectures, reducing the need for experts. However, evaluating candidate architectures requires significant graphics processing unit (GPU) resources, which motivates predictor-based NAS; graph convolutional networks (GCNs) are a popular choice for constructing such predictors. We find that, although a GCN can mimic the propagation of features in real architectures, the binary nature of the adjacency matrix limits its effectiveness. To address this, we propose redirection of adjacent trails (RATs), which adaptively learns trail weights within the adjacency matrix. Our RATs-GCN outperforms other predictors by dynamically adjusting trail weights after each graph convolution layer. Additionally, the proposed divide search sampling (DSS) strategy, based on the observation in cell-based NAS that architectures with similar floating-point operations (FLOPs) perform similarly, enhances search efficiency. Our RATs-NAS, which combines RATs-GCN and DSS, shows significant improvements over other predictor-based NAS methods on NASBench-101, NASBench-201, and NASBench-301.
KW - Cell-based NAS
KW - neural architecture search (NAS)
KW - predictor-based NAS
UR - http://www.scopus.com/inward/record.url?scp=85205442289&partnerID=8YFLogxK
U2 - 10.1109/TAI.2024.3465433
DO - 10.1109/TAI.2024.3465433
M3 - Journal article
AN - SCOPUS:85205442289
SN - 2691-4581
VL - 5
SP - 6672
EP - 6682
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
IS - 12
ER -