This directory contains code necessary to run the GraphNAS algorithm.

License: Apache License 2.0

Python 100.00%

graph-neural-networks neural-architecture-search

graphnas's Introduction

GraphNAS

Overview

Graph Neural Architecture Search (GraphNAS for short) enables automatic design of the best graph neural architecture based on reinforcement learning. This directory contains code necessary to run GraphNAS. Specifically, GraphNAS first uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and then trains the recurrent network with a policy gradient algorithm to maximize the expected accuracy of the generated architectures on a validation data set. An illustration of GraphNAS is shown as follows:

A recurrent network (Controller RNN) generates descriptions of graph neural architectures (Child model GNNs). Once the controller generates an architecture, GraphNAS trains the architecture m on a given graph and tests it on a validation set. The validation result is used as the reward for the recurrent network.
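
The reward loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in this repository: a table of softmax logits stands in for the controller RNN, a toy reward function stands in for the validation accuracy of a trained child model, and the exponential-moving-average baseline is an assumed variance-reduction choice. The names `reinforce_search`, `toy_reward` and the operator lists are hypothetical.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_search(
    choices: Sequence[Sequence[str]],
    reward_fn: Callable[[List[str]], float],
    steps: int = 200,
    lr: float = 0.5,
    baseline_decay: float = 0.95,
    seed: int = 0,
) -> List[str]:
    """REINFORCE over independent categorical slots (a stand-in for the RNN)."""
    rng = random.Random(seed)
    logits = [[0.0] * len(slot) for slot in choices]
    baseline: Optional[float] = None
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        # Sample one operator index per slot to form an architecture string.
        picks = [rng.choices(range(len(p)), weights=p)[0] for p in probs]
        arch = [choices[i][k] for i, k in enumerate(picks)]
        reward = reward_fn(arch)  # stands in for validation accuracy
        # Assumed exponential-moving-average baseline.
        if baseline is None:
            baseline = reward
        else:
            baseline = baseline_decay * baseline + (1 - baseline_decay) * reward
        advantage = reward - baseline
        # Gradient of log softmax w.r.t. logit j is (1[j == k] - p_j).
        for i, k in enumerate(picks):
            for j in range(len(logits[i])):
                indicator = 1.0 if j == k else 0.0
                logits[i][j] += lr * advantage * (indicator - probs[i][j])
    # Return the most likely operator in each slot.
    return [
        slot[max(range(len(row)), key=row.__getitem__)]
        for slot, row in zip(choices, logits)
    ]


def toy_reward(arch: List[str]) -> float:
    """Hypothetical reward: fraction of slots matching an arbitrary target."""
    target = ["gat", "add", "relu"]
    return sum(a == t for a, t in zip(arch, target)) / len(target)


if __name__ == "__main__":
    slots = [["gcn", "gat"], ["add", "concat"], ["relu", "tanh"]]
    print(reinforce_search(slots, toy_reward))
```

In an actual search, each call to the reward function would train a child GNN and evaluate it on the validation set, which is why the number of controller steps dominates the cost.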

To improve the search efficiency of GraphNAS, we restrict the search space from an entire architecture to a concatenation of the best search results built on each single architecture layer. An example of GraphNAS constructing a single GNN layer (the right-hand side) is shown as follows:

In the above example, the layer has two input states, two intermediate states, and an output state. The controller on the left-hand side samples one state from $\{O_1, O_2, O_3\}$ to use as the input of an intermediate state, and then samples "GAT" to process it. The output state collects information from the intermediate states, and the controller assigns it a readout operator "add" and an activation operator "relu". As a result, this layer can be described as a list of operators: [0, gcn, 1, gat, add, relu].
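
To make the list format concrete, here is a minimal sketch that splits such a description into (input index, operator) pairs plus the two trailing operators. The function name `parse_layer_description` is hypothetical and this is not the repository's parser. The examples in this README do not list the two trailing operators in a consistent order, so the sketch returns them unlabeled rather than guessing which one is the readout and which the activation.

```python
from typing import List, Tuple, Union

Token = Union[int, str]


def parse_layer_description(
    desc: List[Token],
) -> Tuple[List[Tuple[int, str]], Tuple[str, str]]:
    """Split a layer description into (input index, operator) pairs and a tail.

    Assumed layout: any number of (index, operator) pairs followed by
    exactly two trailing operator names.
    """
    if len(desc) < 4 or len(desc) % 2 != 0:
        raise ValueError("expected (index, operator) pairs plus two trailing operators")
    body, tail = desc[:-2], desc[-2:]
    pairs: List[Tuple[int, str]] = []
    for pos in range(0, len(body), 2):
        index, operator = body[pos], body[pos + 1]
        if not isinstance(index, int) or not isinstance(operator, str):
            raise ValueError(f"malformed pair at position {pos}: {index!r}, {operator!r}")
        pairs.append((index, operator))
    if not all(isinstance(op, str) for op in tail):
        raise ValueError("trailing operators must be strings")
    return pairs, (tail[0], tail[1])


if __name__ == "__main__":
    pairs, tail = parse_layer_description([0, "gcn", 1, "gat", "add", "relu"])
    print(pairs, tail)
```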

Requirements

Recent versions of PyTorch, numpy, scipy, sklearn, dgl, torch_geometric and networkx are required. Ensure that PyTorch 1.1.0 and CUDA 9.0 are installed. Then run:

pip install torch==1.1.0 -f https://download.pytorch.org/whl/cu90/torch_stable.html
pip install -r requirements.txt
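
After installing, a quick way to confirm that the packages listed above are importable is sketched below. It only checks that each module can be found, not that the versions match. The helper name `missing_modules` is hypothetical, and the module list simply mirrors the names given in this section.

```python
import importlib.util
from typing import Iterable, List

# Import names taken from the requirements listed above.
REQUIRED_MODULES = [
    "torch",
    "numpy",
    "scipy",
    "sklearn",
    "dgl",
    "torch_geometric",
    "networkx",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```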

To run in Docker instead:

docker build -t graphnas -f DockerFile . 
docker run -it -v $(pwd):/GraphNAS graphnas python -m eval_scripts.semi.eval_designed_gnn

Running the code

Architecture evaluation

To evaluate our best architecture designed on semi-supervised experiments by training from scratch, run

python -m eval_scripts.semi.eval_designed_gnn

To evaluate our best architecture designed on supervised experiments by training from scratch, run

python -m eval_scripts.sup.eval_designed_gnn

Results

Semi-supervised node classification w.r.t. accuracy

Model	Cora	Citeseer	Pubmed
GCN	81.5+/-0.4	70.9+/-0.5	79.0+/-0.4
SGC	81.0+/-0.0	71.9+/-0.1	78.9+/-0.0
GAT	83.0+/-0.7	72.5+/-0.7	79.0+/-0.3
LGCN	83.3+/-0.5	73.0+/-0.6	79.5+/-0.2
DGCN	82.0+/-0.2	72.2+/-0.3	78.6+/-0.1
ARMA	82.8+/-0.6	72.3+/-1.1	78.8+/-0.3
APPNP	83.3+/-0.6	71.8+/-0.4	80.2+/-0.2
simple-NAS	81.4+/-0.6	71.7+/-0.6	79.5+/-0.5
GraphNAS	84.3+/-0.4	73.7+/-0.2	80.6+/-0.2

Supervised node classification w.r.t. accuracy

Model	Cora	Citeseer	Pubmed
GCN	90.2+/-0.0	80.0+/-0.3	87.8+/-0.2
SGC	88.8+/-0.0	80.6+/-0.0	86.5+/-0.1
GAT	89.5+/-0.3	78.6+/-0.3	86.5+/-0.6
LGCN	88.7+/-0.5	79.2+/-0.4	OOM
DGCN	88.4+/-0.2	78.0+/-0.2	88.0+/-0.9
ARMA	89.8+/-0.1	79.9+/-0.6	88.1+/-0.2
APPNP	90.4+/-0.2	79.2+/-0.4	87.4+/-0.3
random-NAS	90.0+/-0.3	81.1+/-0.3	90.7+/-0.6
simple-NAS	90.1+/-0.3	79.6+/-0.5	88.5+/-0.2
GraphNAS	90.6+/-0.3	81.3+/-0.4	91.3+/-0.3

The architectures designed in supervised learning are shown as follows:

The architecture G-Cora designed by GraphNAS on Cora is [0, gat6, 0, gcn, 0, gcn, 2, arma, tanh, concat], the architecture G-Citeseer designed by GraphNAS on Citeseer is [0, identity, 0, gat6, linear, concat], the architecture G-Pubmed designed by GraphNAS on Pubmed is [1, gat8, 0, arma, tanh, concat].

Searching for new architectures

To design an entire graph neural architecture based on the search space described in Section 3.2, please run:

python -m graphnas.main --dataset Citeseer

To design an entire graph neural architecture based on the search space described in Section 3.4, please run:

python -m graphnas.main --dataset Citeseer --supervised True --search_mode micro

Be aware that different runs may end up in different local minima.
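
Because results vary between runs, it can help to launch the search under several random seeds and compare the outcomes. The sketch below only builds and runs the command lines. The `--random_seed` flag is an assumption inferred from the `random_seed` field that appears in the logged arguments; check the argument parser in the repository before relying on it. The helper names are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence


def build_search_commands(dataset: str, seeds: Sequence[int]) -> List[List[str]]:
    """Build one search command per seed (the --random_seed flag is assumed)."""
    return [
        [
            sys.executable, "-m", "graphnas.main",
            "--dataset", dataset,
            "--random_seed", str(seed),
        ]
        for seed in seeds
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run the commands one after another, stopping on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; call run_all(...) to execute.
    for command in build_search_commands("Citeseer", [1, 2, 3]):
        print(" ".join(command))
```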

Acknowledgements

This repository is adapted from code in DGL and PyG.

graphnas's Issues

What's the difference between '/model/geo/geo_gnn.py' and '/model/gnn.py'?

Dear author:
I found that the implementation differs depending on the dataset name, such as "Cora" vs "cora". May I ask what the difference is between the implementation directly under the model folder and the one under model/geo/?

        if self.args.dataset in ["cora", "citeseer", "pubmed"]:
            self.shared = CitationGNNManager(self.args)
            self.controller = models.GNNNASController(self.args, cuda=self.args.cuda,
                                                      num_layers=self.args.layers_of_child_model)

        if self.args.dataset in ["Cora", "Citeseer", "Pubmed"]:
            self.shared = GeoCitationManagerManager(self.args)
            self.controller = models.GNNNASController(self.args, cuda=self.args.cuda,
                                                      num_layers=self.args.layers_of_child_model)
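
As an illustration of what the quoted branches imply, the sketch below mirrors their membership tests with plain strings in place of the manager classes. It is a simplified stand-in based only on the snippet above (other branches may exist elsewhere in the file), and the function name `select_manager_name` is hypothetical. The point is that the match is case-sensitive and that a name matching neither list selects nothing.

```python
from typing import Optional


def select_manager_name(dataset: str) -> Optional[str]:
    """Return the manager class name the quoted branches would pick, if any."""
    selected = None
    # Lower-case names go to the manager under the model folder.
    if dataset in ["cora", "citeseer", "pubmed"]:
        selected = "CitationGNNManager"
    # Capitalized names go to the manager quoted in the second branch.
    if dataset in ["Cora", "Citeseer", "Pubmed"]:
        selected = "GeoCitationManagerManager"
    return selected


if __name__ == "__main__":
    for name in ["cora", "Cora", "CORA"]:
        print(name, "->", select_manager_name(name))
```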

Examples of using GraphNAS?

Are there any examples of using GraphNAS? In addition, is the code of GraphNAS++ not public?

Running with Docker fails with "ModuleNotFoundError: No module named 'torch_scatter.scatter_cuda'"

@GraphNAS Hi, I tried to run GraphNAS in Docker following the README, but this error occurred every time. After searching for a while, it seems to be related to the installation of torch_scatter, as mentioned in the pytorch_geometric documentation.

Do you have any suggestions for fixing this error? Could you check the Docker image?
Thanks very much.

What does `simple` mean?

Dear author:
Thanks for kindly sharing. I wonder what the difference is between this simple version and the version in the paper?
Thank you.

The Docker file does not work

Hi,

When I tried to set up Docker for this repo, it did not work. Instead, I wrote a new Dockerfile that works, and I paste it here.

### Dockerfile with Ubuntu 18.04 and cuda 9.0
### Changes are indicated by CHANGED
### Everything else was copied together from the original Dockerfiles (as per comments)

### 1st part from https://gitlab.com/nvidia/cuda/blob/ubuntu18.04/10.0/base/Dockerfile

FROM ubuntu:18.04
# CHANGED
#LABEL maintainer "NVIDIA CORPORATION <[email protected]>"
LABEL maintainer="tobycheese https://github.com/tobycheese/"

# CHANGED: below, add the two repos from 17.04 and 16.04 so all packages are found
RUN apt-get update && apt-get install -y --no-install-recommends gnupg2 curl ca-certificates && \
    curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub | apt-key add - && \
    echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list && \
    echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list && \
    echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1704/x86_64 /" >> /etc/apt/sources.list.d/cuda.list && \
    echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /" >> /etc/apt/sources.list.d/nvidia-ml.list && \
    apt-get purge --autoremove -y curl && \
    rm -rf /var/lib/apt/lists/*

### end 1st part from https://gitlab.com/nvidia/cuda/blob/ubuntu18.04/10.0/base/Dockerfile

### 2nd part from https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/base/Dockerfile

ENV CUDA_VERSION 9.0.176

ENV CUDA_PKG_VERSION 9-0=$CUDA_VERSION-1
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-cudart-$CUDA_PKG_VERSION && \
    ln -s cuda-9.0 /usr/local/cuda && \
    rm -rf /var/lib/apt/lists/*

# CHANGED: commented out
# nvidia-docker 1.0
#LABEL com.nvidia.volumes.needed="nvidia_driver"
#LABEL com.nvidia.cuda.version="${CUDA_VERSION}"

RUN echo "/usr/local/nvidia/lib" >> /etc/ld.so.conf.d/nvidia.conf && \
    echo "/usr/local/nvidia/lib64" >> /etc/ld.so.conf.d/nvidia.conf

ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64

# nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
ENV NVIDIA_REQUIRE_CUDA "cuda>=9.0"

### end 2nd part from https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/base/Dockerfile

### all of https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/runtime/Dockerfile

ENV NCCL_VERSION 2.3.7

RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-libraries-$CUDA_PKG_VERSION \
        cuda-cublas-9-0=9.0.176.4-1 \
        libnccl2=$NCCL_VERSION-1+cuda9.0 && \
    apt-mark hold libnccl2 && \
    rm -rf /var/lib/apt/lists/*

### end all of from https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/runtime/Dockerfile

### all of https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/Dockerfile

RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-libraries-dev-$CUDA_PKG_VERSION \
        cuda-nvml-dev-$CUDA_PKG_VERSION \
        cuda-minimal-build-$CUDA_PKG_VERSION \
        cuda-command-line-tools-$CUDA_PKG_VERSION \
        cuda-core-9-0=9.0.176.3-1 \
        cuda-cublas-dev-9-0=9.0.176.4-1 \
        libnccl-dev=$NCCL_VERSION-1+cuda9.0 && \
    rm -rf /var/lib/apt/lists/*

ENV LIBRARY_PATH /usr/local/cuda/lib64/stubs

### end all of https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/Dockerfile

### all of https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/cudnn7/Dockerfile

ENV CUDNN_VERSION 7.4.1.5
LABEL com.nvidia.cudnn.version="${CUDNN_VERSION}"

RUN apt-get update && apt-get install -y --no-install-recommends \
            libcudnn7=$CUDNN_VERSION-1+cuda9.0 \
            libcudnn7-dev=$CUDNN_VERSION-1+cuda9.0 && \
    apt-mark hold libcudnn7 && \
    rm -rf /var/lib/apt/lists/*

### end all of https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/cudnn7/Dockerfile


RUN apt update
RUN apt install -y software-properties-common \
                    python3-pip \
                    gcc-5 \
                    g++-5 


RUN unlink /usr/bin/gcc 
RUN unlink /usr/bin/g++ 

RUN ln -s /usr/bin/python3 /usr/bin/python && \
    ln -s /usr/bin/pip3 /usr/bin/pip && \
    ln -sv /usr/bin/gcc-5 /usr/bin/gcc &&\
    ln -sv /usr/bin/g++-5 /usr/bin/g++ 

WORKDIR /GraphNAS
COPY ./requirements.txt /GraphNAS

RUN pip install torch==1.1.0 -f https://download.pytorch.org/whl/cu90/torch_stable.html
RUN pip install -r requirements.txt

Segmentation fault when running "python main.py"

The error can be reproduced. Any idea about this problem?

The environment and error information are as follows:

### Environment info:
CentOS 7
anaconda: 4.7.1
python 3.6.1
pytorch 1.2.0
Cuda compilation tools, release 9.2, V9.2.148
NVRM version: NVIDIA UNIX x86_64 Kernel Module 430.09 Thu Apr 18 03:09:38 CDT 2019
GCC version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)

python main.py
2019-09-12 20:16:47,401:INFO::NAS-like mode: retrain without share param
Namespace(batch_size=1, controller_grad_clip=0, controller_lr=0.00035, controller_max_step=5, controller_optim='adam', cuda=True, dataset='Cora', derive_num_sample=100, discount=1.0, ema_baseline_decay=0.95, entropy_coeff=0.0001, entropy_mode='reward', epochs=200, in_drop=0.6, in_feats=1433, layers_of_child_model=3, load_path='', lr=0.005, max_epoch=200, max_param=5000000.0, max_save_num=5, mode='train', multi_label=False, num_class=7, optim_file='opt_cora_test.pkl', param_file='cora_test.pkl', random_seed=123, residual=True, retrain_epochs=200, save_epoch=5, search_mode='nas', share_param=False, shared_initial_step=0, shared_rnn_max_length=35, softmax_temperature=5.0, tanh_c=2.5, weight_decay=0.0005)
*********************************** training controller ***********************************
/home/xxxxxx/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:1339: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
train action: ['gat', 'max', 'tanh', 1, 16, 'cos', 'sum', 'tanh', 2, 4, 'gat', 'mean', 'relu6', 2, 7]
Segmentation fault

No requests module in the Docker file: better to update the DockerFile

@GraphNAS Nice work and thanks for releasing this code.

I was trying to reproduce your work recently, and it is helpful that you released a Docker version.
Currently, the requests module is missing from the DockerFile, and it would be great if you could add the package "requests" to the DockerFile.

Add link to ArXiv paper

Please add a link to the corresponding ArXiv paper in the readme. The link is https://arxiv.org/abs/1904.09981

Parameter Sharing?

The code seems to allow for parameter sharing, but these parts are not used, right? I am wondering because similar experiments with GraphNAS are in the Auto-GNN paper...

PPI training: TypeError: index_select(): argument 'index' (position 3) must be Tensor, not NoneType

2019-06-24 03:22:10,338:INFO::NAS-like mode: retrain without share param
Namespace(batch_size=64, controller_grad_clip=0, controller_lr=0.00035, controller_max_step=5, controller_optim='adam', cuda=True, dataset='PPI', derive_num_sample=100, discount=1.0, ema_baseline_decay=0.95, entropy_coeff=0.0001, entropy_mode='reward', epochs=50, in_drop=0, in_feats=50, layers_of_child_model=3, load_path='', lr=0.005, max_epoch=200, max_param=5000000.0, max_save_num=5, mode='train', multi_label=False, num_class=121, optim_file='opt_cora_test.pkl', param_file='cora_test.pkl', random_seed=123, residual=True, retrain_epochs=200, save_epoch=5, search_mode='nas', share_param=False, shared_initial_step=0, shared_rnn_max_length=35, softmax_temperature=5.0, tanh_c=2.5, weight_decay=0)
*********************************** training controller ***********************************
/home/XX/.local/lib/python3.6/site-packages/torch/nn/functional.py:1374: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
train action: ['gat', 'sum', 'relu', 2, 8, 'linear', 'mlp', 'tanh', 2, 4, 'gat_sym', 'sum', 'linear', 6, 121]
Traceback (most recent call last):
  File "main.py", line 111, in <module>
    main(args)
  File "main.py", line 100, in main
    trnr.train()
  File "/home/XX/ws2/GraphNAS-simple/trainer.py", line 105, in train
    self.train_controller()
  File "/home/XX/ws2/GraphNAS-simple/trainer.py", line 191, in train_controller
    results = self.get_reward(structure_list, np_entropies, hidden)
  File "/home/XX/ws2/GraphNAS-simple/trainer.py", line 151, in get_reward
    reward = self.shared.test_with_param(actions, with_retrain=self.with_retrain)
  File "/home/XX/ws2/GraphNAS-simple/models/geo/geo_gnn_ppi_manager.py", line 145, in test_with_param
    return self.train(actions, format)
  File "/home/XX/ws2/GraphNAS-simple/models/geo/geo_gnn_ppi_manager.py", line 95, in train
    total_loss = self.run_model(model, optimizer, loss_op)
  File "/home/XX/ws2/GraphNAS-simple/models/geo/geo_gnn_ppi_manager.py", line 119, in run_model
    loss = loss_op(model(data.x, data.edge_index), data.y)
  File "/home/XX/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/XX/ws2/GraphNAS-simple/models/geo/geo_gnn.py", line 70, in forward
    output = act(layer(output, edge_index_all) + fc(output))
  File "/home/XX/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/XX/ws2/GraphNAS-simple/models/geo/geo_layer.py", line 109, in forward
    return self.propagate(edge_index, x=x, num_nodes=x.size(0))
  File "/home/XX/ws2/GraphNAS-simple/models/geo/message_passing.py", line 81, in propagate
    tmp = torch.index_select(tmp, 0, edge_index[idx])
TypeError: index_select(): argument 'index' (position 3) must be Tensor, not NoneType