Hi Kuan-Yu,
Sorry to bother you.
After searching on the CIFAR-10 dataset, I get a genotype that is similar to the one reported in the original paper.
However, when I derive the final architecture and calculate its FLOPs and latency, the results look strange.
For example, I run:
import torch
from ptflops import get_model_complexity_info

from model import NetworkCIFAR as Network
import genotypes

# Look up the searched genotype by name
genotype = getattr(genotypes, "PCDARTS")
with torch.cuda.device(0):
    # CIFAR network class, built with 36 initial channels, 1000 classes, 14 layers, auxiliary head
    model = Network(36, 1000, 14, True, genotype)
    model.drop_path_prob = 0.3
    model.eval()
    # Count MACs and parameters for a 3x224x224 input
    flops, params = get_model_complexity_info(model, (3, 224, 224), as_strings=True, print_per_layer_stat=True)
    print("{:<30} {:<8}".format("Computational complexity: ", flops))
    print("{:<30} {:<8}".format("Number of parameters: ", params))
The reported model complexity and number of parameters for the searched genotype (14 layers, ImageNet setting, 224x224 input) are as follows:
Computational complexity: 20.11 GMac
Number of parameters: 4.3 M
But when I run resnet50 for comparison:
import torch
from ptflops import get_model_complexity_info
from torchvision.models import resnet50

with torch.cuda.device(0):
    model = resnet50(pretrained=False)
    # Same measurement as above, for a 3x224x224 input
    flops, params = get_model_complexity_info(model, (3, 224, 224), as_strings=True,
                                              print_per_layer_stat=True)
    print('{:<30} {:<8}'.format('Computational complexity: ', flops))
    print('{:<30} {:<8}'.format('Number of parameters: ', params))
The reported model complexity and the number of parameters for resnet50 are as follows:
Computational complexity: 4.12 GMac
Number of parameters: 25.56 M
The FLOPs reported in the original paper for the ImageNet setting are only 597M, so something seems to be wrong with my derived final architecture (although I am sure the searched genotype itself is correct). My goal is to deploy the searched model on hardware devices, but the latency of the searched genotype (with 14 layers) is nearly ten times that of resnet50, which is unacceptable.
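As a rough sanity check on how strongly the count depends on feature-map resolution, here is a minimal sketch that counts multiply-accumulates for a single dense convolution. `conv_macs` is a hypothetical helper, not part of ptflops or the repository, and the channel count and the two resolutions are illustrative assumptions rather than values taken from the actual network.

```python
def conv_macs(c_in, c_out, kernel, height, width, stride=1, padding=0):
    """Multiply-accumulates for one dense 2D convolution (bias ignored)."""
    h_out = (height + 2 * padding - kernel) // stride + 1
    w_out = (width + 2 * padding - kernel) // stride + 1
    return c_in * c_out * kernel * kernel * h_out * w_out


# Hypothetical 3x3 convolution with 108 input and 108 output channels,
# evaluated on a full-resolution map and on a map downsampled by 8.
full_res = conv_macs(108, 108, 3, 224, 224, padding=1)
reduced = conv_macs(108, 108, 3, 28, 28, padding=1)

print("MACs at 224x224:", full_res)
print("MACs at 28x28:  ", reduced)
print("ratio:          ", full_res / reduced)
```

Since the cost of a convolution scales with the output height times width, a network whose cells run at the full input resolution would be far more expensive than one whose stem downsamples first. This might be related to the gap between my measurement and the number in the paper, but I have not verified that.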
At your convenience, could you clarify how to derive the final architecture from a searched genotype? In the future, I plan to deploy the searched model on hardware devices and to add hardware-aware constraints to optimize the overall design.
Although this is not in TensorFlow, the idea is similar. I just want to figure out how to correctly export the final searched models (here I mean the stack of several genotypes) and then apply those models to other tasks.
Thanks for your time and have a nice day!