
TANet's Introduction


[Readers in China: please see the more detailed Chinese documentation.] This repo contains the official implementation and the new IAA dataset TAD66K of our IJCAI 2022 paper. Our new work at ICCV 2023: Link

Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks

Shuai He, Yongchang Zhang, Rui Xie, Dongxiang Jiang, Anlong Ming

Beijing University of Posts and Telecommunications


TAD66K  

Introduction

  • We built a large-scale dataset called the Theme and Aesthetics Dataset with 66K images (TAD66K), which is specifically designed for IAA. Specifically, (1) it is a theme-oriented dataset containing 66K images covering 47 popular themes, and every image was carefully selected by hand according to its theme; (2) in addition to common aesthetic criteria, we provide 47 criteria, one for each of the 47 themes. Images of each theme are annotated independently, and each image contains at least 1,200 effective annotations (the richest annotations to date). These high-quality annotations can help provide deeper insight into the performance of models.

[Example images from the TAD66K dataset]

Download Dataset

  • Download from here: google. The archive contains images with the longest side scaled to 800 pixels and labels categorized by theme.
  • Or here: baidu, extraction code: 8888.

TANet  

Introduction

We propose a baseline model, called the Theme and Aesthetics Network (TANet), which can maintain a constant perception of aesthetics to effectively deal with the problem of attention dispersion. Moreover, TANet can adaptively learn the rules for predicting aesthetics according to the recognized theme. Comparing TANet with 17 methods on three representative datasets (AVA, FLICKR-AES, and the proposed TAD66K), TANet achieves state-of-the-art performance on all three.

[Figure: TANet performance comparison]

Environment Installation

  • pandas==0.22.0
  • nni==1.8
  • requests==2.18.4
  • torchvision==0.8.2+cu101
  • numpy==1.13.3
  • scipy==0.19.1
  • tqdm==4.43.0
  • torch==1.7.1+cu101
  • scikit_learn==1.0.2
  • tensorboardX==2.5
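
For convenience, the pinned versions above can be installed in one go; a sketch of the commands (note that the +cu101 builds of torch and torchvision live on the PyTorch wheel index, hence the extra -f flag, and the CUDA suffix should be adjusted to your setup):

pip install pandas==0.22.0 nni==1.8 requests==2.18.4 numpy==1.13.3 scipy==0.19.1 tqdm==4.43.0 scikit_learn==1.0.2 tensorboardX==2.5
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 -f https://download.pytorch.org/whl/torch_stable.html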

How to Run the Code

  • We used the hyperparameter tuning tool nni; you may want to learn how to use this tool first (it only takes a few minutes of your time), because both training and testing are run through it.
  • To train or test, run: nnictl create --config config.yml -p 8999
  • The Web UI URLs are: http://127.0.0.1:8999 or http://172.17.0.3:8999
  • Note: nni is not strictly necessary. If you don't want to use this tool, just make small modifications to our code, such as changing param_group['lr'] to param_group.lr, etc. (see the sketch after this list).
  • PS: The training code for the FLICKR-AES dataset may not be made public, because we are currently cooperating with a company; the relevant model has been embedded into their system, and there are some confidentiality requirements.
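
If you prefer to run without nni entirely, here is a minimal sketch of the idea (not the authors' exact code): keep the hyperparameters as a plain dict instead of asking NNI for them. The nni.get_next_parameter() call and the key names below are assumptions about how train_nni.py is organized.

# Hypothetical sketch: run training without NNI by falling back to fixed hyperparameters.
try:
    import nni
    params = nni.get_next_parameter()   # populated when launched via nnictl
except Exception:
    params = {}

# Fixed fallback values when NNI is not driving the run (key names are assumptions).
params.setdefault('init_lr', 3e-7)
params.setdefault('batch_size', 40)

# ...then pass `params` into the existing training loop in place of the NNI-provided dict.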

If you find our work useful, please cite our paper:

@article{herethinking,
  title={Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks},
  author={He, Shuai and Zhang, Yongchang and Xie, Rui and Jiang, Dongxiang and Ming, Anlong},
  journal={IJCAI},
  year={2022},
}

Try!

TANet.real-time.inference.video.1.mp4
TANet.real-time.inference.video.2.mp4
TANet.real-time.inference.video.3.mp4

TANet's People

Contributors

mrobotit, woshidandan, zstbupt, zyber404


TANet's Issues

nni.report_intermediate_result only reports once per trial during NNI training

I'd like to ask whether anyone has run into the situation described in the title when training on the TAD66K dataset.

My situation is roughly as follows:

  1. In the NNI WebUI, each trial only shows the intermediate result for epoch=0, but in the log file I can see the train and validate progress bars for multiple epochs.

  2. If I train without nni, the results printed in the console are normal.

In addition, I'd like to ask the authors whether the model-saving code needs to be written by ourselves, and roughly what the best metrics obtained by the authors after training were.
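
Not an official answer, but the NNI WebUI only plots what the trial explicitly reports, so the usual pattern is to call nni.report_intermediate_result once per epoch and nni.report_final_result once at the end, and to save checkpoints yourself. A minimal sketch, where train_one_epoch, validate_one_epoch, model and the loaders are placeholders:

import nni
import torch

num_epochs = 200                                  # as in the paper's setup
best_srcc = -1.0
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)          # placeholder training step
    srcc = validate_one_epoch(model, val_loader)  # placeholder validation metric

    nni.report_intermediate_result(srcc)          # one point per epoch in the WebUI

    if srcc > best_srcc:                          # model saving is up to you
        best_srcc = srcc
        torch.save(model.state_dict(), f'best_epoch{epoch}.pth')

nni.report_final_result(best_srcc)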

About training with NNI

Hello, I tried to train with nni. After entering nnictl create --config config.yml -p 8999, I got the error: ERROR: "config.yml" is not a valid file. Looking forward to your reply.
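
Not an official answer, but this error usually just means nnictl cannot find the file at that relative path. Running the command from the directory that actually contains config.yml (for example code/TAD66K in this repo, as in the issue below) or passing an absolute path typically resolves it:

cd code/TAD66K
nnictl create --config config.yml -p 8999
# or: nnictl create --config /absolute/path/to/config.yml -p 8999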

Which hyperparameters should be set individually?

Hi, I have successfully run the code, with a setup exactly the same as yours.

(base) PS C:\Users\96502> cd E:\Project\HeterogeneousComputing\TANet-main\TANet-main\code\TAD66K
(base) PS E:\Project\HeterogeneousComputing\TANet-main\TANet-main\code\TAD66K> conda activate TANet_py39
(TANet_py39) PS E:\Project\HeterogeneousComputing\TANet-main\TANet-main\code\TAD66K> nnictl create --config config.yml -p 8999
INFO:  expand searchSpacePath: search_space.json to E:\Project\HeterogeneousComputing\TANet-main\TANet-main\code\TAD66K\search_space.json
INFO:  expand codeDir: . to E:\Project\HeterogeneousComputing\TANet-main\TANet-main\code\TAD66K\.
INFO:  Starting restful server...
INFO:  Successfully started Restful server!
INFO:  Setting local config...
INFO:  Successfully set local config!
INFO:  Starting experiment...
INFO:  Successfully started experiment!

The experiment id is Jo44BgAs
The Web UI urls are: http://169.254.52.43:8999   http://169.254.179.254:8999   http://169.254.150.240:8999   http://10.32.94.167:8999   http://169.254.37.46:8999   http://169.254.73.179:8999   http://172.25.32.1:8999   http://127.0.0.1:8999
------------------------------------------------------------------------------------
You can use these commands to get more information about the experiment
------------------------------------------------------------------------------------
         commands                       description
1. nnictl experiment show        show the information of experiments
2. nnictl trial ls               list all of trial jobs
3. nnictl top                    monitor the status of running experiments
4. nnictl log stderr             show stderr log content
5. nnictl log stdout             show stdout log content
6. nnictl stop                   stop an experiment
7. nnictl trial kill             kill a trial job by id
8. nnictl --help                 get help information about nnictl
------------------------------------------------------------------------------------
Command reference document https://nni.readthedocs.io/en/latest/Tutorial/Nnictl.html
------------------------------------------------------------------------------------

But I don't know why the trials always fail in the web UI.

[Screenshot: trials shown as failed in the NNI web UI]

Have you ever encountered such a problem?
And here are the configurations in my case:

'config.yml'

authorName: default
experimentName: HyperNet_NNI_Come_on
trialConcurrency: 1
maxExecDuration: 1000h
maxTrialNum: 200
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: python train_nni.py
  codeDir: .
  gpuNum: 1
localConfig:
  useActiveGpu: true

The modifications in 'option.py'

def init():
    parser = argparse.ArgumentParser(description="PyTorch")
    parser.add_argument('--path_to_images', type=str, default='E:\\dataset\\TAD66K_jmg',
                        help='directory to images')
    parser.add_argument('--path_to_save_csv', type=str,default="./dataset/dataset/merge/",
                        help='directory to csv_folder')
    parser.add_argument('--experiment_dir_name', type=str, default='.',
                        help='directory to project')

Sincerely hope to receive your reply!

Python version

Hi, when setting up the experiment environment I tried Python 3.5, 3.6 and 3.7, and none of them could be configured successfully. May I ask which Python version was originally used?

About the model code

Hello, there are two places in the model code that I don't quite understand. Could you help take a look?

  1. The forward function in class TargetNet
    def forward(self, x, paras):

        q = self.fc1(x)
        # print(q.shape)
        q = self.bn1(q)
        q = self.relu1(q)
        q = self.drop1(q) 

        self.lin = nn.Sequential(TargetFC(paras['res_last_out_w'], paras['res_last_out_b']))
        q = self.lin(q)
        q = self.softmax(q)
        return q

Here, the shape of res_last_out_w is [batch_size, 100], res_last_out_b is [batch_size, 1], and the input tensor of self.lin has shape [batch_size, 100], so the output tensor of self.lin has shape [batch_size, batch_size], i.e., its shape depends on the batch size. If batch_size is 1, the output tensor of this function is fixed at shape [1, 1] and its value is always 1, which means the theme-network branch outputs a constant value of 1. Isn't that a problem? (See the small numeric illustration after this issue.)

  2. In the Attention function
def Attention(x):
    batch_size, in_channels, h, w = x.size()
    quary = x.view(batch_size, in_channels, -1)
    key = quary
    quary = quary.permute(0, 2, 1)

    sim_map = torch.matmul(quary, key)

    ql2 = torch.norm(quary, dim=2, keepdim=True)
    kl2 = torch.norm(key, dim=1, keepdim=True)
    sim_map = torch.div(sim_map, torch.matmul(ql2, kl2).clamp(min=1e-8))

    return sim_map

This implementation seems different from what the paper describes. What is done here looks like dividing the raw similarity map by the similarity map of the norms (i.e., computing a cosine-similarity map), rather than the conventional attention with V removed as described in the paper.
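
A tiny numeric illustration of the first point in the issue above: softmax over a single element is always 1, so when the batch size is 1 the [1, 1] output of self.lin collapses to a constant regardless of its value (the shapes here are taken from the issue, not re-verified against the repo):

import torch

q = torch.randn(1, 1)            # bs=1 -> lin output is [batch_size, batch_size] = [1, 1]
print(torch.softmax(q, dim=-1))  # tensor([[1.]]) no matter what value q contains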

Inference results when loading the pretrained weights

Directly loading the pretrained weights and running inference on AVA and TAD66K, the metrics don't seem to reach those reported for the model. May I ask what the reason is?
For example, loading the weights directly gives LCC 0.764 / SRCC 0.755 on AVA, and LCC 0.526 / SRCC 0.507 on TAD66K.

Inference results differ for different batch sizes

Problem 1 mentioned in #3 causes the output of the classification branch to depend on the batch size: when the batch size differs, the inference results also differ, and when bs=1 the classification branch output is fixed at 1.

How can the problem of inference results differing across batch sizes be solved?
On-device deployment usually uses bs=1, but then the classification branch output is fixed at 1, so it no longer plays any role, right?

Some questions about reproducibility

Dear author,

As mentioned in the supplementary materials, the hyperparameters were set as follows:

To train our TANet, we used Microsoft's neural network intelligence (NNI) tuning tool, where the learning rate search space from L1 to L6 was set as [0.000001, 0.0000001, 0.0000003], without any decay rate strategy. Specifically, we set the input size to 40 and the number of training epochs to 200. We used 224 × 224 crops from 256 × 256 fixed images as input.

However, achieving the desired training results in a single step with this particular combination of hyperparameters has proven to be challenging due to the absence of a decay rate strategy. I wonder if you performed multiple training steps, manually adjusting the learning rate to obtain the best results. For example, training for a certain number of epochs with one set of hyperparameters and then building upon that by training further, ensuring the total number of training epochs reaches 200, rather than directly training for 200 epochs using a specific set of hyperparameters. If this is the case, how can the ablation experiment be performed to ensure the accuracy of the experiment? Additionally, the provision of random seeds was not mentioned.

I would greatly appreciate your insights and guidance on this matter.

Warm regards,

Vanessa

Hyperparameters to replicate the paper's results

Hi, thank you for your great work!

I was wondering if you could share the hyperparameters you ended up finding and using, so that we can quickly reproduce the results of the paper. It is certainly possible to search for hyperparameters using nni, but I think it would be a time-consuming and rather pointless process.

Thanks again!

Results interpretation

Hello, thanks for the work and for providing the code.

I ran eval() on a batch of 8 images and here is the result:

([[0.0145, 0.0029, 0.0645, 0.2702, 0.4544, 0.1338, 0.0399, 0.0154, 0.0011,
0.0032],
[0.0290, 0.0171, 0.0896, 0.2024, 0.2972, 0.1970, 0.0950, 0.0444, 0.0119,
0.0164],
[0.0121, 0.0162, 0.0500, 0.1337, 0.2570, 0.2802, 0.1702, 0.0536, 0.0113,
0.0157],
[0.0291, 0.0355, 0.0795, 0.1375, 0.1992, 0.2212, 0.1488, 0.0780, 0.0376,
0.0337],
[0.0518, 0.0196, 0.1209, 0.2296, 0.2699, 0.1419, 0.0686, 0.0596, 0.0187,
0.0192],
[0.0348, 0.0399, 0.0847, 0.1438, 0.1954, 0.2086, 0.1374, 0.0795, 0.0407,
0.0353],
[0.0304, 0.0156, 0.0889, 0.2146, 0.3070, 0.1885, 0.0862, 0.0429, 0.0105,
0.0153],
[0.0561, 0.0247, 0.1231, 0.2177, 0.2498, 0.1417, 0.0734, 0.0669, 0.0242,
0.0224]])

Could you explain in more detail how to interpret these scores?
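
A brief, unofficial note on reading these numbers: each row appears to be a probability distribution over 10 score bins (AVA-style, scores 1 to 10), so a common way to get a single aesthetic score per image is the mean of the distribution. A minimal sketch, assuming that interpretation:

import torch

# `probs` holds one 10-bin distribution per image; here only the first row shown above.
probs = torch.tensor([[0.0145, 0.0029, 0.0645, 0.2702, 0.4544,
                       0.1338, 0.0399, 0.0154, 0.0011, 0.0032]])

bins = torch.arange(1, 11, dtype=probs.dtype)   # scores 1..10
mean_score = (probs * bins).sum(dim=1)          # expected score per image
print(mean_score)                               # ~4.81 for the first row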

Error on single image inference, thanks

I have tried to use the model to do aesthetic assessment on a single image, and below is my code snippet:

class IAA:
    def __init__(self, model_file, weights_path):
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.init_model(model_file, weights_path)
        normalize = transforms.Normalize(mean=IMAGE_NET_MEAN, std=IMAGE_NET_STD)
        self.transform = transforms.Compose([
            transforms.ToTensor(),
            normalize
        ])

    def init_model(self, model_file, weights_path):
        self.model = TANet(model_file)
        self.model.load_state_dict(torch.load(weights_path))
        self.model.to(self.device)
        self.model.eval()

    def inference(self, image_path):
        with open(image_path, 'rb') as f:
            image = Image.open(f)
            image = image.convert('RGB')
            image = image.resize((224, 224))
            image = self.transform(image)
            image = torch.unsqueeze(image, dim=0)
            image = image.to(self.device)
            iia_score = self.model(image)
        return iia_score


if __name__ == '__main__':
    weights_path = r'.\weights\SRCC_758_LCC_765.pth'
    model_file = r'.\weights\resnet18_places365.pth.tar'
    image_path = r'.\samples\00000.png'
    iaa = IAA(model_file, weights_path)
    score = iaa.inference(image_path)

However, I got an error: ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1]).

It seems this is a known error when using batchnorm on a single image (batch size = 1), see here (https://discuss.pytorch.org/t/error-expected-more-than-1-value-per-channel-when-training/26274/67). However, it was suggested that adding model.eval() solves the problem. As you can see, I used model.eval(), but I still got the error.

Did you happen to encounter this error? If so, how did you solve it? Thanks a lot.
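
Not an official fix, but one way to narrow this down: the error is raised by a BatchNorm layer that still believes it is in training mode. In PyTorch, any nn.Module constructed inside forward() (such as the self.lin = nn.Sequential(...) line quoted in the model-code issue above) is created with training=True even if model.eval() was called earlier. A generic diagnostic sketch, using the iaa object from the snippet above:

import torch.nn as nn

# List every BatchNorm layer and whether it is still in training mode; any layer
# printing training=True will raise exactly this ValueError on a batch of 1.
for name, module in iaa.model.named_modules():
    if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d)):
        print(name, '-> training =', module.training)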

Where is the demo inference code?

On this repo I only see the videos demonstrating the scoring effect; I don't see any demo code for running forward inference with the model. Has anyone reproduced this effect?

Dataset links expired

Hello, could you please share the dataset links again? They have expired.

How can I produce the video demo you showed?

Hello, I am a third-year undergraduate student. While working on a final course project, we found that your scoring system would be very helpful for automatically cropping our videos. However, since I have only just started working in this area, how can I produce a video like the one shown in your README? If I only run nnictl create --config config.yml -p 8999, it seems that nni is just training, right? How can I generate such a video?
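
There is no official demo script in the repo, so as a rough, unofficial starting point: the demo videos appear to show a per-frame aesthetic score overlaid on the footage, which can be approximated by scoring every frame with a loaded model and drawing the score with OpenCV. Everything below (the cv2 pipeline, the preprocessing, and the assumption that model(...) returns a 10-bin score distribution as in the "Results interpretation" issue above) is a sketch under those assumptions, not the authors' demo code:

import cv2
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def score_frame(model, frame_bgr, device):
    # OpenCV yields BGR uint8 frames; convert to an RGB PIL image, preprocess, and score.
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    x = preprocess(Image.fromarray(rgb)).unsqueeze(0).to(device)
    probs = model(x).squeeze(0)                        # assumed: 10-bin score distribution
    bins = torch.arange(1, 11, dtype=probs.dtype, device=probs.device)
    return float((probs * bins).sum())                 # mean score in [1, 10]

def make_demo(model, in_path, out_path, device='cpu'):
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        score = score_frame(model, frame, device)
        cv2.putText(frame, f'aesthetic score: {score:.2f}', (20, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        writer.write(frame)
    cap.release()
    writer.release()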
