raschka-research-group / coral-cnn

Rank Consistent Ordinal Regression for Neural Networks with Application to Age Estimation

Home Page: https://www.sciencedirect.com/science/article/pii/S016786552030413X

License: MIT License

Python 67.06% Jupyter Notebook 32.94%

ordinal-regression pytorch deeplearning

coral-cnn's Introduction

Rank-consistent Ordinal Regression for Neural Networks

This repository contains the PyTorch model code for the paper

Wenzhi Cao, Vahid Mirjalili, Sebastian Raschka (2020): Rank Consistent Ordinal Regression for Neural Networks with Application to Age Estimation. Pattern Recognition Letters. https://doi.org/10.1016/j.patrec.2020.11.008.
[Journal Paper] [ArXiv Preprint]
[PyTorch Package] [Keras Port]

This GitHub repository contains the code files and training logs used in the paper. If you are primarily interested in using CORAL, a PyTorch library with Tutorials can be found here:

https://github.com/rasbt/coral_pytorch

PyTorch Model Code

Note that the model code is identical across datasets. However, we hard-coded the dataset file paths at the top of each file and used dataloaders specific to the organization of the corresponding dataset. If you wish to run the code, you will likely need to change the file paths in the scripts depending on where you saved the image datasets and label files.

All code was run with PyTorch 1.5 and Python 3.7; we do not guarantee forward or backward compatibility with other PyTorch and Python versions.

The model code can be found in the [./model-code](./model-code) subdirectory, and the code files are labeled using the scheme

<dataset>-<loss>.py

<dataset> refers to either AFAD (afad), MORPH-2 (morph), or CACD (cacd).
<loss> refers to either CORAL (coral), ordinal regression as in Niu et al. (ordinal), or cross-entropy (ce).

Example

The following code trains coral on the afad dataset:

python afad-coral.py --seed 1 --cuda 0 --outpath afad-model1

--seed <int>: Integer for the random seed; used for training set shuffling and the model weight initialization (note that CUDA convolutions are not fully deterministic).
--cuda <int>: The CUDA device number of the GPU to be used for training (--cuda 0 refers to the 1st GPU).
--outpath <directory>: Path for saving the training log (training.log) and the parameters of the trained model (model.pt).
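The three flags described above could be parsed along the following lines. This is only a minimal sketch: the helper name build_parser, the defaults, and the help strings are assumptions for illustration, and the actual scripts may define their arguments differently.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the three documented flags.
    parser = argparse.ArgumentParser(description="Train an ordinal regression model")
    parser.add_argument("--seed", type=int, default=-1,
                        help="random seed for shuffling and weight initialization")
    parser.add_argument("--cuda", type=int, default=-1,
                        help="CUDA device index (0 refers to the 1st GPU)")
    parser.add_argument("--outpath", type=str, required=True,
                        help="directory for training.log and model.pt")
    return parser


# Parse the same arguments as in the example command above.
args = build_parser().parse_args(
    ["--seed", "1", "--cuda", "0", "--outpath", "afad-model1"])

# Assumed convention for this sketch: a negative index means "run on the CPU".
device = f"cuda:{args.cuda}" if args.cuda >= 0 else "cpu"
print(args.seed, device, args.outpath)
```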

[Figure: overview of the differences between a regular CNN and a CORAL-CNN.]

Training Logs and Trained Models from the Paper

We share all training logs in this GitHub repository under the ./experiment-logs subdirectory. Due to the large file size (85 MB per model), we could not share the trained models on GitHub; however, all trained models can be downloaded from Google Drive via the following link: https://drive.google.com/drive/folders/168ijUQyvGLhHoQUQMlFS2fVt2p5ZV2bD?usp=sharing.

Image files

The image files of the face image datasets are available from the following websites:

CACD: http://bcsiriuschen.github.io/CARC/
AFAD: https://github.com/afad-dataset/tarball
MORPH-2: https://www.faceaginggroup.com/morph/

Data preprocessing code

We provide the dataset preprocessing code that we used to prepare the CACD and MORPH-2 datasets as described in the paper. The code is located in the [./datasets/image-preprocessing-code](./datasets/image-preprocessing-code) subdirectory. AFAD did not need further preprocessing.

Labels and train/test splits

We provide the age labels (obtained from the original dataset resources) and the train/test splits we used in CSV format in the ./datasets/train-test-csv subdirectory.

CACD: labels 0-48 correspond to ages 14-62
AFAD: labels 0-25 correspond to ages 15-40
MORPH-2: labels 0-54 correspond to ages 16-70
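Converting a predicted label back to an age is then a matter of adding the dataset's minimum age. The sketch below uses the ranges listed above; the dictionary and function names are made up for illustration and do not come from the repository.

```python
# Minimum age and number of classes per dataset, read off the label ranges above.
AGE_OFFSET = {"cacd": 14, "afad": 15, "morph": 16}
NUM_CLASSES = {"cacd": 49, "afad": 26, "morph": 55}


def label_to_age(dataset, label):
    """Map a 0-based ordinal label to an age in years (hypothetical helper)."""
    if not 0 <= label < NUM_CLASSES[dataset]:
        raise ValueError(f"label {label} is out of range for {dataset}")
    return label + AGE_OFFSET[dataset]


print(label_to_age("afad", 0), label_to_age("afad", 25))
```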

Using Trained Models

We share the pre-trained models from the paper that can be used to make predictions on AFAD, MORPH-2, or CACD images. Please see the README in the single-image-prediction__w-pretrained-models subdirectory for details.

Implementations for Other Deep Learning Frameworks

Porting Guide

Our models were originally implemented in PyTorch 1.5. A recipe for porting the code is provided in coral-implementation-recipe.ipynb. Also see the file diff comparing CORAL with a regular CNN.

Keras

A Keras port of this code was recently developed and made available at https://github.com/ck37/coral-ordinal.

coral-cnn's Issues

Variables' meaning

Dear Sir,

First of all, thanks for sharing this great repo.

I have some questions; hopefully they will not bother you.

In single-image-prediction/afad_coral.py, can you explain:

What do logits, probas, and predicted_levels mean?
Why is predicted_label = sum(predict_levels)?

I have read the paper but still have not been able to clarify these.

Thank you for your time, Sir.

Originally posted by @tumbleintoyourheart in #6 (comment)
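For readers with the same question, the relationship between the three quantities can be sketched in plain Python. This is an illustration rather than the script itself: it assumes that each logit belongs to one binary task "is the rank greater than k?", that probas are the sigmoids of those logits, and that predicted_levels are the probas thresholded at 0.5. The example logits are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def coral_predict(logits):
    # One logit per binary task "is the rank greater than k?" (K - 1 tasks).
    probas = [sigmoid(z) for z in logits]
    # A task "fires" when its probability exceeds 0.5.
    predicted_levels = [p > 0.5 for p in probas]
    # The predicted rank is the number of thresholds the example exceeds.
    predicted_label = sum(predicted_levels)
    return probas, predicted_levels, predicted_label


# Made-up logits for a 5-class problem (4 binary tasks).
probas, predicted_levels, predicted_label = coral_predict([2.0, 0.7, -0.3, -1.5])
print(predicted_levels, predicted_label)
```

Summing the levels therefore counts how many rank thresholds were passed, which is why the sum is used as the label.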

While training my model I'm facing an issue.

Epoch: 001/200 | Batch 0000/20149 | Cost: 70.1415
Epoch: 001/200 | Batch 0050/20149 | Cost: 59.7190
Epoch: 001/200 | Batch 0100/20149 | Cost: 56.4751
Epoch: 001/200 | Batch 0150/20149 | Cost: 58.4821
Epoch: 001/200 | Batch 0200/20149 | Cost: 56.8452
Epoch: 001/200 | Batch 0250/20149 | Cost: 59.0936
Epoch: 001/200 | Batch 0300/20149 | Cost: 54.9184
Epoch: 001/200 | Batch 0350/20149 | Cost: 53.4635
Epoch: 001/200 | Batch 0400/20149 | Cost: 52.2409
Epoch: 001/200 | Batch 0450/20149 | Cost: 51.1332
Epoch: 001/200 | Batch 0500/20149 | Cost: 57.5054
Epoch: 001/200 | Batch 0550/20149 | Cost: 53.7109
Epoch: 001/200 | Batch 0600/20149 | Cost: 58.1618
Epoch: 001/200 | Batch 0650/20149 | Cost: 53.6513
Epoch: 001/200 | Batch 0700/20149 | Cost: 55.9161
Epoch: 001/200 | Batch 0750/20149 | Cost: 55.2700
Epoch: 001/200 | Batch 0800/20149 | Cost: 52.1431
Epoch: 001/200 | Batch 0850/20149 | Cost: 54.5851
Epoch: 001/200 | Batch 0900/20149 | Cost: 62.3357
Epoch: 001/200 | Batch 0950/20149 | Cost: 53.9224
Epoch: 001/200 | Batch 1000/20149 | Cost: 57.4987
Epoch: 001/200 | Batch 1050/20149 | Cost: 59.1612
Epoch: 001/200 | Batch 1100/20149 | Cost: 52.0190
Epoch: 001/200 | Batch 1150/20149 | Cost: 59.5060
Epoch: 001/200 | Batch 1200/20149 | Cost: 57.0917
Epoch: 001/200 | Batch 1250/20149 | Cost: 53.7502
Epoch: 001/200 | Batch 1300/20149 | Cost: 62.6665
Epoch: 001/200 | Batch 1350/20149 | Cost: 50.6539
Epoch: 001/200 | Batch 1400/20149 | Cost: 51.1941
Traceback (most recent call last):
File "afad-coral.py", line 379, in
for batch_idx, (features, targets, levels) in enumerate(train_loader):
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in next
return self._process_data(data)
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 80, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 80, in
return [default_collate(samples) for samples in transposed]
File "/home/administrator/gender_identification/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 98 and 99 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689

I'm facing this issue while training a model on my custom dataset. Is there any way to identify the offending file?

@rasbt @yienxu Please guide me.
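One possible cause, offered as a guess rather than a diagnosis: the mismatch "Got 98 and 99 in dimension 1" is what stacking two 1-D level vectors of lengths 98 and 99 would produce, and the levels construction used in the training scripts yields a vector of length label whenever label exceeds NUM_CLASSES - 1. The sketch below scans a list of labels for such entries; the helper names and the sample labels are hypothetical.

```python
def make_levels(label, num_classes):
    # Same construction as the levels line in the training scripts.
    return [1] * label + [0] * (num_classes - 1 - label)


def find_bad_labels(labels, num_classes):
    """Return (row index, label) pairs whose level vector has the wrong length."""
    expected = num_classes - 1
    return [(i, y) for i, y in enumerate(labels)
            if len(make_levels(y, num_classes)) != expected]


# Made-up labels for a 26-class setup; rows 2 and 3 fall outside 0..25.
bad = find_bad_labels([0, 25, 98, 99, 12], num_classes=26)
print(bad)
```

If this returns an empty list for the custom label file, the level vectors are consistent and the image tensors are the next thing to check.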

Does accuracy improve if the FC weights are not shared?

Thank you for your amazing work.
I would like to know: if the penultimate layer does not share weights (i.e., uses an FC layer with num_dim*num_class parameters), will the accuracy improve, since the model would have more capacity?

Question about train.csv/test.csv

How did you generate the .csv files provided in this project?

When I compare the .csv files with the image data, I find that the age in the .csv file is not consistent with the original image file.

For example, in the .csv file the age is 15, but in the image name the age is 29, and 29 is more plausible. There are many such cases in the .csv file.

CORAL cost function different from the definition in the paper?

def cost_fn(logits, levels, imp):
    val = (-torch.sum((F.logsigmoid(logits)*levels
                      + (F.logsigmoid(logits) - logits)*(1-levels))*imp,
           dim=1))
    return torch.mean(val)

The second term of the cost seems a little different from the original definition. Could you explain it?
Thanks.
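Assuming the paper's loss is the usual weighted sum of binary cross-entropy terms, with log σ(z) for the positive levels and log(1 − σ(z)) for the negative levels, the second term in the code appears to be the same quantity written in a numerically friendlier form. With z denoting a logit:

```latex
\begin{aligned}
\sigma(z) &= \frac{1}{1 + e^{-z}}, \\
1 - \sigma(z) &= \frac{e^{-z}}{1 + e^{-z}} = e^{-z}\,\sigma(z), \\
\log\bigl(1 - \sigma(z)\bigr) &= \log \sigma(z) - z .
\end{aligned}
```

The last line is what F.logsigmoid(logits) - logits computes, so under this assumption the code and the definition agree term by term.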

Loss function is different from the article

Am I right, that loss function in code is not the same that is described in the article (page 3, (4))? Why?

In file ./model-code/resnet34/cacd-coral.py:

def cost_fn(logits, levels, imp):
    val = (-torch.sum((F.log_softmax(logits, dim=2)[:, :, 1] * levels
                      + F.log_softmax(logits, dim=2)[:, :, 0]*(1-levels)) * imp, dim=1))
    return torch.mean(val)

In file ./model-code/resnet34/afad-coral.py:

def cost_fn(logits, levels, imp):
    val = (-torch.sum((F.logsigmoid(logits) * levels
                      + (F.logsigmoid(logits) - logits)*(1-levels)) * imp,
           dim=1))
    return torch.mean(val)

Why not F.logsigmoid(1 - logits) instead of (F.logsigmoid(logits) - logits)?
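A quick numerical check may help here. The sketch below uses only the standard math module (not PyTorch) and compares three quantities at a few made-up logit values: the target log(1 - sigmoid(z)), the expression used in the code, and the proposed logsigmoid(1 - z). Note that the 1 in the loss is subtracted from the sigmoid output, not from the logit.

```python
import math


def logsigmoid(z):
    # Numerically stable log(sigmoid(z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def sigmoid(z):
    return math.exp(logsigmoid(z))


rows = []
for z in (-3.0, -0.5, 0.0, 1.2, 4.0):
    target = math.log(1.0 - sigmoid(z))   # log(1 - sigmoid(z)) from the loss
    used = logsigmoid(z) - z              # expression in cost_fn
    proposed = logsigmoid(1.0 - z)        # expression asked about
    rows.append((z, target, used, proposed))
    print(f"z={z:+.1f}  target={target:.6f}  used={used:.6f}  proposed={proposed:.6f}")
```

The "target" and "used" columns agree, while "proposed" does not. An equivalent alternative would be logsigmoid(-z), since 1 - sigmoid(z) = sigmoid(-z).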

Questions on (rank-monotonic) and (num_classes - 1)

Hi, thank you for your code implementation. I have three questions I would like to ask:

The paper says to "require {fk} reflect the ordinal information and are rank-monotonic". How is rank monotonicity ensured?
For the "levels" variable in your code, and also in the paper, why choose (num_classes - 1) instead of num_classes?

Thanks

General Paper Questions

Thanks for a great paper, it definitely solves the monotonic issue of the naive approach!

I have a few questions for the authors:

1.) Is the loss described equivalent to standard binary cross entropy? It appears so, but I noticed in the notebook you have your own derivation as opposed to using the standard PyTorch loss function. Is there a specific reason, or was that just to be explicit about the loss?

2.) While training with this loss is definitely more stable than MSE regression to fixed targets, at inference, is it true the output can be interpreted as regression to 1-dim with thresholds derived from the biases? For example, in a 3 class problem, the final bias weights may be something like [1.01, -1.08]. In this case, a linear layer output of

>= 1.09 will be class 2
1.08 to -1.00 will be class 1
<= -1.01 will be class 0

Just wondering if I’m interpreting this correctly for the following question.

3.) I’m finding an issue where center classes have reduced recall, which makes sense as there’s a narrower range of regression predictions for the correct label (described above). However, the pairwise AUC-ROC is still very high, as outer classes are predicted at extremes, while inner classes are mostly around their respective ranges. Did you also notice this? Were there any solutions? I’ve tried longer training runs, focal loss to prevent the outer classes from “squeezing” the hyperplanes defined by the biases together, and smoothed labels for outer classes. I’ve also tried initializing the biases further apart, and increasing the LR for the linear layer and biases.

Thanks again, and much appreciated if you’re able to address these fairly specific questions 😃
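On question 2, the threshold reading can be sketched as follows, assuming the prediction rule is "count the tasks whose sigmoid output exceeds 0.5". Task k then fires exactly when the scalar output g satisfies g + b_k > 0, so the class boundaries sit at -b_k. The function name is hypothetical and the bias values are the ones from the question.

```python
def class_from_score(g, biases):
    # sigmoid(g + b_k) > 0.5 is equivalent to g + b_k > 0, i.e. g > -b_k.
    return sum(g + b > 0 for b in biases)


biases = [1.01, -1.08]              # example values from the question
thresholds = [-b for b in biases]   # class boundaries on the scalar output

for g in (-1.02, 0.0, 1.09):
    print(g, class_from_score(g, biases))
```

With these biases the boundaries are -1.01 and 1.08, which matches the three intervals described in the question up to rounding.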

tf.keras implementation

Hello,

Is there possibly a tf.keras implementation of the ordinal layer, or any interest in implementing one? I am hoping to use the CORAL algorithm in an item response theory-based multitask model for hate speech measurement (https://hatespeech.berkeley.edu), but all of our code is in Keras currently. I don't know that we have the low-level technical capacity to make the conversion ourselves unfortunately.

Thanks,
Chris

test on single image

How can I test the pretrained models on a single image?

some questions about preprocess-cacd

I have downloaded the CACD dataset from the provided site, but when I run preprocess-cacd, I get the following error:
TypeError: __call__(): incompatible function arguments. The following argument types are supported:
1. (self: _dlib_pybind11.fhog_object_detector, image: array, upsample_num_times: int=0) -> _dlib_pybind11.rectangles
Invoked with: <_dlib_pybind11.fhog_object_detector object at 0x00000214FD970930>, None, 1
I don't know how to fix it; please help me.

Interpretation of the probability output

Suppose there are 6 categories: 0, 1, 2, 3, 4, 5.
The probability output for one sample is [0.8, 0.6, 0.55, 0.45, 0.1]. So the prediction result for this sample will be category 3.
My question is, does this mean P(X=3) = P(X>2) - P(X>3) = 0.55- 0.45 = 0.1, the probability of the predicted category is only 0.1?
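The arithmetic in this question can be worked through for all six categories. The sketch assumes that the k-th output approximates P(X > k) and that the outputs are non-increasing, as they are in the example; the helper name is made up.

```python
def class_probabilities(probas):
    # probas[k] is read as P(X > k); pad with P(X > -1) = 1 and P(X > K-1) = 0.
    exceed = [1.0] + list(probas) + [0.0]
    return [exceed[k] - exceed[k + 1] for k in range(len(exceed) - 1)]


probas = [0.8, 0.6, 0.55, 0.45, 0.1]
predicted_label = sum(p > 0.5 for p in probas)
per_class = class_probabilities(probas)

print(predicted_label)
print([round(p, 2) for p in per_class])
```

Under this reading, P(X = 3) is indeed about 0.1, and category 4 carries the largest single mass (about 0.35). Thresholding at 0.5 picks the smallest category k with P(X <= k) >= 0.5, which behaves like a median of the implied distribution rather than its mode.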

Model predicting a constant value for all images

Hello sir,
Firstly I would like to thank you for sharing this amazing repository.
I am using the VGG-16 coral.py on the UTKFace Dataset. After training the model I observed that the MAE was nearly 8.5 - 9 and was never changing irrespective of any changes in hyperparameters.
I printed the predicted_levels and actual targets and I found out that the model is predicting the same value for all images. After a few epochs the cost function almost becomes constant and the value of predicted_label is coming near 9 to 11 depending on how long the model has been trained.
I trained the model on a separate dataset: All Ages Faces Dataset and got the same issue there also.

I printed the following quantities for a few examples.

I also printed the predicted_labels for one batch (batch_size = 64).

I haven't changed anything in the code except adding the print statements for the above images. I am not able to figure out why this is happening. I would appreciate any insight as to why this might be happening.
Thanks!

Pretrained model

Hi, do you have any pretrained model so that it is possible to do inference using the network, instead of training everything from scratch?
Thanks

curious about the feature

If I double num_classes, for example from num_classes=10 to num_classes=20, then at the end of training the probas for a sample look like this: [0.9617, 0.9617, 0.9601, 0.9601, 0.9568, 0.9568, 0.9448, 0.9448, 0.9117, 0.9117, 0.8685, 0.8685, 0.8398, 0.8398, 0.8279, 0.8279, 0.8223, 0.8223]. Why do the values come in identical pairs? I expected 18 different values. I would like to know what determines this and how to get the result I expect.

ADD_CLASS

How is the value of ADD_CLASS obtained?

predict result issue

I used my own dataset. First, I processed the pictures with preprocess-cacd.py, and then used cacd-coral.py to get the age predictions. I tested 5 pictures and it didn't work well, but the results on the test pictures you provided are very good. Is this a problem with the generalization of the model, or something else?

Task importance analysis

def task_importance_weights(label_array):
    uniq = torch.unique(label_array)
    num_examples = label_array.size(0)

    m = torch.zeros(uniq.shape[0])

    for i, t in enumerate(torch.arange(torch.min(uniq), torch.max(uniq))):
        m_k = torch.max(torch.tensor([label_array[label_array > t].size(0), 
                                      num_examples - label_array[label_array > t].size(0)]))
        m[i] = torch.sqrt(m_k.float())

    imp = m/torch.max(m)
    return imp

Is this code different from the original paper, "Ordinal Regression with Multiple Output CNN for Age Estimation"?

A question about level design

A question:
Line 165 in model-code/resnet34/afad-coral.py:
levels = [1]*label + [0]*(NUM_CLASSES - 1 - label)
The README says: "AFAD: labels 0-25 correspond to ages 15-40", which means age 15 will be labeled as 0, right? I checked afad_train.csv and did find some 0 labels. If age 15 is labeled as 0, then this line creates an all-zero level vector for age 15. I am not very familiar with age estimation, so I don't know whether this encoding is common, but if I were to encode this, I would give age 15 at least one 1.
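The encoding in question can be tried out in isolation. The sketch below wraps the quoted line in a function and adds the inverse mapping; the function names are hypothetical. Each of the NUM_CLASSES - 1 entries answers "is the label greater than k?", so the lowest label exceeds no threshold and is encoded as all zeros.

```python
def label_to_levels(label, num_classes):
    # Entry k is 1 if label > k, for k = 0 .. num_classes - 2.
    return [1] * label + [0] * (num_classes - 1 - label)


def levels_to_label(levels):
    # The label is the number of thresholds that were exceeded.
    return sum(levels)


NUM_CLASSES = 26  # AFAD: labels 0-25
youngest = label_to_levels(0, NUM_CLASSES)    # age 15 -> 25 zeros
oldest = label_to_levels(25, NUM_CLASSES)     # age 40 -> 25 ones
print(sum(youngest), sum(oldest), len(youngest))
```

Because the all-zero and all-one vectors are both valid codes, 25 binary tasks are enough to distinguish 26 labels, and every label round-trips through levels_to_label.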

error in task_importance_weights function: afad-coral.py

Hi,

I think the following function will throw an error if the dataset does not have some age values represented:

def task_importance_weights(label_array):
    uniq = torch.unique(label_array)
    num_examples = label_array.size(0)

    m = torch.zeros(uniq.shape[0])

    for i, t in enumerate(torch.arange(torch.min(uniq), torch.max(uniq))):
        m_k = torch.max(torch.tensor([label_array[label_array > t].size(0), 
                                      num_examples - label_array[label_array > t].size(0)]))
        m[i] = torch.sqrt(m_k.float())

    imp = m/torch.max(m)
    return imp

For the AFAD training set, the line m = torch.zeros(uniq.shape[0]) will generate a tensor of shape 23 since 3 age label groups are missing from the training set (age labels 15, 22, and 24). Enumerating through torch.arange(torch.min(uniq), torch.max(uniq)) might assume all age label groups are represented and will have a different shape than m.
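One way to sidestep the shape mismatch is to size the weight vector by the number of binary tasks (num_classes - 1) instead of by the number of distinct labels seen in the data. The following is a plain-Python sketch of that idea, not the authors' fix; passing num_classes explicitly is an assumption of this sketch, and the sample labels are made up.

```python
import math


def task_importance_weights(labels, num_classes):
    """Importance weight per binary task, robust to missing label values."""
    if not labels:
        raise ValueError("labels must not be empty")
    n = len(labels)
    m = []
    for t in range(num_classes - 1):
        # Number of examples on each side of threshold t; keep the larger side.
        above = sum(1 for y in labels if y > t)
        m.append(math.sqrt(max(above, n - above)))
    top = max(m)
    return [v / top for v in m]


# Labels 1, 3 and 4 never occur, yet there is still one weight per task.
weights = task_importance_weights([0, 0, 0, 2, 5], num_classes=6)
print([round(w, 3) for w in weights])
```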

CACD preprocess script doesn't write csv with only centered entries

The dlib face detector can fail to find only 1 face in an image, in which case no image is saved in the centred image folder. The model code for CACD seems to reference some sort of output csv from this process, but the keep_picture list in preprocess_cacd.py is never actually written anywhere, never mind into a CSV file. Using the existing csv files from this repo fails because some of the images referenced no longer exist after the centering process.

How do I use the preprocessed CACD dataset?

As the title says: I ran preprocess-cacd.py and saved the processed pictures to CACD2000-centered. Then I changed the file path to CACD2000-centered, but when I ran cacd-coral.py, there was an error saying that it could not find a picture with a given name.

AttributeError: Can't get attribute 'AFADDatasetAge' on <module '__mp_main__' from 'afad-coral.py'

predicting single image age demo

When predicting the age for a single image, why are so many of the probs 1? Are these numbers really probabilities? And why is the class label 20 while the prob for class 20 is low?

The download link for the trained models from the paper has expired

The download link for the trained models from the paper has expired (https://drive.google.com/drive/folders/168ijUQyvGLhHoQUQMlFS2fVt2p5ZV2bD?usp=sharing). Where can I get the trained models? Thank you very much.

Face Detection

Which face detection model would you suggest before giving the input to the network? For different types of face detection models, I got different results which is easily predictable since the boundaries are changing.

Please compare with regression network

According to the design of your CORAL framework, it is clear that the output of the penultimate layer, which has only 1 node, is proportional to the age of the input image, i.e., the output of the penultimate layer increases monotonically with the age of the input image. Therefore, if the last layer (the ordinal regression layer) is removed, this framework becomes a regression network that outputs a number proportional to the age of the input image. This is just what a regression model does. Hence your framework can be regarded as a regression network plus an ordinal output layer. The advantage is that the outputs of your regression network are inconsistent with the age label.
All in all, it would be great if you compared your framework with a single regression network without the ordinal layer, i.e., removed the last layer of your framework and let the penultimate layer output the age directly. In this case, the outputs are consistent with the age label, of course.

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument

This happens while training. I'm using CUDA 10 and PyTorch version 1.1.0. Is there any issue on my side?

Why use a separate linear_1_bias?

Hi, I did not quite understand why a separate bias layer is used.

For example:

in code:
self.fc = nn.Linear(4096, 1, bias=False)
self.linear_1_bias = nn.Parameter(torch.zeros(num_classes-1).float())

Apart from the zero initialization, what is the difference from directly using:
self.fc = nn.Linear(4096, 1, bias=True)

Thanks!
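The difference can be illustrated without PyTorch. In the sketch below, g stands for the single output of the bias-free fc layer; adding a vector of num_classes - 1 biases (broadcasting) gives one logit per binary task, whereas a layer with bias=True has a single bias and therefore produces a single logit. The function names and numbers are made up for illustration.

```python
def coral_logits(g, biases):
    # Shared scalar score g, one independent bias per binary task.
    return [g + b for b in biases]


def single_bias_logits(g, bias):
    # Analogue of a 1-output linear layer with bias=True: one logit only.
    return [g + bias]


num_classes = 5
biases = [0.0] * (num_classes - 1)   # zero-initialized, as in the snippet above
print(len(coral_logits(0.3, biases)), len(single_bias_logits(0.3, 0.0)))
```

The separate parameter is thus what turns one shared score into num_classes - 1 task-specific logits while keeping the weights shared across tasks.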

Problem predicting higher age values

Hello!

I'm attempting to train the network from scratch using the UTK dataset.

I'm discarding people labeled as younger than 16 or older than 70. The only change I've made in the original script is changing the "NUM_CLASSES" parameter to 55 in order to reflect the age range I'm working with.

Training goes well, and the MSE and MAE are consistent with yours, but when attempting to predict on UTK samples, I realize I'm not able to infer ages past a certain value. In my latest test, for example, while I can get satisfactory results for ages 16-40, I can't get any predictions to go over that (see picture attached).

Do you have any insight that might help me? Other than that, congratulations on the paper, It's been helping me a lot

About the monotonicity of the predict layer in the coral network

Hi,
Sorry to bother you. I have trouble understanding the monotonicity of the CORAL network's prediction layer. Here is the only statement I found that defines the bias added to the fc result:
self.linear_1_bias = nn.Parameter(torch.zeros(self.num_classes-1).float())
Is it enough to enforce monotonicity? Or is there any other statement constraining the biases to be monotonic?
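As far as the quoted code shows, there is no explicit constraint on the biases; the paper's argument, as it can be summarized here, is that the biases end up ordered when the loss is minimized. What the shared weights do guarantee is easy to check: every task sees the same score g, so the probabilities are non-increasing across tasks whenever the biases are. The sketch below uses made-up numbers and a hypothetical helper name.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def is_rank_consistent(g, biases):
    # Probabilities P(y > k) should not increase with k.
    probas = [sigmoid(g + b) for b in biases]
    return all(p >= q for p, q in zip(probas, probas[1:]))


ordered = [1.5, 0.2, -0.9]      # non-increasing biases
unordered = [0.2, 1.5, -0.9]    # violates the ordering

scores = (-4.0, 0.0, 2.5)
print([is_rank_consistent(g, ordered) for g in scores])
print([is_rank_consistent(g, unordered) for g in scores])
```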

The question about fc layer

self.fc = nn.Linear(2048 * block.expansion, 1, bias=False)
Why is the output dimension of the fc layer 1 rather than num_classes?
logits = logits + self.linear_1_bias
Is the single output feature of the fc layer added to num_classes biases?

About coral-cnn model

You used nn.AvgPool2d(7, stride=1, padding=2) at the end of the CNN. The network input is 120 x 120, so the input to the pooling layer is 4 x 4 and its output is 2 x 2, and these 4 values are the same. What is the point of this design? Or have I miscalculated?
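The sizes in this question can be re-derived with the standard floor-mode output-size formula for convolution and pooling layers (dilation 1). The sketch assumes a ResNet-style stem (7x7 convolution with stride 2 and padding 3, then a 3x3 max-pool with stride 2 and padding 1) followed by three stride-2 stages, which is an assumption about the architecture rather than something taken from this page.

```python
def pool_output_size(size, kernel, stride=1, padding=0):
    # Floor-mode output size of a conv/pool layer with dilation 1.
    return (size + 2 * padding - kernel) // stride + 1


size = 120
size = pool_output_size(size, kernel=7, stride=2, padding=3)   # conv1
size = pool_output_size(size, kernel=3, stride=2, padding=1)   # maxpool
for _ in range(3):                                             # layer2, layer3, layer4
    size = pool_output_size(size, kernel=3, stride=2, padding=1)
feature_size = size

pooled_size = pool_output_size(feature_size, kernel=7, stride=1, padding=2)

# In padded coordinates 0..7 the real inputs occupy positions 2..5.
real = set(range(2, 2 + feature_size))
windows = [set(range(start, start + 7)) for start in range(pooled_size)]
every_window_sees_all_inputs = all(real <= w for w in windows)

print(feature_size, pooled_size, every_window_sees_all_inputs)
```

Under these assumptions the calculation in the question holds: the feature map is 4 x 4, the pooled output is 2 x 2, and each 7 x 7 window covers all 16 real inputs. If padded zeros are counted in the average (the default for this layer type, as far as I know), the four outputs are therefore identical.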

Got bad result when implementing Transfer learning on CACD pre-trained model

I tried to fine-tune the CACD pre-trained model (e.g., cacd-coral_seed1) on my small dataset (training: 121, validation: 15, testing: 15; age range: 52~93, 42 classes). The training result is: MAE/RMSE: | Best Train: 1.44/2.77 | Best Valid: 4.00/6.43 | Best Test: 5.07/6.68. The MAE is not bad, but when I plot the error (estimated age - real age) vs. real age, I find that the error decreases as the real age increases, meaning that the error and the real age are correlated. I don't think that is normal. I think it indicates that the model can't infer properly and just outputs the same age, so the error (estimated age - real age) is positive when the real age is small and negative when the real age is large. I have no idea what might cause this. Maybe something went wrong with my code, or my dataset is just too small to get a good model?

Adjustment in ResNet (freeze all the layers before layer4):

class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes, grayscale):
        self.num_classes = num_classes
        self.inplanes = 64
        if grayscale:
            in_dim = 1
        else:
            in_dim = 3
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(in_dim, 64, kernel_size=7, stride=2, padding=3,
                            bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0]) # stride defaults to 1, so the feature-map size is unchanged
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        #self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        #self.avgpool = nn.AvgPool2d(4)
        #self.fc = nn.Linear(512, 1, bias=False)
        # self.linear_1_bias = nn.Parameter(torch.zeros(self.num_classes-1).float())

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, (2. / n)**.5)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

Replace the new layers that I would like to train:

#resnet34 setting
layers=[3, 4, 6, 3]
block=BasicBlock
#model.layer3 = model._make_layer(block, 256, layers[2], stride=2)
model.layer4 = model._make_layer(block, 512, layers[3], stride=2)
model.avgpool = nn.AvgPool2d(4)
model.fc = nn.Linear(512, 1, bias=False)
model.linear_1_bias = nn.Parameter(torch.zeros(NUM_CLASSES-1).float())
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        m.weight.data.normal_(0, (2. / n)**.5)
    elif isinstance(m, nn.BatchNorm2d):
        m.weight.data.fill_(1)
        m.bias.data.zero_()

model.to(DEVICE)
#print trainable parameters
print("Params to learn:")
feature_extract = True
if feature_extract:
    params_to_update = []
    for name,param in model.named_parameters():
        if param.requires_grad == True:
            params_to_update.append(param)
            print("\t",name)
else:
    for name,param in model.named_parameters():
        if param.requires_grad == True:
            print("\t",name)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

Ordinal Regression Code

Hi!
Niu et al. propose ordinal regression with a CNN.
Their loss function has w_t_i, which indicates the weight of the i-th image for the t-th task.
This parameter comes from the absolute cost matrix (paper formula 2).
Your loss function in the code (e.g., afad-ordinal.py) does not have this parameter.
Can you tell me why?

Prediction on low resolution images

Hello, I'm trying to predict the age of my custom test images with your pretrained model, but it doesn't seem to work well. (I didn't calculate the MAE or any other evaluation metric, but just from looking at the results and comparing them to the images, the predictions seem inaccurate and inconsistent, with totally different ages predicted for similar images.)
One thing I am concerned about is that my test dataset has low resolution. (This is because my images are from low-resolution video.) The resolution of the images in the dataset is mostly around 64 x 64.

So here are my questions.

Do you think using low-resolution images will cause a significant decrease in accuracy?
What do you suggest to improve accuracy? (I am thinking about training the model with low-resolution datasets, such as CACD resized to 64x64.)
Is the CORAL-CNN model sensitive to the dataset it was trained on? Is it better to fine-tune the model or to retrain it if you have a new dataset?

Why can't the model predict ages below 14 using Single Image Predictions?

It's not an issue, more of a query. I see that the Labels and train/test splits section mentions that

UTKFace: labels 0-39 correspond to ages 21-60

But as we know, the UTKFace data has samples from ages 0 to 116.

Are these models not trained on the full datasets because of the training resources needed, or is there another reason?

Difference between single-image-prediction and model-code files

If I understand your work correctly, there is no difference in output precision between single-image-prediction and model-code; is that right?
If so: I tried predicting on our dataset using afad-coral.py and also cacd-coral.py in single-image-prediction, but the predicted ages are on average 10-20 years lower than the actual ages.
Do you have any suggestions to improve precision? Thank you in advance.