
Comments (6)

nathanb97 commented on July 17, 2024

Thank you for your reply, but I had already done that.
After several tests I still get a maximum AUC of 0.75.
This seems to be explained by the fact that we do not use the same preprocessing.
Can you tell me how you preprocessed the frames before I3D?
In the I3D repo that you recommended, the tensors passed to I3D appear to be normalized between -1 and 1 using statistics from the Kinetics dataset:

mean = [114.75, 114.75, 114.75]
std = [57.375, 57.375, 57.375]

with a normalization computed on the Kinetics dataset:

class GroupNormalize(object):
    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):  # (T, 3, 224, 224)
        # Normalize each frame in place, one channel at a time.
        for b in range(tensor.size(0)):
            for t, m, s in zip(tensor[b], self.mean, self.std):
                t.sub_(m).div_(s)
        return tensor

Did you compute the normalization statistics on UCF-Crime, or did you keep the Kinetics ones?
Was the input scaled to [0, 1] or to [-1, 1]?
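
As a point of reference for the two questions above, here is a minimal sketch of what each candidate scaling does to 8-bit pixel values. Only the constants 114.75 and 57.375 come from this thread; the helper names are hypothetical. Note that, applied to 0-255 inputs, these constants give a range of roughly [-2, 2.44] rather than [-1, 1], and they correspond to 0.45 and 0.225 on a [0, 1] scale.

```python
# Hypothetical helpers; only the constants 114.75 and 57.375 come from the thread.
KINETICS_MEAN = 114.75  # defined on the 0-255 pixel scale
KINETICS_STD = 57.375


def standardize(pixel, mean=KINETICS_MEAN, std=KINETICS_STD):
    """Mean/std normalization of a 0-255 pixel value."""
    return (pixel - mean) / std


def scale_unit(pixel):
    """Map a 0-255 pixel value to [0, 1], as transforms.ToTensor does for uint8 images."""
    return pixel / 255.0


def scale_symmetric(pixel):
    """Map a 0-255 pixel value to [-1, 1]."""
    return pixel / 127.5 - 1.0


for name, fn in [("mean/std", standardize),
                 ("[0, 1]", scale_unit),
                 ("[-1, 1]", scale_symmetric)]:
    print(f"{name:>8}: black -> {fn(0):.3f}, white -> {fn(255):.3f}")
```

If the features were extracted with one of these scalings and the released features with another, the inputs seen by I3D differ by more than a constant offset, which could plausibly account for part of the gap.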

I ran this ten-crop without mean/std normalization (pixels only scaled to [0, 1]) and obtained a maximum AUC of 0.75:

crop10 = transforms.Compose([
    transforms.Resize(256),
    transforms.TenCrop(256),  # returns a tuple of 10 PIL Images
    transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),  # returns a 4D tensor
    # optional, uncomment this line: transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

Could you please specify exactly how you apply the ten-crop to an image?

from rtfm.

tianyu0207 commented on July 17, 2024

Hello! First, congratulations on the excellent paper.

What exact changes are needed to train your model on UCF-Crime with your code?

From what I noted in the paper and in the various issues:

  • set args.batch_size = 32 (after concatenation we get a batch size of 64; moreover, when I set it to 64 (128 after concatenation) the results do not improve)
  • weight_decay = 0.0005
  • and in dataset.py:
 if self.is_normal:
     self.list = self.list[810:]
 else:
     self.list = self.list[:810]

Is that all?

I ask because I do not reach the same performance even after these changes.
I get a maximum of:

  • auc: 0.75
  • pr_auc: 0.18588291392503292

Hi. The only thing that needs to change from the GitHub code is the split between normal and abnormal videos in dataset.py. You should get around 84% AUC with this change alone. Thanks.

if self.is_normal:
    self.list = self.list[810:]
else:
    self.list = self.list[:810]
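
A toy illustration of what this slicing does, assuming the training list stores the abnormal videos first and the normal ones after them. The 810 boundary comes from the snippet above; the count of 800 normal entries and the file names are assumptions made for the sketch, not read from the repository's list file.

```python
def split_train_list(entries, is_normal, n_abnormal=810):
    """Return the normal or abnormal part of a list whose first n_abnormal entries are abnormal."""
    return entries[n_abnormal:] if is_normal else entries[:n_abnormal]


# Placeholder entries standing in for feature-file paths (hypothetical names).
entries = ([f"abnormal_{i:04d}.npy" for i in range(810)]
           + [f"normal_{i:04d}.npy" for i in range(800)])

normal = split_train_list(entries, is_normal=True)
abnormal = split_train_list(entries, is_normal=False)
print(len(normal), len(abnormal))
```

If the list file is ordered differently, the same slice would silently mix the two classes, so it is worth printing the first and last entry of each part once.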


nathanb97 commented on July 17, 2024

Today I tested a normalization with GroupNormalize and these values:
mean = [94.9191, 93.7068, 92.1115]
var = [39.12006135, 38.95593793, 39.355997]

Here are some results comparing my preprocessing with yours:

norm between preprocess i3d : 583.2708129882812
their score  max: 0.975659191608429 mean : 0.6715472936630249
our score:   max: 0.9387404322624207 mean : 0.6065698266029358



norm between preprocess i3d : 516.7003784179688
their score  max: 0.8348086476325989 mean : 0.4239957332611084
our score:   max: 0.8746155500411987 mean : 0.6011001467704773



norm between preprocess i3d : 710.51416015625
their score  max: 0.9999951124191284 mean : 0.3108385503292084
our score:   max: 0.8753393292427063 mean : 0.5222576260566711
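
The "norm between preprocess i3d" figures are presumably a distance between the two sets of extracted features for the same video; the post does not say which norm was used. A minimal sketch, assuming the Euclidean (L2) norm of the element-wise difference, with made-up feature values and a hypothetical function name:

```python
import math


def l2_distance(features_a, features_b):
    """Euclidean norm of the element-wise difference between two flat feature vectors."""
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))


# Made-up values, only to show the computation.
ours = [0.0, 3.0, 1.0]
theirs = [4.0, 0.0, 1.0]
print(l2_distance(ours, theirs))  # 5.0
```

A raw L2 distance grows with the number of elements, so dividing by the norm of one of the two vectors would make values from videos of different lengths easier to compare.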


tianyu0207 commented on July 17, 2024

Below is my feature extraction setup.

mean = [114.75, 114.75, 114.75]
std = [57.375, 57.375, 57.375]

if split == '10_crop_ucf':
    transform = transforms.Compose([
        gtransforms.GroupResize(256),
        gtransforms.GroupTenCrop(224),
        gtransforms.ten_crop_ToTensor(),
        gtransforms.GroupNormalize_ten_crop(mean, std),
        gtransforms.LoopPad(max_len),
    ])

class GroupTenCrop(object):
    def __init__(self, size):
        transform = torchvision.transforms.Compose([
            torchvision.transforms.TenCrop(size),
            torchvision.transforms.Lambda(lambda crops: torch.stack([torchvision.transforms.ToTensor()(crop) for crop in crops])),
        ])
        self.worker = transform

    def __call__(self, img_group):
        return [self.worker(img) for img in img_group]

class ToTensor(object):
    def __init__(self):
        self.worker = lambda x: F.to_tensor(x) * 255

    def __call__(self, img_group):
        img_group = [self.worker(img) for img in img_group]
        return torch.stack(img_group, 0)
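
One visible difference from the pipeline posted earlier in the thread is the crop size: 224 here versus 256 there. To make the ten-crop geometry concrete, here is a dependency-free sketch of the crop boxes a ten-crop produces (four corners and the center, then the same five on the horizontally flipped frame). The function name and the 340x256 frame size are illustrative assumptions, and the center offset uses plain rounding, which may differ from torchvision by one pixel when the margin is odd.

```python
def ten_crop_boxes(width, height, size):
    """Return 10 tuples (flipped, left, top, right, bottom) for a size x size ten-crop."""
    if size > width or size > height:
        raise ValueError("crop size exceeds image size")
    left_c = round((width - size) / 2)
    top_c = round((height - size) / 2)
    five = [
        (0, 0, size, size),                               # top-left
        (width - size, 0, width, size),                   # top-right
        (0, height - size, size, height),                 # bottom-left
        (width - size, height - size, width, height),     # bottom-right
        (left_c, top_c, left_c + size, top_c + size),     # center
    ]
    # The second five use the same boxes on the horizontally flipped frame.
    return [(False, *box) for box in five] + [(True, *box) for box in five]


# Assumed frame size after resizing the short side to 256.
for box in ten_crop_boxes(340, 256, 224):
    print(box)
```

With a 256-pixel short side, a 256 crop covers the full height, so the crops differ only horizontally, whereas a 224 crop also shifts vertically and yields the 224x224 inputs indicated in the GroupNormalize comment earlier in the thread.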


daviduarte commented on July 17, 2024

Same problem here. I used the precomputed I3D features. I ran training 3 times (50 epochs each) and got the following AUC on the test set (testing every 5 epochs):

0.791 (in epoch 50)
0.780 (in epoch 10)
0.80 (in epoch 10)


coranholmes commented on July 17, 2024

Hi, I have the same issue. Could any of you share the code for the 10-crop preprocessing?

