neuralnetwork-viterbi's People

Contributors

alexanderrichard

neuralnetwork-viterbi's Issues

MPII Cooking Dataset

Hi everyone,
I'm trying to use this decoder on the MPII Cooking Dataset.
This dataset is much bigger than the Breakfast dataset:

  • The longest video is 71K frames long.
  • The FPS is ~30.
  • The number of actions is 64.
  • The dimensionality of the provided features is 4000.

In order to make training work, I modified the parameters as follows (sketched below):

  • frame_sampling = 440 in train.py
  • buffered_frame_ratio = 10 in train.py
  • batch_size = 256 in train.py
  • n_old_frames = min(int(7000), self.buffer.n_frames()) in network.py
  • window_size = 50 in network.py
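
For reference, a rough sketch of these changes; the parameter names are taken from the list above, and the comments only reflect my understanding of what each value controls:

# train.py -- modified hyperparameters
frame_sampling = 440       # coarser frame grid for the Viterbi decoding during training
buffered_frame_ratio = 10  # controls how many frames end up in the frame buffer
batch_size = 256           # mini-batch size for the frame-wise network

# network.py -- modified values
n_old_frames = min(int(7000), self.buffer.n_frames())  # cap on replayed buffer frames per update
window_size = 50                                        # temporal window around each sampled frame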

After 2 days and 4 hours I was still waiting for the training to finish, so I decided to abort.

Now I'm planning to:

  1. Reduce the number of frames per action.
  2. Increase the window stride; currently the window stops at every frame (a rough sketch of points 1 and 2 follows this list).
  3. Reduce the feature dimensionality, perhaps by creating Fisher vectors.
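
A rough sketch of points 1 and 2, assuming the features are stored as (dim x n_frames) numpy arrays; the stride value is only illustrative:

import numpy as np

def subsample_features(features, stride=15):
    # Keep every `stride`-th frame to reduce the number of frames per video,
    # which amounts to increasing the window stride at the input level.
    # features: np.ndarray of shape (dim, n_frames)
    # returns:  np.ndarray of shape (dim, ceil(n_frames / stride))
    return features[:, ::stride]

Any frame-level ground truth used for evaluation would need the same subsampling so that predictions and labels stay aligned.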

I was also wondering whether this type of dataset fits the purpose of this decoder, or whether, since the actions are too long, I should find another way. Any ideas, suggestions, or criticism are really appreciated!

Results on 50 Salads

Hello Alex,

I have trained on 50 Salads split 1 for 3K iterations and the test accuracy is 0.386362. Does that look reasonable to you, or do I have to tune the parameters (assuming the repo's parameters are set for the Breakfast data)? Please advise. Thanks.

Breakfast: frame accuracy: 0.375708

Evaluate 252 video files...
frame accuracy: 0.375708

Hello, I downloaded the dataset you provided and ran your code, but I only get 0.375708 on the Breakfast dataset. By the way, is this segmentation accuracy or alignment accuracy?

Regarding Viterbi decoding

Hi Alex,
I have two more questions regarding the Viterbi decoding.

  1. For the auxiliary function Q(t,l,c,h), you distinguish two cases:
     (1) When frame t stays in the same label, you multiply the function by the frame score of that label, which is line 90 in utils/viterbi.py:

score = hyp.score + self.frame_score(frame_scores, t, label)

     (2) When frame t moves on to the next label, you multiply the function by the frame score of the new label, the length model score (the Poisson probability of the segment that just ended), and also the grammar score (which just checks whether the new label is a valid successor), as in line 98:

score = hyp.score + self.frame_score(frame_scores, t, label) + self.length_model.score(length, label) + self.grammar.score(context, new_label)

Therefore, the frame score here should be self.frame_score(frame_scores, t, new_label) instead of self.frame_score(frame_scores, t, label), right? Please forgive me if I have misunderstood the auxiliary function.

  2. In the traceback function, I guess you want to add all the missing frames (the remainder from sampling every 30 frames) to the last segment (which always has the SIL label 0), as in lines 127-128:

segments[0].length += n_frames - len(labels) # append length of missing frames
labels += [hyp.traceback.label] * (n_frames - len(labels)) # append labels for missing frames

However, you add the number of missing frames to the length of the first segment and then append the labels of the missing frames to the end of labels. In this case, the returned labels and segments do not correspond to each other in the first and last segment. (This actually has no big effect, as the segments are never used during training.) A sketch of both suggested corrections is given below.
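
To make the two points concrete, here is how I would change the two spots, reusing the variable names from the lines quoted above (this is only my suggestion, not necessarily the intended behaviour):

# (1) line 98: when a segment boundary is crossed, score frame t under the new label
score = hyp.score + self.frame_score(frame_scores, t, new_label) + self.length_model.score(length, label) + self.grammar.score(context, new_label)

# (2) lines 127-128: extend the last segment (always SIL), which also receives the labels of the missing frames
segments[-1].length += n_frames - len(labels)               # append length of missing frames to the last segment
labels += [hyp.traceback.label] * (n_frames - len(labels))  # append labels for missing frames (unchanged)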

Regards

Features for Hollywood Extended?

Hello,

Can you also provide the extracted features for the Hollywood Extended dataset as you did for the other two datasets?

Thank you.
