Comments (10)
Yes, the Center Branch uses the focal loss and can handle multi-label classification.
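For readers new to center-point detectors, here is a minimal sketch of the CenterNet-style focal loss commonly used on center heatmaps (the shapes and the Gaussian-splatted ground truth are illustrative assumptions, not the exact code of this repo). Because every class channel gets an independent per-pixel binary term, nothing forces the channels to be mutually exclusive, which is why multi-label classification works:

import torch

def center_focal_loss(pred, gt, alpha=2, beta=4):
    # pred, gt: (B, C, H, W); gt is a Gaussian-splatted heatmap in [0, 1].
    # Each class channel is scored independently (multi-label friendly).
    pred = pred.clamp(1e-6, 1 - 1e-6)            # avoid log(0)
    pos_mask = gt.eq(1).float()                  # exact center pixels
    neg_mask = gt.lt(1).float()
    pos_loss = torch.log(pred) * (1 - pred) ** alpha * pos_mask
    neg_loss = (torch.log(1 - pred) * pred ** alpha
                * (1 - gt) ** beta * neg_mask)   # down-weight near-centers
    num_pos = pos_mask.sum().clamp(min=1)        # normalize by #positives
    return -(pos_loss.sum() + neg_loss.sum()) / num_pos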
from moc-detector.
I have some questions related to flip_test mode.
- In "normal_moc_det.py"/preprocess(), line 62, why do you invert the red channel of "flip_data"? What does this mean?
  temp[:, :, 2] = 255 - temp[:, :, 2]
- In the "normal_moc_det.py"/process() function, why don't you take the average of rgb_mov and rgb_mov_f (as well as flow_mov and flow_mov_f) the way you do for the heatmap and wh outputs (lines 88-89, 100-101)?
- Are rgb_output[1]['mov'] and flow_output[1]['mov'] computed for nothing?
It's the same for stream_moc_det.py. I hope you can explain; thank you for your reply.
from moc-detector.
- This is a specific channel for Brox flow, rather than the red channel.
- I tried that, but it was of no use.
- Yes.
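To make the answers concrete, here is a hypothetical sketch of the flip-test fusion being discussed (the dict keys 'hm', 'wh', 'mov' and the function itself are illustrative assumptions, not the repo's code):

import torch

def fuse_flip_outputs(output, output_f):
    # output, output_f: dicts of (B, C, H, W) tensors from the normal
    # and horizontally flipped passes.
    # Heatmap scores and box sizes are invariant to a horizontal flip,
    # so the flipped maps can simply be un-flipped and averaged.
    hm = (output['hm'] + torch.flip(output_f['hm'], dims=[3])) / 2
    wh = (output['wh'] + torch.flip(output_f['wh'], dims=[3])) / 2
    # Movement offsets are not flip-invariant: after un-flipping, their
    # x-components would also need a sign change before averaging, which
    # may be why the flipped 'mov' output ends up unused.
    mov = output['mov']
    return {'hm': hm, 'wh': wh, 'mov': mov}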
from moc-detector.
Thank you for your response. I have another question. In fact, the format of ground-truth tubes in "UCF101v2-GT.pkl" is as follows:
gttubes = {
    'parentfolder/videoname': {class: [
        array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]]),
        ...
        array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]])]},
    ...
    'parentfolder/videoname': {class: [
        array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]]),
        ...
        array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]])]}
}
So, are the datasets single-object? Does each video contain only one action? And is the class/identification of a tube simply the class of the video (i.e., the index of the parent folder's name)?
In theory, your model does multi-object tracking, but is it trained on single-object data?
What about the general problem where a video contains multiple objects, or multiple actions of different types? For example:
- a video with two people jumping -> we need to identify, or separate, the tube boxes of each person
- a video with one person jumping and one walking -> the walking needs to be classified as normal
In this case, the class/identification exists only for tubes, not for videos? And _gttubes[videoname] must be a dictionary with multiple entries, as follows?
gttubes = {
    'parentfolder/videoname': {
        class: [
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]]),
            ...
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]])],
        class: [
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]]),
            ...
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]])],
        ...},
    ...
    'parentfolder/videoname': {
        class: [
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]]),
            ...
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]])],
        class: [
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]]),
            ...
            array([[frame,x1,y1,x2,y2], ..., [frame,x1,y1,x2,y2]])],
        ...}
}
Many thanks,
from moc-detector.
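As an aside for anyone following along, the structure above can be inspected directly. A minimal sketch, assuming the pickle was written by Python 2 (hence the latin1 encoding) and that the file is in the working directory:

import pickle

with open('UCF101v2-GT.pkl', 'rb') as f:
    gt = pickle.load(f, encoding='latin1')  # pkl written by Python 2

gttubes = gt['gttubes']
for video, class_to_tubes in sorted(gttubes.items())[:3]:
    for cls, tubes in class_to_tubes.items():
        print(video, '| class', cls, '|', len(tubes), 'tube(s)')
        for tube in tubes:
            # tube is an (nframes, 5) array of [frame, x1, y1, x2, y2]
            print('  frames', int(tube[0, 0]), '->', int(tube[-1, 0]))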
Oh, so flow images are represented in HSV format, where channel 0 encodes the direction and channel 2 encodes the magnitude of the movement? So when we flip the images, we also have to flip the direction of the object's movement. Thanks for the information; I had forgotten that.
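In that case, flipping a flow image takes two steps, sketched below (this assumes channel 2 encodes the horizontal motion component, which matches the line quoted earlier; the exact Brox-flow channel semantics depend on how the JPEGs were generated):

import numpy as np

def flip_flow_image(flow_img):
    # flow_img: (H, W, 3) uint8 array decoded from a flow JPEG.
    flipped = flow_img[:, ::-1, :].copy()      # mirror left-right
    flipped[:, :, 2] = 255 - flipped[:, :, 2]  # invert the motion channel
    return flipped

# Example with a dummy 2x2 flow image:
dummy = np.zeros((2, 2, 3), dtype=np.uint8)
print(flip_flow_image(dummy).shape)  # (2, 2, 3)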
from moc-detector.
UCF101-24 is a multi-object dataset, but JHMDB-21 is a single-object dataset (see our GIFs).
From my observation, both datasets are single-action, as you say.
I don't know the generalization performance for multiple actions. And indeed, the community needs a new large-scale, non-atomic, multi-action/multi-object action detection dataset.
from moc-detector.
I checked UCF101v2-GT.pkl and found that UCF101-24 is not only a single-action but also a single-object dataset. In every video, only one object is annotated with boxes throughout the video (even though the video may contain many objects). So, UCF101-24 is a single-object tracking dataset.
We have
len(self._gttubes[v]) = 1 for every video v in self._gttubes
The action tube can be interrupted, i.e., divided into several segments. For example:
'Basketball/v_Basketball_g18_c02': {0: [array([
[ 1., 161., 137., 222., 235.],
[ 2., 161., 137., 222., 235.],
[ 3., 161., 137., 222., 235.],
[ 4., 161., 137., 222., 235.],
[ 5., 161., 137., 222., 235.],
[ 6., 161., 137., 222., 235.],
[ 7., 161., 137., 222., 235.],
[ 8., 161., 137., 222., 235.],
[ 9., 161., 137., 222., 235.],
[ 10., 162., 137., 223., 235.],
[ 11., 162., 137., 223., 235.],
[ 12., 163., 137., 224., 235.],
[ 13., 163., 137., 224., 235.],
[ 14., 163., 137., 224., 235.],
[ 15., 163., 137., 224., 235.],
[ 16., 163., 137., 224., 235.],
[ 17., 163., 137., 224., 235.],
[ 18., 163., 137., 224., 235.],
[ 19., 163., 137., 224., 235.],
[ 20., 163., 137., 224., 235.]], dtype=float32), array([[ 72., 163., 146., 219., 238.],
[ 73., 163., 146., 219., 238.],
[ 74., 163., 146., 219., 238.],
[ 75., 163., 146., 219., 238.],
[ 76., 163., 146., 219., 238.],
[ 77., 163., 146., 219., 238.],
[ 78., 163., 146., 219., 238.],
[ 79., 163., 146., 219., 238.],
[ 80., 163., 146., 219., 238.],
[ 81., 163., 146., 219., 238.],
[ 82., 163., 146., 219., 238.],
[ 83., 163., 146., 219., 238.],
[ 84., 163., 146., 219., 238.],
[ 85., 163., 146., 219., 238.],
[ 86., 163., 146., 219., 238.],
[ 87., 163., 146., 219., 238.],
[ 88., 163., 146., 219., 238.],
[ 89., 163., 146., 219., 238.],
[ 90., 163., 146., 219., 238.],
[ 91., 163., 146., 219., 238.],
[ 92., 163., 146., 219., 238.],
[ 93., 163., 146., 219., 238.],
[ 94., 163., 146., 219., 238.],
[ 95., 163., 146., 219., 238.],
[ 96., 163., 146., 219., 238.],
[ 97., 163., 146., 219., 238.],
[ 98., 163., 146., 219., 238.],
[ 99., 163., 146., 219., 238.],
[100., 163., 146., 219., 238.],
[101., 163., 146., 219., 238.],
[102., 163., 146., 219., 238.]], dtype=float32)]}
There are two tube segments in the "Basketball/v_Basketball_g18_c02" video. However, the object in both tubes is the same.
So, UCF101-24 is a single-object tracking dataset.
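For anyone who wants to reproduce this check, a small sketch (same file name and keys as above; the printed ranges are exactly what the arrays show):

import pickle

with open('UCF101v2-GT.pkl', 'rb') as f:
    gt = pickle.load(f, encoding='latin1')

# Class 0 (Basketball) tubes for the video discussed above.
tubes = gt['gttubes']['Basketball/v_Basketball_g18_c02'][0]
for i, tube in enumerate(tubes):
    print('segment', i, ': frames', int(tube[0, 0]), '-', int(tube[-1, 0]))
# segment 0 : frames 1 - 20
# segment 1 : frames 72 - 102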
from moc-detector.
gttubes: a dictionary that contains the GT tubes for each video.
A gttube is a dictionary that associates, with each label index, a list of tubes.
A tube is a numpy array with nframes rows and 5 columns, each row in the format [frame, x1, y1, x2, y2].
len(self._gttubes[v]) = 1
indicates single-action rather than single-object.
Try checking len(self._gttubes[v][class_index]) instead.
For example, len(pkl['gttubes']['Fencing/v_Fencing_g04_c03'][6])
---> 4
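To make the distinction concrete, a sketch that separates the two counts (same file and keys as above; the Fencing example is the one quoted in this thread):

import pickle

with open('UCF101v2-GT.pkl', 'rb') as f:
    gt = pickle.load(f, encoding='latin1')

gttubes = gt['gttubes']
# len(gttubes[v]) counts action classes per video (single-action => 1),
# while len(gttubes[v][cls]) counts tubes, i.e. annotated actors.
multi_actor = {v: {cls: len(tubes) for cls, tubes in d.items()}
               for v, d in gttubes.items()
               if any(len(tubes) > 1 for tubes in d.values())}
print(len(multi_actor), 'videos have more than one tube for a class')
print(multi_actor.get('Fencing/v_Fencing_g04_c03'))  # {6: 4} per the thread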
from moc-detector.
Thank you very much for this example. I was wrong. I had drawn the boxes for Basketball/v_Basketball_g18_c02 and assumed it was the same for the other videos. Thanks again.
from moc-detector.
Hi, I just found this issue. Can the proposed method support multiple persons performing multiple actions in one frame, such as in the AVA dataset?
from moc-detector.