Comments (15)
from deep-motion.
Thanks for the reply! The comment doesn't show up because I realized my mistake right after posting and deleted it. Sorry about that.
In case somebody else has this problem: TensorFlow used a different dimension ordering in version 0.10, which was equivalent to Theano's. So even if you are using TensorFlow v0.10, you need to set image_dim_ordering to 'th' in Keras' configuration (~/.keras/keras.json).
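For reference, a minimal ~/.keras/keras.json with the Theano dim ordering would look something like this (the fields other than image_dim_ordering are shown with typical defaults for Keras 1.x, not values taken from this repo):

```json
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
```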
from deep-motion.
I have not provided the actual training data used to train the model (which is what X_val_KITTI.npy is). You can download the KITTI dataset with raw_data_downloader.sh. Training the network should not be too hard as long as you understand how to pre-process the input data (look at train.py for details).
For running fps_convert.py, all you need to do is change vid_dir and vid_fn in main(). I could probably make this cleaner with command-line args, but this shouldn't be that hard.
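For example, the edit in main() would look something like this (the directory and filename here are placeholders, not values from the repo):

```python
import os

# placeholder values; point these at your own video
vid_dir = "/home/me/videos"
vid_fn = "input.mp4"

# the full path that load_vid() will be asked to open
vid_path = os.path.join(vid_dir, vid_fn)
```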
from deep-motion.
I had already modified my post; there were more instructions, but I had ignored them.
from deep-motion.
Okay, so my first complaint is still up in the air (I will try to use this on another computer, to see if there truly are additional dependencies), the second one was my fault, because I didn't follow all the instructions, and the third one was really a (minor) issue.
But after all this, how to use the neural net for images (not videos) is still not clear to me, and I still have this problem with fps_convert.py:
Using TensorFlow backend.
warning: GStreamer: unable to query duration of stream (/builddir/build/BUILD/opencv-3.1.0/modules/videoio/src/cap_gstreamer.cpp:832)
Traceback (most recent call last):
  File "fps_convert.py", line 89, in <module>
    main()
  File "fps_convert.py", line 77, in main
    vid_arr, fps = load_vid(os.path.join(vid_dir, vid_fn))
  File "fps_convert.py", line 31, in load_vid
    vid_arr = np.zeros(shape=(num_frames, 128, 384, 3), dtype="uint8")
ValueError: negative dimensions are not allowed
from deep-motion.
@diegor8 Hmm, for some reason num_frames is a negative value. What type of video are you loading? Also, you may need to install OpenCV with FFMPEG support.
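A quick way to surface this failure mode earlier is to validate the frame count right after opening the capture, instead of letting np.zeros fail later. A minimal sketch (check_frame_count is a hypothetical helper, not part of the repo):

```python
def check_frame_count(num_frames):
    """Fail fast with a useful message instead of np.zeros raising later."""
    # OpenCV returns 0 or a negative count when it cannot decode the
    # container, which usually means the build lacks FFMPEG support
    if num_frames <= 0:
        raise ValueError(
            "Could not read frame count (got %d); your OpenCV build "
            "may lack FFMPEG support" % num_frames
        )
    return int(num_frames)

# inside load_vid, this would wrap the frame-count query, e.g.:
# num_frames = check_frame_count(cap.get(cv2.CAP_PROP_FRAME_COUNT))
```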
from deep-motion.
I have both OpenCV and FFmpeg installed, or do you mean a special version of OpenCV?
from deep-motion.
I mean before you compile OpenCV, when you configure the CMake build, it should say something like this...
Video I/O:
-- FFMPEG: YES
If you installed OpenCV through Anaconda or some other method that didn't require compiling from scratch, it probably doesn't come with FFMPEG support.
Anyways, it is pretty easy to test this: just try loading an .mp4 file with OpenCV...
import cv2
cap = cv2.VideoCapture('vid.mp4')
print(cap.isOpened())  # False if your build can't decode the file
from deep-motion.
Okay, I am already in the process of doing it (for the record, here is a good tutorial: http://docs.opencv.org/3.1.0/d7/d9f/tutorial_linux_install.html). The only things left would be clearer instructions for using this with pictures, and for using the pre-trained train.py (or is the only reason it fails for me that even using the pre-trained weights requires the KITTI dataset?)
from deep-motion.
The easiest way to understand how to use this with images is to look at line 56 of fps_convert.py:
pred = model.predict(np.expand_dims(np.transpose(np.concatenate((vid_arr[i-1], vid_arr[i]), axis=2)/255., (2, 0, 1)), axis=0))
Think of vid_arr[i-1] and vid_arr[i] as two consecutive frames/images of shape (128, 384, 3) each.
First, we use np.concatenate to stack the images on top of each other along their RGB channel dimension, making an array of shape (128, 384, 6).
Then, the only other pre-processing we do is divide all the pixel values by 255, to put them in the range 0.0-1.0.
Finally, to get the input ready for the network, we have to move the color channels to the first axis for Theano's dim ordering, so we use np.transpose(X, (2, 0, 1)) to reshape the array to (6, 128, 384), and np.expand_dims to add an extra dimension at axis=0, giving a shape of (1, 6, 128, 384); in other words, our "batch size" is 1 for this inference step.
Now our input is ready to pass through the network with model.predict, which puts our output of shape (1, 3, 128, 384) into pred.
To post-process our output, we just reverse the steps above, which I do in line 57:
new_vid_arr.append((np.transpose(pred[0], (1, 2, 0))*255).astype("uint8"))
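Putting those steps together, the whole round trip can be sketched in plain NumPy. Here model.predict is stood in for by a zero array (and the frames are random stand-ins), so only the shapes and dtypes are meaningful:

```python
import numpy as np

# two consecutive frames, as in fps_convert.py (random stand-in data)
frame_a = np.random.randint(0, 256, size=(128, 384, 3), dtype="uint8")
frame_b = np.random.randint(0, 256, size=(128, 384, 3), dtype="uint8")

# pre-processing: stack on the channel axis, scale to 0.0-1.0,
# move channels first for Theano dim ordering, add a batch dimension
x = np.concatenate((frame_a, frame_b), axis=2) / 255.0  # (128, 384, 6)
x = np.transpose(x, (2, 0, 1))                          # (6, 128, 384)
x = np.expand_dims(x, axis=0)                           # (1, 6, 128, 384)

# pred = model.predict(x) would return shape (1, 3, 128, 384);
# simulate that here so the post-processing step is runnable
pred = np.zeros((1, 3, 128, 384), dtype="float32")

# post-processing: reverse the transpose, rescale, cast back to uint8
out = (np.transpose(pred[0], (1, 2, 0)) * 255).astype("uint8")  # (128, 384, 3)
```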
from deep-motion.
Having to analyze a piece of software to know how it works is okay, but from the last issue (Reproduce results) I thought that you had already added more extensive documentation.
from deep-motion.
@neil454 What happened with the documentation mentioned in the last issue? Am I misunderstanding what you were trying to say? May I compile it myself from all the advice you are giving and the solutions I have come up with to the problems I was having?
from deep-motion.
@diegor8
The software works fine when the required dependencies are installed. I think you can close this issue.
from deep-motion.
I will, but I will be writing up a guide and posting the compiled OpenCV binaries, to help new people trying to use deep-motion.
from deep-motion.
How much time did it take to train, and what hardware components were used?
from deep-motion.