
Comments (15)

neil454 commented on July 18, 2024

from deep-motion.

robertguetzkow commented on July 18, 2024

@neil454

Thanks for the reply! The comment doesn't show up because I realized my mistake directly after posting and deleted it. Sorry about that.

In case somebody else has this problem: TensorFlow used a different dimension ordering in version 0.10, which was equivalent to Theano's. So even if you are using TensorFlow v0.10, you need to set image_dim_ordering to 'th' in Keras' configuration file (~/.keras/keras.json).
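For reference, a keras.json along these lines is what old Keras 1.x installs expect; only image_dim_ordering matters for this fix, and the other keys shown are the usual defaults (your file may contain slightly different values):

```json
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
```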


neil454 commented on July 18, 2024

I have not provided the actual training data needed to train the model (which is what X_val_KITTI.npy is). You can download the KITTI dataset with raw_data_downloader.sh. Training the network should not be too hard as long as you understand how to pre-process the input data (see train.py for details).

For running fps_convert.py, all you need to do is change vid_dir and vid_fn in main(). I could probably make this cleaner with command-line arguments, but it shouldn't be that hard to edit.
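The command-line-argument version alluded to here could look something like the minimal sketch below; the flag names and defaults are my own assumptions, not part of fps_convert.py:

```python
import argparse

def parse_args(argv=None):
    # Hypothetical CLI wrapper replacing the hand-edited vid_dir/vid_fn
    # variables in fps_convert.py's main().
    parser = argparse.ArgumentParser(
        description="Double the frame rate of a video with deep-motion")
    parser.add_argument("--vid-dir", default=".",
                        help="directory containing the input video")
    parser.add_argument("--vid-fn", required=True,
                        help="input video filename, e.g. vid.mp4")
    return parser.parse_args(argv)

args = parse_args(["--vid-dir", "videos", "--vid-fn", "clip.mp4"])
print(args.vid_dir, args.vid_fn)  # videos clip.mp4
```

main() would then read os.path.join(args.vid_dir, args.vid_fn) instead of the hard-coded names.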


GunpowderGuy commented on July 18, 2024

I had already modified my post; there were more instructions, but I had ignored them.


GunpowderGuy commented on July 18, 2024

Okay, so my first complaint is still up in the air (I will try to use this on another computer to see if there really are additional dependencies), the second one was my fault because I didn't follow all the instructions, and the third one really was a (minor) issue.
But after all this, how to use the neural net for images (not videos) is still not clear to me, and I still have this problem with fps_convert.py:

Using TensorFlow backend.
warning: GStreamer: unable to query duration of stream (/builddir/build/BUILD/opencv-3.1.0/modules/videoio/src/cap_gstreamer.cpp:832)
Traceback (most recent call last):
  File "fps_convert.py", line 89, in <module>
    main()
  File "fps_convert.py", line 77, in main
    vid_arr, fps = load_vid(os.path.join(vid_dir, vid_fn))
  File "fps_convert.py", line 31, in load_vid
    vid_arr = np.zeros(shape=(num_frames, 128, 384, 3), dtype="uint8")
ValueError: negative dimensions are not allowed


neil454 commented on July 18, 2024

@diegor8 Hmm, for some reason num_frames is a negative value. What type of video are you loading? Also, you may need to install OpenCV with FFMPEG support.
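The failing line computes num_frames from OpenCV's frame-count property, which comes back as 0 or a negative value when the container cannot be decoded (for example, when OpenCV was built without FFMPEG). A small guard like the hypothetical helper below (not part of fps_convert.py) would turn the confusing shape error into a clear message:

```python
def safe_frame_count(raw_count):
    # raw_count would come from cap.get(cv2.CAP_PROP_FRAME_COUNT);
    # a non-positive value usually means the video backend is missing.
    count = int(raw_count)
    if count <= 0:
        raise ValueError("could not determine frame count; "
                         "is OpenCV built with FFMPEG support?")
    return count

print(safe_frame_count(42))  # 42
```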


GunpowderGuy commented on July 18, 2024

I have both OpenCV and FFmpeg installed. Or do you mean a special build of OpenCV?


neil454 commented on July 18, 2024

I mean that before you compile OpenCV, when you configure the CMake build, it should say something like this:

Video I/O:
--     FFMPEG:         YES

If you installed OpenCV through Anaconda or some other method that didn't require compiling from scratch, it probably doesn't come with FFMPEG support.

Anyway, it is pretty easy to test this; just try loading an .mp4 file with OpenCV:

import cv2

cap = cv2.VideoCapture('vid.mp4')
print(cap.isOpened())  # False if the video could not be opened (e.g. no FFMPEG)


GunpowderGuy commented on July 18, 2024

Okay, I am already in the process of doing that (for the record, here is a good tutorial: http://docs.opencv.org/3.1.0/d7/d9f/tutorial_linux_install.html). The only things left would be clearer instructions for using this with pictures, and for using the pre-trained train.py (or is the only reason it fails for me that even using the pre-trained weights requires the KITTI dataset?).


neil454 commented on July 18, 2024

The easiest way to understand how to use this with images is to look at line 56 of fps_convert.py:

pred = model.predict(np.expand_dims(np.transpose(np.concatenate((vid_arr[i-1], vid_arr[i]), axis=2)/255., (2, 0, 1)), axis=0))

Think of vid_arr[i-1] and vid_arr[i] as two consecutive frames/images of shape (128, 384, 3) each.

First, we use np.concatenate to stack the two images along their channel dimension, producing an array of shape (128, 384, 6).

Then, the only pre-processing we do is divide all the pixel values by 255, to put them in the range 0.0 to 1.0.

Finally, to get the input ready for the network, we have to move the channel axis first for Theano's dimension ordering, so we use np.transpose(X, (2, 0, 1)) to reshape the array to (6, 128, 384), and np.expand_dims to add an extra dimension at axis=0, giving a shape of (1, 6, 128, 384); our "batch size" is 1 for this inference step.

Now the input is ready to pass through the network with model.predict, which gives an output of shape (1, 3, 128, 384) in pred.

To post-process the output, we just reverse the steps above, which I do on line 57:

new_vid_arr.append((np.transpose(pred[0], (1, 2, 0))*255).astype("uint8"))
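The whole pre-/post-processing round trip described above can be sketched with dummy arrays (zero-filled stand-ins for real frames and a real network output):

```python
import numpy as np

# Two dummy consecutive frames, standing in for vid_arr[i-1] and vid_arr[i].
frame_a = np.zeros((128, 384, 3), dtype="uint8")
frame_b = np.zeros((128, 384, 3), dtype="uint8")

# Pre-processing: stack along the channel axis, scale to [0, 1],
# move channels first, then add a batch dimension.
x = np.concatenate((frame_a, frame_b), axis=2) / 255.0  # (128, 384, 6)
x = np.transpose(x, (2, 0, 1))                          # (6, 128, 384)
x = np.expand_dims(x, axis=0)                           # (1, 6, 128, 384)
print(x.shape)

# Post-processing: with a dummy prediction of shape (1, 3, 128, 384),
# reverse the steps to recover a uint8 image of shape (128, 384, 3).
pred = np.zeros((1, 3, 128, 384))
frame_mid = (np.transpose(pred[0], (1, 2, 0)) * 255).astype("uint8")
print(frame_mid.shape)
```

For a single pair of still images, you would load them (resized to 128x384), use them in place of frame_a and frame_b, and pass x to model.predict.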


GunpowderGuy commented on July 18, 2024

Having to analyze a piece of software to know how it works is okay, but from the last issue (Reproduce results) I thought that you had already added more extensive documentation.


GunpowderGuy commented on July 18, 2024

@neil454 What happened with the documentation mentioned in the last issue? Am I misunderstanding what you were trying to say? May I compile it myself from all the advice you are giving and the solutions I have come up with to the problems I was having?


robertguetzkow commented on July 18, 2024

@diegor8

The software works fine when the required dependencies are installed. I think you can close this issue.


GunpowderGuy commented on July 18, 2024

I will, but I will also be writing a guide and posting the compiled OpenCV binaries to help new people trying to use deep-motion.


Nandan-M-Hegde commented on July 18, 2024

How much time did it take to train, and what hardware was used?

