
LICENSE python3 Build Status

Comma.ai's Calib-Challenge

Goal

The goal is to predict the direction of travel (in camera frame) from the provided dashcam video: yaw and pitch (fortunately, no roll).

Comma.ai's repo provides 10 videos. Every video is 1 min long at 20 fps.
5 videos are labelled with a 2D array describing the direction of travel at every frame of the video with a pitch and yaw angle in radians.
5 videos are unlabeled. It is your task to generate the labels for them.
The example labels are generated using a Neural Network, and the labels were confirmed with a SLAM algorithm.
You can estimate the focal length to be 910 pixels.
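Given that focal length, mapping an image point (say, the estimated vanishing point of travel) to yaw and pitch is just a pinhole-model arctangent. A minimal sketch, assuming a standard pinhole camera with the principal point at the image center; the function name and the example frame size are my own, not from the repo:

```python
import numpy as np

def pixel_to_angles(x, y, cx, cy, focal=910.0):
    """Convert a pixel location (e.g. the estimated point the car is
    driving toward) into yaw and pitch angles in the camera frame.

    x, y   -- pixel coordinates of the direction-of-travel point
    cx, cy -- principal point (assumed to be the image center)
    focal  -- focal length in pixels (910 per the challenge README)
    """
    yaw = np.arctan2(x - cx, focal)    # positive yaw: point right of center
    pitch = np.arctan2(cy - y, focal)  # positive pitch: point above center
    return yaw, pitch

# a hypothetical 1164x874 frame: the exact image center maps to (0, 0)
yaw, pitch = pixel_to_angles(582, 437, 582, 437)
print(yaw, pitch)  # 0.0 0.0
```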


Extending the goals: making myself a functional device that can make my car drive itself.
What are the things I have to consider when writing the code? (Still thinking.)
There are a lot of classes, so I have to think of some clever way to get that number down.

Evaluation

They will evaluate our mean squared error against their ground-truth labels. Errors for frames where the car speed is less than 4 m/s will be ignored; those frames are also labelled as NaN in the example labels.

Comma.ai's repo includes an eval script that gives an error score (lower is better). You can use it to test your solution against the labelled examples; they will use the same script to evaluate your submission.
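The masking logic is easy to reproduce locally. A sketch of the scoring idea, assuming comma's script does plain MSE over [pitch, yaw] rows with the NaN (slow-speed) frames dropped; their actual script may normalize differently:

```python
import numpy as np

def masked_mse(pred, gt):
    """Mean squared error over [pitch, yaw] pairs, skipping frames where
    the ground truth is NaN (car slower than 4 m/s)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mask = ~np.isnan(gt).any(axis=1)  # keep only rows with valid labels
    return float(np.mean((pred[mask] - gt[mask]) ** 2))

gt = np.array([[0.01, 0.02], [np.nan, np.nan], [0.00, 0.01]])
pred = np.array([[0.01, 0.02], [0.50, 0.50], [0.00, 0.01]])
print(masked_mse(pred, gt))  # 0.0 -- the NaN row is ignored
```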

Architecture

I am thinking of using some kind of optical-flow model; rather than doing image stabilization or the like, I'll make it yield the yaw and pitch of the moving vehicle.

Adding details about the architecture soon!

1. FlowNetCorr

I'm gonna keep it short, sweet, and to the point.
The architecture was taken from this research paper. It's ConvNets again!! Predicting things like optical flow is not easy, and you surely cannot do it with a single input image.

A straightforward step is to create two separate, yet identical processing streams for the two adjacent frames and to combine them at a later stage (after 3 convs in this case).

In the research paper, to combine the outputs of the two conv streams, they used a "CORRelation layer", but I don't think it makes a lot of difference.
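For reference, the correlation layer boils down to dot products between frame-1 feature vectors and displaced frame-2 feature vectors within a search window. A naive NumPy sketch of just the math (the paper's version is a batched, strided GPU op, not a Python loop):

```python
import numpy as np

def correlation(f1, f2, max_disp=3):
    """Naive correlation layer in the spirit of FlowNetCorr.

    f1, f2 -- feature maps of shape (c, h, w) from the two conv streams.
    Returns ((2*max_disp+1)**2, h, w): one response map per displacement.
    """
    c, h, w = f1.shape
    f2p = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    maps = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = f2p[:, dy:dy + h, dx:dx + w]
            # dot product over channels at every spatial position
            maps.append((f1 * shifted).sum(axis=0) / c)
    return np.stack(maps)

feat = np.random.randn(16, 8, 8)
print(correlation(feat, feat).shape)  # (49, 8, 8)
```

With a 3-pixel search radius you get 49 displacement channels, which is why the correlation output is so much wider than the input feature map.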



After a bunch of conv layers, it goes through a refinement stage; the output of the above architecture is the input to the refinement layer!



This pretty much summarizes the architecture. At the end, rather than implementing the last layer, I pass the feature matrix through a Linear layer and predict yaw and pitch with a ONE-HOT-vector kind of thing. If you have a better idea for the ONE-HOT vector alternative, just let me know!!
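One way that "ONE HOT kinda thing" could look: predict each angle as a distribution over discrete bins and take the expectation to get a continuous value back. This is a sketch of the idea only; the class name, bin count, and angle range below are my assumptions, not values from this repo:

```python
import torch
import torch.nn as nn

class AngleHead(nn.Module):
    """Hypothetical head: flatten flow features, predict yaw and pitch as
    softmax distributions over angle bins (a soft one-hot), then take the
    expectation to recover continuous angles in radians."""

    def __init__(self, in_features, n_bins=101, max_angle=0.1):
        super().__init__()
        self.fc = nn.Linear(in_features, 2 * n_bins)  # yaw bins + pitch bins
        self.register_buffer(
            "bins", torch.linspace(-max_angle, max_angle, n_bins))
        self.n_bins = n_bins

    def forward(self, x):
        logits = self.fc(x.flatten(1)).view(-1, 2, self.n_bins)
        probs = logits.softmax(dim=-1)
        return (probs * self.bins).sum(dim=-1)  # (B, 2): yaw, pitch

head = AngleHead(in_features=128)
out = head(torch.randn(4, 128))
print(out.shape)  # torch.Size([4, 2])
```

The expectation keeps the output differentiable and bounded to the bin range, which a hard argmax over a one-hot target would not.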

2. Global Motion Aggregation

3. MaskFlowNet --> no implementation

4. FlowNet2.0 --> no implementation

Navigation

Labelled dataset [by comma]
Unlabeled test dataset [by comma]
Eval script [by comma]
Models and training script
Setup
Pretrained weights
What the user sees (software)
What the user sees (webpage)
Segmentation

ToDo

  • Visualizing the data
  • MaskFlowNet
  • FlowNet corr (not as good as I thought)
  • Gma
  • Training the model (on Azure because I have a .edu email :) ). PS: Azure is useless!
  • SLAM
  • Write utility functions (done for FlowNet, working on GMA)
  • Build and Deploy with QT5 in the pedal repo.
  • Update README
  • Segment comma 10k dataset
  • Pilotnet
  • Implement the research papers from George and do some visualization on them; use future images for predicted lines (yaw and pitch)
  • Depth_net

I have to deploy it, retrain it on new data, and keep on doing that!
For now I'm not doing it in real time; with time I will make this thing work with CARLA.

I'm too lazy to complete the code. If there is anyone who wants to complete it for me, go on!!

How to tinker/use the code?

  • you can monitor the training process with tensorboard:
tensorboard --port=PORT --logdir=pretrained
  • the pretrained model is a little too heavy for GitHub, so it's uploaded to Google Drive:
https://drive.google.com/file/d/1kxpD8DmL-CQIB02zxah_-BIoM6spcBJF/view?usp=sharing
  • training script for flownetCorr is here
  - python train_flownetcorr --help (for all the arguments and folder locations)
  - the training loop is in the 'train' function.
  - the validation loop is in the 'validation' function.
  - there are relevant comments before every piece of code, so it is not that tough to identify and change stuff.
  - it uses MSE loss, i.e. the mean of the squared errors over the batch.

  • FlownetCorr model is here
  -

Adding soon, be patient!

An example of how open source is changing the world!!

comma ai

calib-challenge's People

Contributors

shauray8

