samuelyu2002 / imvisible

ImVisible: Pedestrian Traffic Light (PTL) Dataset, Lightweight CNN (LytNet), and Mobile Application for the Visually Impaired (CAIP '19, ICCV Workshops '19)

License: MIT License

Python 60.61% Swift 39.39%

python swift traffic-light image-classification regression deep-learning neural-network image-dataset zebra-crossings

imvisible's Issues

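Question about the dataset

Hello, I downloaded the 768x576 and 876x657 datasets you uploaded, but I don't see any label files in them. Where can I download the label files? > <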

Expected input[32, 768, 3, 576] to have 3 channels, but got 768 channels instead.

Hi, I set the correct paths to the annotations and to your datasets, ran 'training.py' locally, and got the following error:

RuntimeError: Given groups=1, weight of size 32 3 3 3, expected input[32, 768, 3, 576] to have 3 channels, but got 768 channels instead

Also a warning:

UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
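
The shape in the error, [32, 768, 3, 576], suggests that the batch reaches the first convolution with its axes out of order: a weight of size 32 3 3 3 expects the 3 channels in dimension 1, i.e. a (batch, channels, height, width) layout. Below is a minimal sketch of the reordering. It assumes the four axes are (batch, width, channels, height), which the report does not confirm, and it uses NumPy as a stand-in for the tensor (torch.Tensor.permute accepts the same axis order). Where the axes get shuffled in the actual data pipeline is not visible from the report.

```python
import numpy as np

# Dummy batch with the shape from the error message.
# Assumed axis meaning (not confirmed by the report): (batch, width, channels, height).
batch = np.zeros((32, 768, 3, 576), dtype=np.float32)

# Reorder to (batch, channels, height, width), the layout a 2D convolution expects.
nchw = batch.transpose(0, 2, 3, 1)

print(nchw.shape)
```

The warning is separate from the crash: as its text says, nn.init.xavier_normal_ is the current name of the deprecated nn.init.xavier_normal.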

LYTNetV2 not working, while LYTNet works

With the same files, training.py crashes only when it uses LYTNetV2, with the following error:

RuntimeError: Given input size: (960x9x12). Calculated output size: (960x0x1). Output size is too small
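
The message says that a pooling layer applied to a 960x9x12 feature map would produce a 960x0x1 output, which means the pooling window is larger than the 9-element dimension. The arithmetic can be checked with the usual output-size formula for pooling (floor mode, no dilation). In the sketch below, the function name pool_output_size and the kernel size of 10 are assumptions for illustration. A square window of 10 happens to reproduce the numbers in the error, but the actual kernel in LYTNetV2 is not stated in the report.

```python
def pool_output_size(size, kernel, stride=None, padding=0):
    """Output length of one spatial dimension after pooling (floor mode, no dilation)."""
    if stride is None:
        stride = kernel  # pooling layers commonly default the stride to the kernel size
    return (size + 2 * padding - kernel) // stride + 1


# Feature map from the error message: 9 x 12.
# kernel=10 is a hypothetical value, not read from the LYTNetV2 source.
height_out = pool_output_size(9, 10)
width_out = pool_output_size(12, 10)
print(height_out, width_out)
```

If the feature map is smaller than the model expects, one thing to check is whether the input images have the resolution the model was written for.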

Clarification on Labeling Resolution for Coordinate Values

Thanks for the dataset!
I have noticed that the values for x1, y1, x2, y2 are not normalized.
I would like to use these labels with smaller image resolutions for my custom model.

Therefore, I would appreciate it if you could specify the image resolution that was used when labeling, so I can remap the coordinates.

Once again, thank you for your work!

@samuelyu2002
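
Once the labeling resolution is known, the remapping itself is a per-axis scaling. Below is a minimal sketch. The function name rescale_points and the sizes in the example are hypothetical, and the labeling resolution is left as a parameter because the thread does not state it.

```python
def rescale_points(coords, src_size, dst_size):
    """Scale (x1, y1, x2, y2) from src_size to dst_size.

    Both sizes are (width, height) tuples. src_size is the resolution the
    labels were made at, which is an unknown here and must be supplied.
    """
    x1, y1, x2, y2 = coords
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return (
        x1 * dst_w / src_w,
        y1 * dst_h / src_h,
        x2 * dst_w / src_w,
        y2 * dst_h / src_h,
    )


# Hypothetical example: labels made at 4000x3000, model input at 400x300.
scaled = rescale_points((1000, 2000, 3000, 1500), (4000, 3000), (400, 300))
print(scaled)
```

Dividing each coordinate by the labeling width or height instead would give normalized values in [0, 1] that can be reused at any resolution.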

What is the input of the model?

import torch
import torch.nn as nn
from LYTNet import LYTNet
from LYTNetV2 import LYTNetV2

from torch.utils.data import DataLoader
from dataset import TrafficLightDataset

MODEL_PATH = './LytNetV1_weights'
device = torch.device('cpu')
model=LYTNet()
model.load_state_dict(torch.load(MODEL_PATH,map_location=device))
model.eval()

test_file_loc = './traffic/testing_file.csv'
test_image_directory = './traffic/PTL_Dataset_768x576'

import numpy as np
from PIL import Image
size=(768,576)
im = Image.open('./traffic/PTL_Dataset_768x576/john_IMG_0671.jpg' )
#im = pilimg.open('./traffic/PTL_Dataset_768x576/heon_IMG_0776.jpg' )

im=im.resize(size)
im.show()

pix = np.array(im)
pix=torch.Tensor(pix).type(torch.FloatTensor)
#print(pix.shape)

pix=pix.unsqueeze(0)
pix=pix.view([1,-1,576,768])
#print(pix.shape)

pred_classes, pred_direc = model(pix)
_, predicted = torch.max(pred_classes, 1)
print(predicted)

It runs and the output is "tensor([4])".
But when I feed it green-light images, it still returns "tensor([4])" for almost every one of them.
I think the problem is in how I prepare the input.
Please help.
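
One likely culprit in the script above is the line pix.view([1,-1,576,768]). np.array(im) returns the image in (height, width, channels) order, and view only relabels the axes without moving any data, so the three color channels end up interleaved inside each "channel" plane. Reordering the axes with pix.permute(0, 3, 1, 2) instead would give a true (1, 3, 576, 768) layout. The sketch below shows the difference on a tiny image, using NumPy's reshape and transpose as stand-ins for view and permute. Whether this alone fixes the predictions is not confirmed: the pixel scaling also has to match whatever preprocessing was used in training, which this thread does not show.

```python
import numpy as np

# Tiny 2x2 RGB image in (height, width, channels) order, the layout
# np.array() returns for a PIL image. Every pixel is pure red (255, 0, 0).
hwc = np.zeros((2, 2, 3), dtype=np.float32)
hwc[..., 0] = 255.0

# reshape (like torch's view) keeps the memory order and only relabels the
# axes, so values from different color channels get mixed into one plane.
wrong = hwc.reshape(3, 2, 2)

# transpose (like torch's permute) really moves the channel axis to the front.
right = hwc.transpose(2, 0, 1)

print(wrong[0])  # a mix of 255s and 0s, not a clean red plane
print(right[0])  # the red plane: every value is 255
```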

Where can I get the dataset?

Hello,
I have been searching for this kind of application for a long while, and the implementation is very good! However, the Pedestrian Traffic Light (PTL) dataset you mention in the README is not in the repository, and I can't find it anywhere else on the internet. I'd like to train some other models with it to see how they perform.
Can you please tell me how to get the images of the dataset? Thanks!

From zebra crossing line to traffic light position

Hello,

We're working on changing the labels so that the model predicts the position of the traffic light instead of the points of the zebra crossing. We have finished labeling and have verified that the new coordinates are good. The new (x1,y1,x2,y2) refer to the upper-left corner of the traffic light box (p1) and the bottom-right corner of the traffic light box (p2), so that together they define a bounding box.

Problem with LytNet: the per-class precisions always remain [0.30, 0.29, 0, 0, 0], even after the 600th epoch. Do you have any suggestions for how to set up the coordinates so that LytNet predicts the position of the traffic light?

Thank you very much in advance

How to use your model for real time image

Hi guys! First of all, amazing work! I am currently working on a project that uses a Jetson Nano and a CSI camera. It's a detector meant to help visually impaired people. I am trying your model, but I am having trouble getting it to read real-time footage. Can you tell me what to search for or what to learn? Thanks!

Is there a requirements.txt?

Hi. I have some issues running training with the latest version of PyTorch.
I have no time to fix them, so I just want to downgrade, but I don't know which version I should use.
