songqi-github / attanet
AttaNet for real-time semantic segmentation.
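Hello, I would like to know whether the 70.1% mIoU accuracy was obtained with an input size of 512 × 1024 or 1024 × 2048.
The paper says that for the speed test the image is first resized to 512 × 1024 and the output is then resized back to the original size. In the code, however, the accuracy test defaults to scale=1.0, which means accuracy is measured at 1024 × 2048. This confuses me: the speed test and the accuracy test do not seem to match. Could you please clarify?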
Could you provide the code for the FPS test? I can't reproduce the reported FPS with my own code.
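For anyone comparing numbers in the meantime, here is a minimal timing sketch. It is not the authors' benchmark script. The function name `measure_fps` and its parameters are hypothetical. It runs some warm-up iterations, then times a fixed number of calls, and optionally calls a `sync` hook before reading the clock. With PyTorch on a GPU, `torch.cuda.synchronize` would be the natural hook, since CUDA kernels run asynchronously and unsynchronized timings can overstate FPS.

```python
import time


def measure_fps(run_once, warmup=10, iters=100, sync=None):
    """Return the average calls per second of `run_once` over `iters` timed calls.

    `sync` is an optional zero-argument callable invoked before the timer
    starts and before it stops (hypothetical hook for GPU synchronization).
    """
    if iters <= 0:
        raise ValueError("iters must be positive")

    # Warm-up calls are excluded from the timing.
    for _ in range(warmup):
        run_once()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start

    return iters / elapsed


if __name__ == "__main__":
    # Stand-in workload; replace with a single forward pass of the model.
    fps = measure_fps(lambda: sum(range(10_000)), warmup=5, iters=50)
    print(f"{fps:.1f} FPS")
```

When reporting a result, state the input resolution actually fed to the model and whether the resize steps are inside the timed region, since both change the number.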
I've downloaded the dataset from https://www.cityscapes-dataset.com/file-handling/?packageID=3: leftImg8bit_trainvaltest.zip (11GB) [md5], left 8-bit images - train, val, and test sets (5000 images).
The dataset consists of the three folders above, but your code refers to a /gtFine/train folder, which does not seem to belong to any of them (train/val/test). Could you explain where this folder comes from and what it is for? Thanks!
@songqi-github
Your work shows potential, though some things require clarification:
Hi there,
Can AttaNet work with high accuracy when having only 2 classes for segmentation?
@songqi-github
Have you considered the following pipeline: quantizing the ResNet18 model > training the model > pruning the trained model > fine-tuning?
I believe the final ResNet18 model could be much more efficient and reach a higher FPS while maintaining almost the same accuracy.
When I load the pretrained model 'resnet18-5c106cde.pth', an error occurs during training, as shown below.
Missing key(s) in state_dict: "head.resnet.conv1.weight", "head.resnet.bn1.weight", "head.resnet.bn1.bias", "head.resnet.bn1.running_mean", "head.resnet.bn1.running_var",.......
Unexpected key(s) in state_dict: "conv1.weight", "bn1.running_mean", "bn1.running_var", "bn1.weight", "bn1.bias", "layer1.0.conv1.weight", "layer1.0.bn1.running_mean",........
Do you know how to fix it?
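The error suggests that the checkpoint stores plain ResNet-18 keys (`conv1.weight`, ...) while the model expects the same tensors under a `head.resnet.` prefix. One possible workaround, sketched below, is to rename the keys before loading. The helper name `remap_backbone_keys` is hypothetical, the prefix is taken only from the error message above, and dropping the `fc.` classifier keys is an assumption (a segmentation backbone usually has no use for them). The helper works on any mapping, so it is shown here with a plain dict.

```python
def remap_backbone_keys(state_dict, prefix="head.resnet.", drop_prefixes=("fc.",)):
    """Return a new dict with `prefix` prepended to every kept key.

    Keys starting with any entry of `drop_prefixes` are skipped
    (assumption: the classification layer is unused by the model).
    """
    drop = tuple(drop_prefixes)
    remapped = {}
    for key, value in state_dict.items():
        if drop and key.startswith(drop):
            continue
        remapped[prefix + key] = value
    return remapped


if __name__ == "__main__":
    # Toy stand-in for the checkpoint contents; real values would be tensors.
    checkpoint = {"conv1.weight": 1, "bn1.bias": 2, "fc.weight": 3}
    print(sorted(remap_backbone_keys(checkpoint)))
```

With PyTorch, the dict returned by `torch.load("resnet18-5c106cde.pth")` could be passed through this helper and then to `model.load_state_dict(remapped, strict=False)`, where `strict=False` tolerates the non-backbone layers that the checkpoint does not cover. Alternatively, the file may be meant to be loaded into the backbone submodule directly rather than into the full network. The repository would need to confirm which one is intended.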
Hi, this is good work on real-time semantic segmentation. However, there is no training file here. Could you upload it when you have time? I also have another question: what does "The training settings require 8 GPU with at least 11GB memory." mean?
Is there a Python file for the ADE20K dataset, in addition to the one for Cityscapes?
Hi there,
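When is the code expected to be released?
Hello, I have some questions about the speed test. Is it correct that, when measuring speed, you first downscale the image to a smaller resolution such as 512x1024, feed the downscaled image to the model, and then upscale the output back to 1024x2048? If so, the resolution used for the speed test cannot be claimed to be 1024x2048; it is 512x1024. Otherwise, the accuracy should be measured with the same procedure, or the comparison is not fair.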
Hi, thanks for your great work AttaNet and I'm pretty interested in your research.
After reading the papers and reviewing the code, I'm confused about the inference speed and evaluation results of the method.
AttaNet is tested with an input size of 512x1024 and achieves 130 FPS with the ResNet-18 backbone, while the mIoU is evaluated with crop_eval and flip test (see line 59 in 32fd818: crop and flip), and the inference time is measured on a single 512x1024 input. The inference time and the evaluated results might therefore not be consistent.
The results of your experiments left a deep impression on me. This is very good work. Could you upload your visualization files, including AFM, SAM, and the prediction results?
When running on the test set, is the input downsampled to 512*1024 before being fed to the model? Is it cropped to (1024, 1024) as in your evaluate code, and were accuracy-boosting techniques such as multi-scale testing used?
When I was looking at the SAM part of the code and the paper, I found that the operations of Q and K do not correspond to each other. If I just swap the names of Q and K in the code, it is still not correct because the subsequent transpose operation is still performed by Q.