win_det_heatmaps's Introduction

Window Detection in Facades Using Heatmap Fusion

Official implementation of our paper.

Chuan-Kang Li, Hong-Xin Zhang, Jia-Xin Liu, Yuan-Qing Zhang, Shan-Chen Zou, Yu-Tong Fang. Window Detection in Facades Using Heatmap Fusion [J]. Journal of Computer Science and Technology, 2020, 35(4): 900-912.

Introduction

Window detection is a key component in many graphics and vision applications related to 3D city modeling and scene visualization. We present a novel approach for learning to recognize windows in a colored facade image. Rather than predicting bounding boxes or performing facade segmentation, our system locates keypoints of windows, and learns keypoint relationships to group them together into windows. A further module provides extra recognizable information at the window center. Locations and relationships of keypoints are encoded in different types of heatmaps, which are learned in an end-to-end network. We have also constructed a facade dataset with 3418 annotated images to facilitate research in this field. It has richly varying facade structure, occlusion, lighting conditions, and angle of view. On our dataset, our method achieves precision of 91.4% and recall of 91.0% under 50% IoU. We also make a quantitative comparison with state-of-the-art methods to verify the utility of our proposed method. Applications based on our window detector are also demonstrated, such as window blending.
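
To make the "precision and recall under 50% IoU" criterion concrete, here is a minimal sketch of how such numbers can be computed. It is not the evaluation code from this repository: it assumes axis-aligned boxes given as (x1, y1, x2, y2), whereas the method itself predicts four window corners, and the greedy one-to-one matching and the function names `box_iou` and `precision_recall` are illustrative assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, gts, iou_thr=0.5):
    """Greedily match each prediction to the best unmatched ground-truth box.

    A prediction counts as a true positive when its best IoU reaches iou_thr.
    (Assumed matching rule; the paper's protocol may differ.)
    """
    matched = set()
    true_positives = 0
    for pred in preds:
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            iou = box_iou(pred, gt)
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(gts) if gts else 0.0
    return precision, recall


preds = [(0, 0, 10, 10), (100, 100, 110, 110)]
gts = [(0, 0, 10, 8), (50, 50, 60, 60)]
print(precision_recall(preds, gts, iou_thr=0.5))
```

Raising `iou_thr` to 0.75 gives the stricter criterion reported as P_75 and R_75 in the model table.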

Preparation

Environment

Please install PyTorch following the instructions on the official website. Then install the remaining dependencies:

pip3 install -r requirements.txt

Dataset

The zju_facade_jcst2020 dataset is described in the paper and is now available on BaiduYun (code: qlx5) and GoogleDrive.

Facade images were collected from the Internet and from existing datasets, including TSG-20, TSG-60, ZuBuD, CMP, and ECP, and were then cleaned to ensure data quality. Using the open-source software LabelMe, we manually annotated the positions of the four corners of each window in order.
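
As an illustration of what four-corner annotations look like, the sketch below reads window corners from a LabelMe-style JSON document. It assumes the standard LabelMe layout (a top-level "shapes" list whose entries carry "label" and "points"); the files shipped with the dataset may be stored in a different or converted format, and the label name "window", the file name, and the helper `load_window_corners` are hypothetical.

```python
import json


def load_window_corners(labelme_text):
    """Return a list of windows, each a list of four (x, y) corner tuples.

    Shapes that do not have exactly four points are skipped.
    """
    data = json.loads(labelme_text)
    windows = []
    for shape in data.get("shapes", []):
        points = shape.get("points", [])
        if len(points) == 4:
            windows.append([(float(x), float(y)) for x, y in points])
    return windows


# Hypothetical annotation with one four-corner window and one stray shape.
example = json.dumps({
    "imagePath": "facade_0001.jpg",
    "shapes": [
        {"label": "window", "shape_type": "polygon",
         "points": [[10, 20], [60, 22], [61, 90], [11, 88]]},
        {"label": "window", "shape_type": "polygon",
         "points": [[0, 0], [5, 5]]},
    ],
})

windows = load_window_corners(example)
print(len(windows), windows[0])
```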

Model

You can download our trained models from BaiduYun (code: n0ev) or GoogleDrive. ResNet18, MobileNetV2, and ShuffleNetV2 models are provided. All configurations are defined in the *.yaml files and config_pytorch.py, and you can change them to suit your needs. NOTE: the model filenames end with .tar, but the files are not compressed archives. Ignore the suffix and load the checkpoint directly.
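
The note about the .tar suffix can be checked and acted on roughly as follows. This is a sketch under assumptions, not code from the repository: the helper names are hypothetical, the checkpoint is assumed to be a file written by `torch.save`, and the internal layout of the loaded object (for example, which key holds the weights) is not specified here.

```python
import tarfile


def is_real_tar_archive(path):
    """Return True only if the file is an actual tar archive.

    For the released checkpoints this is expected to be False, which is why
    tools such as `tar -xf` fail on them.
    """
    return tarfile.is_tarfile(path)


def load_checkpoint(path):
    """Load the checkpoint directly with PyTorch instead of extracting it."""
    import torch  # imported lazily so the check above works without PyTorch

    # map_location="cpu" lets the file load on machines without a GPU.
    # Assumption: recent PyTorch releases may additionally require
    # weights_only=False for checkpoints that hold non-tensor objects;
    # only use that option for files you trust.
    return torch.load(path, map_location="cpu")
```

In normal use the scripts below take the same path through the --model argument, so no manual extraction step is needed.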

The table below summarizes the performance of the three models on our i7-6700K + 1080Ti platform. P and R denote precision and recall. Note that the center verification module is not used.

Architecture         #Params  FLOPs  Time  P_50   P_75   P_mean  R_50   R_75   R_mean
ShuffleNetV2 + Head  13.8M    29.5G  62ms  85.2%  62.8%  54.9%   86.2%  63.5%  55.5%
MobileNetV2 + Head   16.9M    31.2G  65ms  87.0%  64.9%  56.8%   90.0%  67.1%  58.5%
ResNet18 + Head      19.6M    32.0G  62ms  88.4%  68.4%  58.7%   91.2%  70.5%  60.5%

Usage

Train

python train.py --cfg /path/to/yaml/config \
    --data /path/to/data/root \
    --out /path/to/output/root

Test

python test.py --cfg /path/to/yaml/config --model /path/to/model \
    --data /path/to/data/root \
    --out /path/to/output/root

Inference

python infer.py --cfg /path/to/yaml/config \
                --model /path/to/model \
                --infer /path/to/image/directory

Examples

Applications

Facade Unification

We have developed a computational workflow for window texture blending based on our window detection method. With this technique, graphic designers can easily manipulate facade photos to create ideal building textures: windows that are unsatisfactory because of their open or closed status, lighting conditions, or occlusion are removed and replaced with a selected, unified window texture.

Facade Beautification

Applying the above workflow, image beautification can also be performed to generate visually pleasing results with mixed features.

Facade Analytics

As our method can efficiently locate windows in urban facade images, it is of use for automatically analyzing semantic structure and extracting numerical information. With additional simple steps, it is easy to determine the windows in a single row or column. Furthermore, it can be adopted to predict building layers and symmetric feature lines.
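
One possible reading of the "additional simple steps" for finding windows in the same row is sketched below. It is not the authors' analytics code, which is not described in this page: it assumes detections are available as axis-aligned boxes (x1, y1, x2, y2) in a roughly fronto-parallel facade image, and the function name `group_into_rows` and the tolerance `tol` are illustrative choices.

```python
def group_into_rows(boxes, tol=0.5):
    """Group boxes into rows by the vertical position of their centres.

    Boxes are visited from top to bottom. A box joins the current row when
    its centre lies within tol * (box height) of the row's running mean
    centre; otherwise it starts a new row. Each row is returned sorted
    from left to right.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        cy = (box[1] + box[3]) / 2
        height = box[3] - box[1]
        if rows and abs(cy - rows[-1]["cy"]) <= tol * height:
            row = rows[-1]
            row["boxes"].append(box)
            # Update the running mean of the row's centre line.
            row["cy"] += (cy - row["cy"]) / len(row["boxes"])
        else:
            rows.append({"cy": cy, "boxes": [box]})
    return [sorted(row["boxes"], key=lambda b: b[0]) for row in rows]


boxes = [(50, 10, 70, 40), (10, 12, 30, 42), (10, 100, 30, 130), (50, 98, 70, 128)]
rows = group_into_rows(boxes)
print(len(rows), rows)
```

Swapping the roles of x and y groups windows into columns in the same way, and the number of rows gives a rough estimate of the number of storeys.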

Citation

If our code, dataset, models, or paper help your research, please cite:

@article{Li2020WindowDetection,
    author = {Chuan-Kang Li and Hong-Xin Zhang and Jia-Xin Liu and Yuan-Qing Zhang and Shan-Chen Zou and Yu-Tong Fang},
    title = {Window Detection in Facades Using Heatmap Fusion},
    publisher = {Journal of Computer Science and Technology},
    year = {2020},
    journal = {Journal of Computer Science and Technology},
    volume = {35},
    number = {4},
    eid = {900},
    numpages = {12},
    pages = {900--912},
    keywords = {facade parsing;window detection;keypoint localization},
    url = {http://jcst.ict.ac.cn/EN/abstract/article_2660.shtml},
    doi = {10.1007/s11390-020-0253-4}
}

Acknowledgement

The major contributors of this repository include Chuankang Li, Yuanqing Zhang, Shanchen Zou, and Hongxin Zhang.

win_det_heatmaps's Issues

Memory Issue

Hi, I'm getting this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 13.61 GiB (GPU 0; 10.76 GiB total capacity; 4.73 Gib already allocated; 4.09 GiB free; 5.12 GiB reserved in total by PyTorch) If reserved memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I am using a batch size of 4 on a 2080Ti.

AssertionError: (0, IndexError('too many indices for array: array is 1-dimensional, but 2 were indexed'), '0914D92ED9B1A4ECD498DD1E89BA43B1(1).jpg')

Hello, your work is great! It helps me a lot!
But when I run
python infer.py --cfg experiments/resnet/lr1e-3_x120-90-110_center_b2.yaml --model model/resnet18_model_latest.pth.tar --infer ./image
it raises the following error:
AssertionError: (0, IndexError('too many indices for array: array is 1-dimensional, but 2 were indexed'), '0914D92ED9B1A4ECD498DD1E89BA43B1(1).jpg')
The error occurs at line 215 of "\win_det_heatmaps\common_pytorch\net_modules.py".
By the way, 0914D92ED9B1A4ECD498DD1E89BA43B1(1).jpg is in the "image" directory.
Thanks a lot!

CUDA unavailable

Hi! Unfortunately my GPU is not made by NVIDIA, so it's not possible to build torch with CUDA. Is there alternative code for non-NVIDIA GPUs?
Thanks in advance.

Training error

Hello, thank you for your work.

When I train on my own dataset, the following errors are reported in the validation phase. Can you help me?

in valid
in eval
Param Tag_threshold 12.0
Param detection_threshold 0.2
Traceback (most recent call last):
File "/mnt/Code/win_det_heatmaps/common_pytorch/net_modules.py", line 152, in evalNet
rectify = test_config.rectify, winScoreThres = test_config.windowT)
File "/mnt/Code/win_det_heatmaps/common_pytorch/group/tag_group.py", line 266, in group_corners_on_tags
grouped = parser.parse(np.float32(dets), np.float32(tags), idx, ratio, rectify) # shape=(num_of_windows, 4, 4)
File "/mnt/Code/win_det_heatmaps/common_pytorch/group/tag_group.py", line 242, in parse
re = self.calc(det, tag, idx)
File "/mnt/Code/win_det_heatmaps/common_pytorch/group/tag_group.py", line 178, in calc
val_k = [c_pts[:, 2, np.newaxis] for c_pts in coords_in_patch_with_score_id]
File "/mnt/Code/win_det_heatmaps/common_pytorch/group/tag_group.py", line 178, in
val_k = [c_pts[:, 2, np.newaxis] for c_pts in coords_in_patch_with_score_id]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 158, in
main()
File "train.py", line 135, in main
config.train.patch_width, config.train.patch_height, final_output_path)
File "/mnt/Code/win_det_heatmaps/common_pytorch/net_modules.py", line 155, in evalNet
assert 0, (n_s, e, os.path.basename(imdb_list[n_s]['image']))
AssertionError: (0, IndexError('too many indices for array: array is 1-dimensional, but 2 were indexed'), 'result_toushi_2282_6_4.jpg')

Config file missing

Hi

Could you share the config yaml file required to run train/test/infer? Thank you.

Training crashes at 118 epochs

Hello, when I run the training script with the training data you provided, the script crashes at epoch 118 with the error message "Assertion error: (1, index error('too many indices for array: array is 1-dimensional but 2 were indexed')". Can you please tell me what causes this and how it can be fixed?
Thank you

dataset size

Hi, thanks for this project - it is a fantastic resource.

When I download the dataset from here, the zju_facade_jcst2020.rar file (md5 1d2f3fbfb913f6120698929547110c7f) behaves like a zip bomb on my system (Ubuntu).

The 1 GB archive extracts to a 750 GB+ folder structure containing a single file, zju_facade_jcst2020/TRAIN/JSON/annotation/00000.json. Is this intentional? Such a large JSON file seems problematic.

What size should the extracted dataset be?

Many thanks!

Windows version

Hello, your work runs fine on my Ubuntu system, but I wonder whether it can also run on Windows? Thank you, you did a great job.

Facade Analytics Code

I see that Facade Analytics is listed as the last example under Applications. I tried to find the code for it without success. Could you kindly point me to it?

System cannot find the path

Hello, I want to ask you a question. When I used the test.py file for testing, the following error occurred:
[WinError 3] The system cannot find the path specified: 'd\_valid_cache'
It occurred in "win_det_heatmaps-master\common_pytorch\dataset\imdb.py", line 32, in cache_path:
os.mkdir(cache_path)
How can I solve this problem?
Thank you so much!

Model .tar files on Google Drive

Are the .tar model files good? I cannot untar them. They seem corrupted. Could you upload them without tarring?

Errors in loading state dict:

RuntimeError: Error(s) in loading state_dict for PoseNet_2branch:
Missing key(s) in state_dict: "backbone.conv1.weight", "backbone.bn1.weight", "backbone.bn1.bias", "backbone.bn1.running_mean", "backbone.bn1.running_var", "backbone.layer1.0.conv1.weight", "backbone.layer1.0.bn1.weight", "backbone.layer1.0.bn1.bias", "backbone.layer1.0.bn1.running_mean",

My command line was: python infer.py --cfg experiments/resnet/lr1e-3_x120-90-110_center_b2.yaml --model models/resnet18_model_latest.pth.tar --infer images
So I am loading the resnet18 model with the configs from the .yaml file. It seems the model definition or state dictionary does not contain those keys.

Perspective transformation code

I see that the page describes carrying out a perspective transformation of the facade/window, but I can't find that in the code. Can you please let us know where it is located?

Thanks

Help running

Hi, I am not sure how I should run your project. I tried downloading the ResNet file but I am unable to extract it, and I am not sure how to use it. Could I have a more detailed explanation of how to train with the ResNet option? How should I treat the downloaded file, and which paths do I need to specify when I run it?
Thank you for this great project
