roof_detex's Introduction

! WORK IN PROGRESS !

Codebase for training models to detect structures from very high resolution satellite images (~0.4m).

Data preprocessing

Scripts to preprocess the data into images and labels are in scripts/. Each set of images and labels used for training comes in a different format, so each requires custom preprocessing. All data sources are converted to 256x256 RGB (3-channel) images. Four scripts handle the four data streams:

  • giveDirectly_dataprep.py prepares the images and labels provided by the people behind this paper: 1468 Google Maps images, 400 by 400 pixels, zoom level 16, with between 0 and 24 huts per image and an average between 5 and 6.
  • rms_dataprep.py prepares proprietary satellite images and labels generated by VAM's geospatial team. These are very high resolution RGB images labelled by the remote monitoring team at WFP. The labels come as 1 channel containing the mask (1 for roof, 0 for no roof).
  • dstl_dataprep.py prepares images and labels from the DSTL Kaggle competition.
  • spacenet_dataprep.py prepares images and labels for the RGB Khartoum image set from SpaceNet.
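As a rough illustration of the common 256x256 RGB target, the sketch below cuts a larger array into non-overlapping tiles. The function name `tile_image` and the choice to drop incomplete edge tiles are assumptions made for this example; the actual scripts may resize or pad instead.

```python
import numpy as np


def tile_image(image, tile=256):
    """Split an H x W x C array into non-overlapping tile x tile x 3 crops.

    Hypothetical helper: edge regions that do not fill a whole tile are
    dropped, which is an assumption (padding or resizing are alternatives).
    """
    h, w = image.shape[:2]
    tiles = []
    for top in range(0, h - tile + 1, tile):
        for left in range(0, w - tile + 1, tile):
            # Keep only the first three channels (RGB).
            tiles.append(image[top:top + tile, left:left + tile, :3])
    return tiles


# A synthetic 600 x 520 RGB image contains 2 x 2 full tiles.
demo = np.zeros((600, 520, 3), dtype=np.uint8)
tiles = tile_image(demo)
print(len(tiles), tiles[0].shape)
```

Note that with this scheme a 400 by 400 image would yield a single tile, so a resize step may be the better fit for the smaller sources.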

The labels differ slightly between datasets. In the VAM data each hut is marked by a single pixel at the roof's location, i.e. 1 pixel per roof. To reduce the class imbalance between roof and non-roof pixels, a buffer of between 3x3 and 9x9 pixels, depending on the image, is added around each labelled pixel, so the final mask is a square that ideally overlaps with the roof. This is one of the big improvements still to be made.
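The buffering step can be sketched as follows. This is a minimal illustration, not the code in rms_dataprep.py: the function name `buffer_point_mask`, the odd-size requirement and the clipping at image borders are all assumptions.

```python
import numpy as np


def buffer_point_mask(points, size=5):
    """Grow each labelled pixel into a size x size square of ones.

    Hypothetical helper. `points` is a 2D array with 1 at each roof pixel.
    Squares that would cross the image border are clipped.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the square is centred")
    r = size // 2
    h, w = points.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for row, col in zip(*np.nonzero(points)):
        mask[max(row - r, 0):min(row + r + 1, h),
             max(col - r, 0):min(col + r + 1, w)] = 1
    return mask


points = np.zeros((256, 256), dtype=np.uint8)
points[100, 100] = 1  # a roof well inside the image: full 5x5 square
points[0, 0] = 1      # a roof at the corner: square clipped to 3x3
mask = buffer_point_mask(points, size=5)
print(int(mask.sum()))  # 25 + 9 = 34
```

A fixed square ignores the roof's real shape and orientation, which is why the buffer size has to be tuned per image.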

Training

batch_training.py trains the network in batches, using a generator to load the images into memory at run time. Run python batch_training.py --help to list the available parameters, including the directory of the training data, which weights to use and which image names to use.
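The generator idea can be sketched like this. It is not the code in batch_training.py: `batch_generator`, `load_pair` and `fake_load_pair` are hypothetical names, and the scaling of pixel values to [0, 1] is an assumption. The point is only that each batch is read when it is requested, so the full dataset never sits in memory.

```python
import numpy as np


def batch_generator(names, load_pair, batch_size=8, shuffle=True, seed=0):
    """Yield (images, masks) batches forever, loading each pair lazily.

    `load_pair(name)` is a caller-supplied function (hypothetical here)
    returning one (image, mask) pair as uint8 arrays.
    """
    rng = np.random.default_rng(seed)
    names = list(names)
    while True:  # loop over the data indefinitely, one epoch per pass
        if shuffle:
            order = rng.permutation(len(names))
        else:
            order = np.arange(len(names))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            pairs = [load_pair(names[i]) for i in idx]
            images = np.stack([p[0] for p in pairs]).astype(np.float32) / 255.0
            masks = np.stack([p[1] for p in pairs]).astype(np.float32)
            yield images, masks


def fake_load_pair(name):
    """Stand-in loader that returns synthetic data instead of reading files."""
    image = np.full((256, 256, 3), 128, dtype=np.uint8)
    mask = np.zeros((256, 256, 1), dtype=np.uint8)
    return image, mask


gen = batch_generator(["a", "b", "c"], fake_load_pair, batch_size=2)
images, masks = next(gen)
print(images.shape, masks.shape)
```

The last batch of each pass may be smaller than `batch_size`; whether that matters depends on the training loop that consumes it.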

src/

  • number_of_islands.py: class to count islands in a boolean 2D matrix.
  • unet.py: the network architecture, UNet.
  • utils: all the handy stuff, including cv and IO routines. The loss function, the Dice coefficient, is defined here too.
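For orientation, here is a minimal NumPy sketch of the two ideas named above: the Dice coefficient and island counting. It is not the repository's implementation. The `smooth` term, the 4-connectivity rule and both function names are assumptions.

```python
import numpy as np


def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice = 2 * |A and B| / (|A| + |B|), with an assumed smoothing term."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth)


def count_islands(grid):
    """Count 4-connected groups of True cells with an iterative flood fill."""
    grid = np.asarray(grid, dtype=bool)
    seen = np.zeros(grid.shape, dtype=bool)
    h, w = grid.shape
    count = 0
    for r in range(h):
        for c in range(w):
            if grid[r, c] and not seen[r, c]:
                count += 1  # found an unvisited island; mark all its cells
                seen[r, c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and grid[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
    return count


truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 1]])
pred = np.array([[1, 1, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 0, 1]])
print(dice_coefficient(truth, pred))               # (2*5 + 1) / (7 + 5 + 1) = 11/13
print(count_islands(truth), count_islands(pred))   # 2 2
```

Counting islands on a thresholded prediction mask gives a hut count even when the Dice score is modest, since a prediction can miss pixels of a roof and still produce one island per roof.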

Infrastructure

AWS g2.2xlarge

Results:

On VAM's data, the Dice coefficient on the validation set saturates around 0.35 (the labels are noisy, so this is not a bad score if you look at the predictions).

roof_detex's People

Contributors

lorenzori

roof_detex's Issues

new architecture

The model currently does image segmentation, but the real objective is to count the huts. So maybe a different architecture would be better suited, for example conv + dense layers?

better buffer for masks

Current images have only 1 pixel labelling whether there is a hut or not. I currently place an arbitrary buffer around that single pixel to try and cover the whole roof section. This works but is not ideal.
