MetaDengue

MetaDengue is a unified dataset format that combines satellite imagery with socioeconomic and environmental metadata.

DengueSet seeks to organize satellite imagery and metadata in a unified, standard dataset format that is scalable and reproducible for any geographical area. For this project, we extracted satellite imagery of the municipalities of Colombia from 2016 to 2018, organized by epiweek, using satellite extractor, a tool built on SentinelHub. Each image is paired with a JSON file of socioeconomic and environmental metadata so that the data is temporally and spatially aligned.

Find the full open-source datasets on HuggingFace: Link

Updates

  • build_dataset_adapted.py is designed for version 2.0 of the dataset, in which image names contain the numeric code of each municipality instead of the prefix "image".

Proposed Structure

The following is the structure we propose for DengueSet. There are two main folders. The first, images/, contains one subfolder per municipality, and each subfolder holds all the temporal satellite imagery for that municipality. The second, annotations/, mirrors the same structure but holds the corresponding metadata as JSON files.

DATASET/
	images/
		5001/
			images_01_01_2016.tiff
			images_01_07_2016.tiff
			.
			.
		5002/
		.
		.
		500N/
	annotations/
		5001/
			images_01_01_2016.json
			images_01_07_2016.json
			.
			.
		5002/
		.
		.
		500N/
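
Because the two folders are meant to mirror each other, a quick consistency check can be useful before training. The following is a minimal sketch, not part of the repository: the function name find_unpaired is hypothetical, and it assumes images end in .tiff, annotations end in .json, and a matching pair shares the same file stem inside the same municipality folder.

```python
from pathlib import Path


def find_unpaired(root):
    """Return (images without annotation, annotations without image).

    Each entry is a "municipality/stem" string, e.g. "5001/images_01_01_2016".
    Assumes the layout DATASET/images/<code>/*.tiff and
    DATASET/annotations/<code>/*.json described above.
    """
    root = Path(root)

    def stems(subdir, suffix):
        # Collect "<municipality>/<file stem>" for every file one level down.
        return {
            f"{path.parent.name}/{path.stem}"
            for path in (root / subdir).glob(f"*/*{suffix}")
        }

    images = stems("images", ".tiff")
    annotations = stems("annotations", ".json")
    return sorted(images - annotations), sorted(annotations - images)
```

Calling find_unpaired("DATASET") returns two sorted lists. Both are empty when every image has a matching JSON file and every JSON file has a matching image.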

Create metadata dataset

To create a customized dataset, update config.py with the corresponding metadata and adapt build_dataset.py as required. For the DengueSet case, run build_dataset.py. Afterwards, make sure the images/ folder is stored at the same level of the tree hierarchy as the annotations/ folder.

Metadata organization:

{
      "image_path": "DATASET/images/23001/image_2016-01-03.tiff",
      "municipality_code": 23001,
      "epiweek": 201601,
      "dynamic": {
            "cases": {
                  "dengue_cases": 31,
                  "binary_classification": 1,
                  "multiclass_classification": 0
            },
            "environmental_data":{
                  "temperature": [
                        30.703164269355987
                  ],
                  "precipitation": [
                        0.1678038044754522
                  ]
            }
      },
      "static": {
            "environmental_data": {
                  "elevation": 36.0
            },
            "socio_economic_demographic_data": {
                  "socio_economic_data":{
                        "Secondary/HigherEducation(%)": 59.93,
                        "Employedpopulation(%)": 36.46,
                        "Unemployedpopulation(%)": 4.61,
                        "Peopledoinghousework(%)": 17.02,
                        "Householdswithoutwateraccess(%)": 10.07,
                        "Householdswithoutinternetaccess(%)": 52.84,
                        "Buildingstratification1(%)": 60.2975,
                        "Buildingstratification2(%)": 14.4266,
                        "Buildingstratification3(%)": 6.6937,
                        "Buildingstratification4(%)": 2.3756,
                        "Buildingstratification5(%)": 0.8777,
                        "Buildingstratification6(%)": 0.7190000000000001,
                        "NumberofhospitalsperKm2": 0.087552,
                        "NumberofhousesperKm2": 37.051894
                  },
                  "socio_demographic_data":{
                        "Age0-4(%)": 7.76,
                        "Age5-14(%)": 26.31,
                        "Age>30(%)": 48.69,
                        "Men(%)": 48.51,
                        "Women(%)": 51.49,
                        "Population per year": 471724,
                        "AfrocolombianPopulation(%)": 1.7,
                        "IndianPopulation(%)": 0.71,
                        "Peoplewhocannotreadorwrite(%)": 5.93,
                        "PeoplewithDisabilities(%)": 2.69
                  }
            }
      }
}
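
For tabular models it is often convenient to flatten this nested record into a single row. The sketch below is illustrative only: flatten and split_epiweek are hypothetical helpers, the dotted key names are a choice made here, and the YYYYWW reading of epiweek is an assumption based on the sample value 201601.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts into {"a.b.c": value}; unwrap single-element lists."""
    flat = {}
    for key, value in node.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, list) and len(value) == 1:
            flat[name] = value[0]
        else:
            flat[name] = value
    return flat


def split_epiweek(epiweek):
    """Split an integer such as 201601 into (year, week), assuming YYYYWW."""
    return divmod(epiweek, 100)


# Abbreviated copy of the record shown above.
record = json.loads("""
{
  "municipality_code": 23001,
  "epiweek": 201601,
  "dynamic": {
    "cases": {"dengue_cases": 31},
    "environmental_data": {"temperature": [30.703164269355987]}
  },
  "static": {"environmental_data": {"elevation": 36.0}}
}
""")

features = flatten(record)
year, week = split_epiweek(record["epiweek"])
```

With this layout, the dengue case count is available as features["dynamic.cases.dengue_cases"].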

Download DengueSet Augmented data with aligned metadata [Download]:

Google Cloud Platform bucket storing the top 10 cities. Data were extracted using recursive artifact removal, cloud removal based on LeastCC, and nearest interpolation for spatial resolution (implemented [here]). Augmentations are applied to the RGB channels while leaving the other satellite channels unchanged:

  1. Contrast limited adaptive histogram equalization (CLAHE), using clip_limit=6.0 and tile_grid_size=(16, 16)
  2. RGBShift (30 pixels per channel, applied with 100% probability)
  3. RandomBrightnessContrast (applied with 50% probability)
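
The idea of augmenting only the RGB bands of a multi-band tile can be sketched as follows. This is not the repository's implementation: augment_rgb_only and shift_channels are hypothetical names, the position of the RGB bands (rgb_indices) is an assumption that depends on how the tiles were exported, and shift_channels is a simple constant-offset stand-in rather than the transforms listed above. In practice the rgb_transform callable could wrap whichever augmentation pipeline produces those transforms.

```python
import numpy as np


def augment_rgb_only(image, rgb_transform, rgb_indices=(0, 1, 2)):
    """Apply rgb_transform to the RGB bands of an (H, W, C) array.

    All other bands are copied through untouched. rgb_indices is an
    assumption; adjust it to the band order of your tiles.
    """
    idx = list(rgb_indices)
    out = image.copy()
    out[..., idx] = rgb_transform(image[..., idx])
    return out


def shift_channels(rgb, shift=30):
    """Stand-in transform: add a constant offset per channel, clipped to uint8."""
    shifted = rgb.astype(np.int16) + shift
    return np.clip(shifted, 0, 255).astype(np.uint8)


image = np.full((4, 4, 5), 100, dtype=np.uint8)  # toy 5-band tile
augmented = augment_rgb_only(image, shift_channels)
```

Only the first three bands of augmented differ from image; the remaining two are identical to the input.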

Dataloader demos

Define custom dataloaders in dataloders/

PyTorch implementation [Notebook]

  1. Custom dataloader to load all folders within the DATASET/images folder. [Here]
  2. Custom dataloader to load filtered folders within the DATASET/images folder. [Here]

TensorFlow implementation [Notebook]

  1. Custom dataloader to load all folders within the DATASET/images folder. [Here]
  2. Custom dataloader to load filtered folders within the DATASET/images folder. [Here]
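
For readers who want the gist without opening the notebooks, here is a framework-agnostic sketch of a loader that covers both cases (all folders, or a filtered subset). It is not the code from the notebooks: MunicipalityDataset, its municipalities argument, and the load_image callback are hypothetical names. It assumes the folder layout proposed above and leaves image decoding to a callable you supply, since the choice of TIFF reader is outside the scope of this sketch.

```python
import json
from pathlib import Path


class MunicipalityDataset:
    """Map-style dataset over DATASET/images and DATASET/annotations.

    municipalities: optional iterable of folder codes to keep (None = all).
    load_image: optional callable mapping an image path to pixel data;
    by default the path itself is returned.
    """

    def __init__(self, root, municipalities=None, load_image=None):
        self.root = Path(root)
        self.load_image = load_image or (lambda path: path)
        keep = None if municipalities is None else {str(m) for m in municipalities}
        self.samples = []
        for image_path in sorted((self.root / "images").glob("*/*.tiff")):
            code = image_path.parent.name
            if keep is not None and code not in keep:
                continue
            annotation = self.root / "annotations" / code / f"{image_path.stem}.json"
            if annotation.exists():  # skip images that have no metadata file
                self.samples.append((image_path, annotation))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image_path, annotation = self.samples[index]
        with open(annotation) as handle:
            metadata = json.load(handle)
        return self.load_image(image_path), metadata
```

For example, MunicipalityDataset("DATASET", municipalities=[5001, 5002]) keeps only those two folders. Because the class implements __len__ and __getitem__, it follows the map-style dataset protocol, so it should be possible to adapt it to a framework-specific loader once a real load_image function is provided.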

Contributors

mitcriticaldatacolombia, sebasmos
