Giter Club home page Giter Club logo

solar-farms-mapping's Introduction

An Artificial Intelligence Dataset for Solar Energy Locations in India

The majority of recent studies use either predictions of resource suitability or databases that are either developed thru crowdsourcing that often have significant sampling biases or have time lags between when projects are permitted and when location data becomes available. Here, we address this shortcoming by developing a spatially explicit machine learning model to map utility-scale solar projects across India using freely available satellite imagery. Model predictions were validated by human experts to obtain a total of 1363 solar PV farms. Using these outputs, we provide a cumulative measure of the solar footprint across India and quantified the degree of land modification associated with land cover types that may cause conflicts.

Figure 1: Solar PV mapping pipeline. An unsupervised model is train using a human in the loop approach to cluster solar farms pixels together and obtain weak semantic labels. These labels paired with geo-located Sentinel 2 imagery is used to create a semantic segmentation dataset suitable to train a supervised segmentation neural network. False positive predictions obtained from performing inference in novel scenes are considered hard negative and are used to augment the training dataset.

If you make use of this implementation or dataset in your own project or want to refer to it in a scientific publication, please consider referencing this GitHub repository and citing our paper:

@article{ortiz2022Solar,
  title   = {An Artificial Intelligence Dataset for Solar Energy Locations in India},
  author  = {Anthony Ortiz and Dhaval Negandhi and Sagar R Mysorekar and Shivaprakash K Nagaraju and Joseph Kiesecker  and Caleb Robinson and Aditi Khurana and Felipe Oviedo and Juan M. Lavista Ferres},
  journal = {arXiv preprint arXiv:2202.01340},
  year    = {2022},
}

Figure 2: Examples of solar farms predictions from our model for different areas in India.

Dataset

Our solar farm dataset is available in this repo under the data folder. The file "solar_farms_india_2021.geojson" includes the raw geometries for individual polygons. The file "solar_farms_india_2021_merged.geojson" contains the individual oplygons cluster into solar pv farms based on proximity. The file "solar_farms_india_2021_merged_simplified.geojson" simplifies the geometry of the solar farms for easier use.

The dataset includes the following variables:

  • fid: Unique identifier for a solar farm. It is shared by polygons belonging to the same farm.
  • Area: The area of the site in squared meters (m2).
  • Latitude: Latitude corresponding to the center point of the solar installation.
  • Longitude: Longitude corresponding to the center point of the solar installation.
  • State: Indian State where the solar PV installation is located at.

License for Dataset

Copyright (c) Microsoft Corporation. All rights reserved.

The dataset is open for anyone to use under the CDLA-Permissive-2.0 license.

Project setup instructions

  • Open a terminal on the machine
  • Install conda (note: if you are using a DSVM on Azure then you can skip this step as conda is preinstalled!)

Install Anaconda

cd ~
wget https://repo.anaconda.com/archive/Anaconda3-2019.07-Linux-x86_64.sh
bash Anaconda3-2019.07-Linux-x86_64.sh # select "yes" for setting up conda init
rm Anaconda3-2019.07-Linux-x86_64.sh

## logout and log back in
exit
  • Install NVIDIA drivers if you intend on using GPUs; note this might require a reboot (note: again, if you are using a DSVM on a Azure GPU VM then this is also handled)
  • Setup the repository and install the demo data
# Get the project and demo project data
git clone https://github.com/microsoft/solar-farms-mapping.git

# install the conda environment
# Note: if using a DSVM on Azure, as of 7/6/2020 you need to first run `sudo chown -R $USER /anaconda/`
# make sure conda-forge and pytorch are in the conda channel

conda config --append channels pytorch
conda config --append channels conda-forge

# create precise solar farms environment
cd solar-farms-mapping
conda env create --file environment_precise.yaml
cd ..

Summary Software Dependencies

๐Ÿ Python 3.x

  • torch==1.4.0
  • torchvision==0.5.0
  • tifffile==2020.2.16
  • pyarrow==0.16.0
  • imutils==0.5.3
  • scikit-image==0.16.2
  • scikit-learn==0.22.2
  • tqdm==4.46.1
  • imageio=2.9.0
  • matplotlib==3.2.2
  • shapely
  • scipy>=1.6.0
  • fiona
  • geopandas==0.9.0
  • opencv-python==4.5.1.48
  • pillow==8.1.2
  • rasterio==1.2.0
  • seaborn==0.11.1

Install Dependencies with Conda

conda create --name solar python=3.X
conda activate solar
conda install pytorch torchvision cudatoolkit=XX.X -c pytorch

install dependencis from previous list

Note: Make sure to intall a version of cudatoolkit supported by your NVCC version. Our code was tested with Python 3.6 and cudatoolkit 11.0

Training and doing inference

Download Solar Farms Pretrained Model

You can download a pretrained solar farm model in the following link:

Note: Before running the example notebooks place the checkpoint.pth.tar file in the following directory: data/models/unet_hnm/

Training Solar Farm Segmentation Model

To train the network with default values and see training output on the command line run:

python3 train.py --data_dir=/dir_to_datasets_directory/ --verbose

Note: For training customized model check options available in the options directory

Inference

Use the "tile_inference.py" script to do inference on image tiles. "inference_large_scale.py" lets you do inference for a list of tiles. Explore the demos uner the notebooks directory for inference examples.

Demo Notebooks

We include two jupyter notebook demos under the notebooks directory. Feel free to explore it to get familiar with the code base. The "Inference Demo" give you examples of using one of our models to preddict solar farms from Sentinel 2 Imagery. The "Built Date Estimation and LULC study Demo" lets you get temporal S2 imagery from the Microsoft' planetary computer matching Solar Farms geometries, estimate when the solar farms were built and find the land covel/ land use classes that where substitute by solar farms.

Web Visualizer

Visist our web visualizer to interact with our data and visualize the simplified version of the dataset and download the data interactively.

Contribute

You can contribute to the project by submitting bugs and feature requests.

License for Code

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT license.

solar-farms-mapping's People

Contributors

anthonymlortiz avatar calebrob6 avatar microsoft-github-operations[bot] avatar microsoft-github-policy-service[bot] avatar microsoftopensource avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

solar-farms-mapping's Issues

Queries regarding inference demo

Hi Guys,

I have been trying to test your model on Sentinel Files I have downloaded for India. The files which I have got are in .jp2 format. One jp2 file for each sentinel band for each tile. However, the input to your model is in the form of 2020_merged.tif. The neural network takes a 12 channel input rather than a 3 channel RGB input. Can you please tell me how you have created the file 2020_merged.tif ? Also, how did you guys arrive at the mean and std arrays? They contain 12 values each.

Thank you and Regards
Srinivas

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.