
pytorch-lightning-smoke-detection

Created: 2021 Anshuman Dewangan

This repository provides image classification and object detection models for wildfire smoke detection, accompanying the publication FIgLib & SmokeyNet: Dataset and Deep Learning Model for Real-Time Wildland Fire Smoke Detection.

Please include the following citation in your work:

Dewangan, A.; Pande, Y.; Braun, H.-W.; Vernon, F.; Perez, I.; Altintas, I.; Cottrell, G.W.; Nguyen, M.H. FIgLib & SmokeyNet: Dataset and Deep Learning Model for Real-Time Wildland Fire Smoke Detection. Remote Sens. 2022, 14, 1007. https://doi.org/10.3390/rs14041007
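For LaTeX users, the citation above corresponds roughly to the following BibTeX entry (the entry key is arbitrary and the author names use the initials given above; 1007 is the article number, and all bibliographic details are taken from the citation):

```bibtex
@article{dewangan2022figlib,
  author  = {Dewangan, A. and Pande, Y. and Braun, H.-W. and Vernon, F. and Perez, I. and Altintas, I. and Cottrell, G. W. and Nguyen, M. H.},
  title   = {FIgLib \& SmokeyNet: Dataset and Deep Learning Model for Real-Time Wildland Fire Smoke Detection},
  journal = {Remote Sensing},
  year    = {2022},
  volume  = {14},
  pages   = {1007},
  doi     = {10.3390/rs14041007}
}
```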

Visualization of model performance: (image omitted)

Data

Data Locations

Relevant Files:

  • ./scripts/setup_files.sh: Copies raw data and labels into home directory for faster data loading.
  • ./src/dynamic_dataloader.py: Includes datamodule and dataloader for training
  • ./data/metadata.pkl: Dictionary generated by DynamicDataModule.prepare_data() that includes helpful information about the data. See dynamic_dataloader.py for the full list of keys.

Relevant Directories - Raw Data:

  • /userdata/kerasData/data/new_data/raw_images: location of raw images
  • /userdata/kerasData/data/new_data/raw_images_flow: location of optical flow images
  • /userdata/kerasData/data/new_data/raw_images_mog: location of images processed with MOG background removal

Relevant Directories - Labels:

  • /userdata/kerasData/data/new_data/drive_clone: location of raw XML labels
  • /userdata/kerasData/data/new_data/drive_clone_numpy: location of preprocessed labels created by ./scripts/prepare_data.sh
  • /userdata/kerasData/data/new_data/bbox_labels.csv: csv file containing bounding box labels for all the images
  • /userdata/kerasData/data/new_data/drive_clone_bbox: bbox labels for all images as .npy files
  • /userdata/kerasData/data/new_data/drive_clone_filled_bbox: image mask with bboxes filled as 1s

Relevant Directories - Train/Test Splits:

  • ./data/final_split/: data split where train = all the labeled fires and val/test is a random split of unlabeled fires (with night fires removed).
  • ./data/split1/ and ./data/split2/: random train/val/test split of all labeled fires only

Relevant Directories - Data Cleaning:

  • ./data/mislabeled_fires.txt: list of fires that should be thrown out because their binary labels are erroneous (i.e., the ground truth says there is no fire when there actually is one)
  • ./data/night_fires.txt: list of fires that occur during the night (so they can be removed)
  • ./data/omit_mislabeled.txt: list of images that are supposed to be labeled but do not have bbox labels

metadata.pkl

metadata.pkl is a key file containing a dictionary that is generated by prepare_data.py to assist in the data loading process. The keys of the dictionary are:

  • fire_to_images (dict): dictionary with fires as keys and list of corresponding images as values
  • omit_no_xml (list of str): list of images that erroneously do not have XML files for labels. Does not include unlabeled fires.
  • omit_no_contour (list of str): list of images that erroneously do not have loaded contours for labels. Does not include unlabeled fires.
  • omit_no_contour_or_bbox (list of str): list of images that erroneously do not have contours or bboxes. Does not include unlabeled fires.
  • omit_mislabeled (list of str): list of images that erroneously have no XML files and are manually selected as mislabeled. Does not include unlabeled fires.
  • monochrome_fires (list of str): list of fires that are monochrome
  • night_fires (list of str): list of fires that are in nighttime
  • mislabeled_fires (list of str): list of fires in which the ground truth has erroneous labels and thus should be removed
  • labeled_fires (list of str): list of fires that have at least some labels
  • unlabeled_fires (list of str): list of fires that have not been labeled at all
  • train_only_fires (list of str): list of fires that should only be used for train (not 'mobo-c')
  • eligible_fires (list of str): list of fires that can be used for test or train (not in train_only_fires)
  • bbox_labels (dict): dictionary with images as keys and 4-element array of bounding box coordinates as values
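As a sketch, metadata.pkl can be loaded and inspected with the standard pickle module. The miniature dictionary below is illustrative only (fire and image names are made up); the real file contains the full set of keys listed above:

```python
import os
import pickle
import tempfile

# Illustrative stand-in for the real ./data/metadata.pkl (keys match a subset
# of the list above; fire/image names are made up for this sketch).
metadata = {
    "fire_to_images": {"20200101_fireA": ["img_000", "img_001"]},
    "labeled_fires": ["20200101_fireA"],
    "unlabeled_fires": ["20200315_fireB"],
    "night_fires": [],
    "mislabeled_fires": [],
    "bbox_labels": {"img_000": [100, 200, 150, 250]},  # 4-element bbox coordinates
}

# Round-trip through pickle, just as you would when loading ./data/metadata.pkl.
path = os.path.join(tempfile.mkdtemp(), "metadata.pkl")
with open(path, "wb") as f:
    pickle.dump(metadata, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(sorted(loaded.keys()))
print(len(loaded["fire_to_images"]["20200101_fireA"]))  # 2 images for this fire
```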

Data Setup from Scratch

Should you lose prior data or receive new data, use the following steps to prepare the data prior to model training:

  1. Run ./scripts/download_raw_data.sh to download raw images from the HPWREN website to /userdata/kerasData/data/new_data/raw_images_new/ directory
  2. Follow the instructions at the bottom of this Google Doc to download the bounding box and contour annotation labels (Note: to be released to the public in late 2022)
  3. Run python3.9 ./scripts/prepare_data.py to create ./data/metadata_new.pkl and .npy label files in /userdata/kerasData/data/new_data/drive_clone_numpy_new/
  4. (Optional) Run python3.9 ./scripts/generate_flow.py to create optical flow outputs in /userdata/kerasData/data/new_data/raw_images_flow_new/ and background removal outputs in /userdata/kerasData/data/new_data/raw_images_mog_new/

Generating train/val/test split:

  1. From labeled_fires, remove night_fires and mislabeled_fires. Use as the train set.
  2. From unlabeled_fires, split 50/50 between validation and test sets.
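The two steps above can be sketched in plain Python (the fire names are hypothetical; the real lists come from metadata.pkl):

```python
import random

# Hypothetical fire lists standing in for the entries in metadata.pkl.
labeled_fires = ["fireA", "fireB", "fireC", "fireD"]
unlabeled_fires = ["fireE", "fireF", "fireG", "fireH"]
night_fires = ["fireC"]
mislabeled_fires = ["fireD"]

# Step 1: train = labeled fires minus night and mislabeled fires.
train_fires = [f for f in labeled_fires
               if f not in night_fires and f not in mislabeled_fires]

# Step 2: split the unlabeled fires 50/50 into val and test.
shuffled = unlabeled_fires[:]
random.Random(0).shuffle(shuffled)   # fixed seed for reproducibility
half = len(shuffled) // 2
val_fires, test_fires = shuffled[:half], shuffled[half:]

print(train_fires)                       # ['fireA', 'fireB']
print(len(val_fires), len(test_fires))   # 2 2
```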

Model

Model Setup

Relevant Files:

  • model_components.py: Different loss functions and torch models to use with main_model.py. Each model has its own forward pass.
  • main_model.py: Main model to use with lightning_module.py. Chains forward passes and sums loss functions from the individual model_components.

Models: Models are created with model_components that can be chained together using the --model-type-list command line argument. Intermediate supervision from tile_labels or image_labels provides additional feedback to each model_component. Models can be one of five types:

  1. RawToTile: Raw inputs -> tile predictions
  2. RawToImage: Raw inputs -> image predictions
  3. TileToTile: Tile predictions -> tile predictions
  4. TileToImage: Tile predictions -> image predictions
  5. ImageToImage: Image predictions -> image predictions

Special models include:

  1. Feature Pyramid Networks
  2. Backbones incorporating optical flow
  3. Object detection models
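Conceptually, chaining model_components means each component's forward pass feeds the next, and the per-component losses (driven by intermediate supervision) are summed. Below is a torch-free sketch of that idea; the component classes and their arithmetic are made up for illustration, while the real components live in model_components.py and are selected via --model-type-list:

```python
class Component:
    """A stand-in model component: maps inputs to predictions and a loss."""
    def __init__(self, name):
        self.name = name

    def forward(self, x, labels):
        # Toy computation; the real components are torch modules.
        preds = [v + 1 for v in x]
        loss = sum(abs(p - l) for p, l in zip(preds, labels))
        return preds, loss


class MainModel:
    """Chains components (as main_model.py does) and sums their losses."""
    def __init__(self, components):
        self.components = components

    def forward(self, x, labels):
        total_loss = 0
        for c in self.components:
            x, loss = c.forward(x, labels)  # output of one component feeds the next
            total_loss += loss              # intermediate supervision on each step
        return x, total_loss


model = MainModel([Component("RawToTile"), Component("TileToTile")])
preds, loss = model.forward([0, 0], labels=[1, 1])
print(preds, loss)  # [2, 2] 2
```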

Training

Relevant Files:

  • main.py: Kicks off training and evaluation. Contains many command line arguments to vary hyperparameters.
  • lightning_module.py: PyTorch Lightning LightningModule that defines the optimizers, training step, and metrics.
  • run_train.sh: Used to easily start training from main.py with command line arguments.

Steps to Run: To run training, use ./run_train.sh. You can check main.py for a full list of tunable hyperparameters as command line arguments.

Logging

Relevant Directories:

  • ./lightning_logs/ (currently not pushed to repo): Automatically generated on each run; logs & checkpoints are saved here.
  • ./saved_logs/ (currently not pushed to repo): It is suggested to move logs you want to keep long-term into this directory.

Steps to Access: Logs can be viewed with TensorBoard: tensorboard --logdir ./lightning_logs

Other Stuff

Useful Scripts:

  • ./scripts/paper_experiments.sh: kicks off all experiments useful for a research paper (different backbones, ablation study, object detection models)
  • ./scripts/labelme.sh: sets up LabelMe for generating additional annotations. Only runs partial setup; look at code within file for complete instructions.

Utility Notebooks:

  • helper.ipynb: code to average test metrics, calculate inference speed, and debug code.
  • visual_analysis.ipynb: code to visualize errors, generate videos, and create human experiment.

Troubleshooting

  • On Debian, change "torch_gtrxl" to "gtrxl_torch"
  • If the download_raw_data.sh script is blocked by robots.txt, add: -e robots=off

License

This repository is released under the Apache 2.0 license. Please see the LICENSE file for more information.
