
‼️ NOTE: Active development is now happening at https://github.com/open-forest-observatory/automate-metashape. This repository is no longer actively maintained. ‼️

Easy, reproducible Metashape workflows

A tool to make it easy to run reproducible, automated, documented Metashape photogrammetry workflows in batch on individual computers or as parallel jobs on a compute cluster. No coding knowledge required.

Setup

Python: You need Python (3.6, 3.7, or 3.8). We recommend the Anaconda distribution because it includes all the required libraries. When installing, if asked whether the installer should initialize Anaconda3, say "yes": Anaconda must be initialized upon install so that Python can be called from the command line. A way to check is to simply enter python at your command prompt and see whether the resulting header info includes Anaconda and Python 3. If it doesn't, you may still need to initialize your conda install. Alternative option: If you want a minimal Python installation (such as if you're installing on a computing cluster), you can install Miniconda instead. After installing Miniconda, you will need to install the additional packages required by our scripts (currently only PyYAML) using pip install {package_name}.
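
For example, to confirm the installation and add PyYAML:

python --version
pip install PyYAML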

Metashape: You must install the Metashape Python 3 module (Metashape version 2.0). Download the current .whl file and install it following these instructions (using the name of the .whl file that you downloaded).
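
The install itself is a standard pip install of the downloaded wheel, along these lines (substitute the actual name of the .whl file you downloaded):

pip install {name_of_downloaded_whl_file}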

Metashape license: You need a license (and associated license file) for Metashape. The easiest way to get the license file (assuming you own a license) is by installing the Metashape Professional Edition GUI software (distinct from the Python module) and registering it following the prompts in the software (note you need to purchase a license first). UC Davis users, inquire over the geospatial listserv or the #spatial Slack channel for information on joining a floating license pool. Once you have a license file (whether a node-locked or floating license), you need to set the agisoft_LICENSE environment variable (search online for instructions for your OS; look for how to permanently set it) to the path to the folder containing the license file (metashape.lic).
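
On Linux or macOS, for example, you might add a line like the following to your shell startup file (e.g., ~/.bashrc), substituting the actual folder that holds metashape.lic:

export agisoft_LICENSE={path_to_folder_containing_license_file}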

Reproducible workflow scripts: Simply clone this repository to your machine!

Usage

The general command line call to run the workflow has three components:

  1. Call to Python
  2. Path to metashape workflow Python script (metashape_workflow.py)
  3. Path to workflow configuration file (*.yml)

For example:

python {repo_path}/python/metashape_workflow.py {config_path}/{config_file}.yml

All processing parameters are specified in the .yml config file. There is an example config file in the repo at config/example.yml. Details on the config file are below.

Organizing raw imagery (and associated files) for processing

Images should be organized such that there is one root level that contains all the photos from the flight mission to be processed (these photos may optionally be organized within sub-folders), and no other missions. If the workflow is to include spectral calibration, ground control points (GCPs), and/or a USGS DEM, this root-level folder must also contain a corresponding folder for each. For example:

mission001_photos
├───100MEDIA
|       DJI_0001.JPG
|       DJI_0002.JPG
|       ...
├───101MEDIA
|       DJI_0001.JPG
|       DJI_0002.JPG
|       ...
├───102MEDIA
|       DJI_0001.JPG
|       DJI_0002.JPG
|       ...
├───gcps
|       ...
├───dem_usgs
|       dem_usgs.tif
└───calibration
        RP04-1923118-OB.csv

The names of the ancillary data folders (gcps, dem_usgs, and calibration) must exactly match these if they are to be a part of the workflow.

A sample RGB photo dataset (which includes GCPs and a USGS DEM) may be downloaded here (1.5 GB). Note this dataset has sparse photos (low overlap), so photogrammetry results are unimpressive.

The location of the raw imagery folder is specified in the configuration file passed to the metashape workflow script (see next section).

Workflow configuration

All of the parameters defining the Metashape workflow are specified in the configuration file (a YAML-format file). This includes directories of input and output files, workflow steps to include, quality settings, and many other parameters.
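
As a sketch, the input/output entries might look like the following; the parameter names here are illustrative, so treat config/example.yml as the authoritative reference:

photo_path: "/data/mission001_photos"
output_path: "/data/mission001_outputs"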

An example configuration file is provided in this repo at config/example.yml. The file contains comments explaining the purpose of each customizable parameter. To prepare a customized workflow, copy the config/example.yml file to a new location, edit the parameter values to meet your specifications, save it, and then run the metashape workflow from the command line as described above, passing it the location of the customized configuration file. Do not remove or add parameters to the configuration file; adding will have no effect unless the Python code is changed along with the addition, and removing will produce errors.
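
For example (paths are placeholders in the style used above):

cp {repo_path}/config/example.yml {config_path}/my_run.yml
# edit {config_path}/my_run.yml to set your parameters, then:
python {repo_path}/python/metashape_workflow.py {config_path}/my_run.yml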

The workflow configuration is saved in a processing log at the end of a workflow run (see below).

Batch workflow configuration

If you wish to run multiple iterations of a processing workflow with small differences between each, you can specify a "base" configuration YAML file that specifies the processing parameters that all runs will have in common, plus a "derived" configuration file that specifies how each individual run's parameters should differ from the base parameters. For an example, see config/base.yml (identical to the example.yml) and config/derived.yml. For each run, the derived YAML only needs to include the parameters that differ from the base parameters. Each separate run in the derived YAML should be given a name and surrounded by #### on each end (see example derived.yml). Then, use the R script R/prep_configs.R to generate a full YAML config file for each run. As arguments to the call to this R script, supply (1) the path to the directory containing the base and derived YAML config files, and (2) the path to the metashape_workflow.py script. The prep_configs.R script will create a full YAML file for each run, as well as a shell file that calls the Metashape workflow scripts once for each configuration (each run). All you have to do to execute all these runs in series is to call this automatically generated shell script.
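
For illustration, a derived YAML might look like the following sketch; the run names and the alignPhotos/downscale parameter are hypothetical placeholders, so see config/derived.yml for real parameter names:

#### run01-lowQuality ####
alignPhotos:
  downscale: 4

#### run02-highQuality ####
alignPhotos:
  downscale: 1

You would then generate the per-run configs and the batch shell script with a call along these lines ({config_dir} being the directory holding the base and derived YAML files):

Rscript --vanilla {repo_path}/R/prep_configs.R {config_dir} {repo_path}/python/metashape_workflow.py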

Workflow outputs

The outputs of the workflow are the following:

  • Photogrammetry outputs (e.g., dense point cloud, orthomosaic, digital surface model, and Metashape processing report)
  • A Metashape project file (for additional future processing or for inspecting the data via the Metashape GUI)
  • A processing log (which records the processing time for each step and the full set of configuration parameters, for reproducibility)

The outputs for a given workflow run are named using the following convention: {run_name}_{date_and_time}_abc.xyz. For example: set14-highQuality_20200118T1022_ortho.tif. The run name and output directories are specified in the configuration file.

Running workflow batches in serial on a single computer

Running workflows in batch (i.e., multiple workflows in series) on a single computer is as simple as creating a configuration file for each workflow run and calling the Python workflow script once for each. The calls can be combined into a shell script. The shell script might look like the following (note the only thing that changes is the name of the config file):

python ~/repos/metashape/python/metashape_workflow.py ~/projects/forest_structure/metashape_configs/config001.yml
python ~/repos/metashape/python/metashape_workflow.py ~/projects/forest_structure/metashape_configs/config002.yml
python ~/repos/metashape/python/metashape_workflow.py ~/projects/forest_structure/metashape_configs/config003.yml

Then it's just a matter of running the shell script.

Running workflow batches in parallel on a compute cluster

Running Metashape workflow batches in parallel on a cluster is as simple as submitting multiple jobs. Each job instructs the cluster to run the metashape_workflow.py script with a specified configuration file.

Example for the farm cluster (UC Davis College of Agricultural and Environmental Sciences)

You will need to install the Metashape Python module into your user account on farm following the Setup instructions above (including the instructions related to the Metashape license). This is easiest if you first install Miniconda and then install Metashape (along with PyYAML) into it.

Next you need to create a shell script that will set up the appropriate environment variables and then call python to execute the metashape_workflow.py file with a provided config file (save as farm_python.sh):

#!/bin/bash -l
source ~/.bashrc

# Write the hostname to the processing log
hostname -f

# Set ENV variable to a specific font so reports work
export QT_QPA_FONTDIR='/usr/share/fonts/truetype/dejavu/'

# Run the workflow
# First arg is the Metashape python workflow script,
# Second arg is the config file
python ${1} ${2}

Finally, to submit a Metashape job, you would run something like the following line:

sbatch -p bigmemh --time=24:00:00 --job-name=MetaDemo -c 64 --mem=128G shell/farm_python.sh python/metashape_workflow.py config/example.yml

The meanings of the sbatch parameters are explained in the linked resources above. Once you have submitted one job using the sbatch command, you can submit another so that they run in parallel (assuming your user group has sufficient resource allocation on farm). You can also put multiple sbatch commands into a shell script so that you only have to run the shell script.
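
Such a script might look like the following (job names and config files are placeholders; adjust the resource requests to your allocation):

sbatch -p bigmemh --time=24:00:00 --job-name=Run001 -c 64 --mem=128G shell/farm_python.sh python/metashape_workflow.py config/run001.yml
sbatch -p bigmemh --time=24:00:00 --job-name=Run002 -c 64 --mem=128G shell/farm_python.sh python/metashape_workflow.py config/run002.yml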

Preparing ground-control points (GCPs)

Because the workflow implemented here is completely GUI-free, it is necessary to prepare GCPs in advance. The process of preparing GCPs involves recording (a) the geospatial location of the GCPs on the earth and (b) the location of the GCPs within the photos in which they appear.

Metashape requires this information in a very specific format, so this repository includes an R script to assist in producing the necessary files based on more human-readable input. The helper script is R/prep_gcps.R.

GCP processing input files. Example GCP input files are included in the example RGB photo dataset under gcps/raw/. The files are the following:

  • gcps.gpkg: A geopackage (shapefile-like GIS format) containing the locations of each GCP on the earth. Must include an integer column called gcp_id that gives each GCP a unique integer ID number.
  • gcp_imagecoords.csv: A CSV table identifying the locations of the GCPs within raw drone images. Each GCP should be located in at least 5 images (ideally more). The table must contain the following columns (see the example rows after this list):
    • gcp: the integer ID number of the GCP (to match the ID number in gcps.gpkg)
    • folder: the integer number of the subfolder in which the raw drone image is located. For example, if the image is in 100MEDIA, the value that should be recorded is 100.
    • image: the integer number of the image in which the GCP is to be identified. For example, if the image is named DJI_0077.JPG, the value that should be recorded is 77.
    • x and y: the coordinates of the pixel in the image where the GCP is located. x and y are in units of pixels right and down (respectively) from the upper-left corner.
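
For example, the first rows of gcp_imagecoords.csv might look like this (values are illustrative; the first row records GCP 1 at pixel (2034, 1453) in 100MEDIA/DJI_0077.JPG):

gcp,folder,image,x,y
1,100,77,2034,1453
1,101,12,455,2890
2,100,81,3102,207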

These two files must be in gcps/raw/ at the top level of the flight mission directory (where the subfolders of images also reside). Identification of the image pixel coordinates where the GCPs are located is easy using the info tool in QGIS.

Running the script. You must have R and the following packages installed: sf, raster, dplyr, stringr, magick, ggplot2. The R bin directory must be in your system path, or you'll need to use the full path to R. You run the script from the command line by calling Rscript --vanilla with the helper script and passing the location of the top-level mission imagery folder (which contains the gcp folder) as an argument. For example, on Windows:

Rscript --vanilla {path_to_repo}/R/prep_gcps.R {path_to_imagery_storage}/sample_rgb_photoset
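
If any of the required R packages are missing, one way to install them from the command line is:

Rscript -e 'install.packages(c("sf", "raster", "dplyr", "stringr", "magick", "ggplot2"))'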

Outputs. The script will create a prepared directory within the gcps folder containing the two files used by Metashape: gcp_table.csv, which contains the geospatial coordinates of the GCPs on the earth, and gcp_imagecoords_table.csv, which contains the pixel coordinates of the GCPs within each image. It also outputs a PDF called gcp_qaqc.pdf, which shows the specified location of each GCP in each image in order to quality-control the location data. If left in this folder structure (gcps/prepared), the Metashape workflow script will be able to find and incorporate the GCP data if GCPs are enabled in the configuration file.


metashape's Issues

Implement YAML configuration files

This is to allow easy specification of alternate, documented configurations for Metashape runs.

  • YAML template (as placeholder/template for actual configs)
  • Script that defines Metashape parameter defaults, and reads YAML to replace defaults
  • Script should create an importable object containing the parameters and their values
  • Script should output parameters to processing log

Implement master metashape control script

Script should be passed a path to raw drone images and a YAML configuration file. The script will run a single Metashape project from start to finish.

  • Script will run a Metashape pipeline including loading images, aligning photos, and computing and exporting any outputs as specified in the YAML, using the parameters defined in the YAML.
  • Script should save benchmarking data (timings per step, hardware specs, etc.)
  • Detect and use all GPUs available, otherwise all CPUs available

Move GPU Enabling to a subscript

The code that checks for GPUs, reports their names, and enables them if available or configured can be its own script.

Configuration should include:

  • GPU_enable = True/False
  • GPU_num = numeric (number of GPUs to use if available)
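
A minimal sketch of what such a subscript might do, using the Metashape Python API's GPU controls; this is an illustration of the idea, not the repository's implementation, and the function name and parameters are hypothetical:

import Metashape

def enable_gpus(gpu_enable=True, gpu_num=None):
    # Enumerate and report the available GPU devices
    gpus = Metashape.app.enumGPUDevices()
    for gpu in gpus:
        print("Found GPU: {}".format(gpu["name"]))
    if not gpu_enable or not gpus:
        # Disable GPU processing entirely (CPU-only)
        Metashape.app.gpu_mask = 0
        return
    # Use all GPUs, or only the first gpu_num of them if configured
    n = len(gpus) if gpu_num is None else min(gpu_num, len(gpus))
    # gpu_mask is a bitmask with one bit per enabled device
    Metashape.app.gpu_mask = (1 << n) - 1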

Allow partial metashape pipelines and resumption of previously-run projects

  • This would allow a user to run downstream processing (e.g., orthoimage generation) using existing upstream data outputs (e.g., dense point cloud or aligned photos). Thus, a user could produce, for example, multiple orthoimages (each with different stitching parameters) without having to repeat the photo alignment for each one.

  • Need to determine if this can be built into the master control script and YAML config, or whether it is more sensible to code separately.

  • Need to determine a good way to keep track of Metashape project files so that they can be reliably re-located when they are needed for resuming an existing project/pipeline.

  • One option is to use the same master control script and YAML config, and allow the flexibility to specify which components of the pipeline to run and also to specify an existing project to load if it exists.

  • If a project is resumed, need a way to load its processing parameters (e.g., of dense cloud generation) so that they can be added to the log that is output for the new (downstream) partial processing components (e.g., orthoimage generation).

  • If a project is resumed, need to save any modifications (due to additional downstream processing) as a new project rather than modifying the existing project file. This will allow the original project file to be re-used repeatedly.

Implement batch job creation script

Script will create a series of shell commands, each one calling the master metashape control script with a specific photo directory and YAML config file. Detect whether on cluster or local machine and create shell commands appropriately. Potentially create one folder for all outputs of the batch.
