smnorris / becmodel


Modelling tool for generating BC Biogeoclimatic Ecosystem Classification (BEC) polygons

License: Apache License 2.0

Python 98.52% Shell 1.48%

british-columbia bec ecosystems biogeoclimatic-zones

becmodel's People

Contributors

Watchers

Forkers

flnro-smithers-research

becmodel's Issues

ad hoc parsing beclabel strings by length is error prone

when looking for 'high' elevation in relation to woodland, we look at the beclabel string:
https://github.com/smnorris/becmodel/blob/master/becmodel/main.py#L252

woodland = ["ESSFxcw"]  # woodland[0] is the woodland beclabel
# the high code to be found is "ESSFxc 1"
high = woodland[0][:-1]  # "ESSFxc"; this depends on non-padded beclabel strings

Reworking the code so that beclabel strings are always padded would work, but for easier maintenance it is probably better to parse the string explicitly ahead of time to derive zone, subzone, variant, and phase, and then work with those.
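A minimal sketch of explicit parsing follows. It assumes a fixed-width layout of 4 characters for zone, 3 for subzone (so the woodland/parkland modifier is the 7th character), 1 for variant and 1 for phase, 9 characters in total. The `BecLabel` type and `parse_beclabel` function are hypothetical names, not part of becmodel.

```python
from typing import NamedTuple


class BecLabel(NamedTuple):
    zone: str
    subzone: str
    variant: str
    phase: str


def parse_beclabel(label: str) -> BecLabel:
    """Split a beclabel into its parts, padded or not.

    Assumed layout: zone (4 chars), subzone (3), variant (1), phase (1).
    """
    if len(label) > 9:
        raise ValueError(f"beclabel longer than 9 characters: {label!r}")
    padded = label.ljust(9)  # pad so that slicing by position is always safe
    return BecLabel(
        zone=padded[0:4].strip(),
        subzone=padded[4:7].strip(),
        variant=padded[7:8].strip(),
        phase=padded[8:9].strip(),
    )


woodland = parse_beclabel("ESSFxcw")
# zone plus the first two subzone characters identify the matching "high" label
high_prefix = woodland.zone + woodland.subzone[:2]
print(woodland, high_prefix)
```

Working with named parts means the lookup no longer depends on whether the input string happens to be padded.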

cross-rule poly high elevation noise removal

Aggregating alpine/parkland/woodland across rule polys should actually be pretty straightforward - as per the individual rule poly logic:

build lookup mapping all becvalues to alpine/parkland/woodland/high
loop through the mapping
- extract alpine/etc
- remove small holes
- apply removed holes to output image, ensuring to replace with the appropriate value for the given rule poly

trim trailing spaces from output BGC_LABEL values

current status:

BGC_LABEL,AREA_HECTARES
IMA un   ,308.2
IMA un   ,122.0
ESSFmmp  ,197.8
IMA un   ,1374.8
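As a sketch of the fix, trailing padding can be removed with `str.rstrip` just before the attribute is written. The `strip_label_padding` helper is hypothetical, and the rows are the sample values shown above.

```python
import csv
import io


def strip_label_padding(rows, field="BGC_LABEL"):
    """Return copies of the rows with trailing spaces removed from one field."""
    return [{**row, field: row[field].rstrip()} for row in rows]


sample = "BGC_LABEL,AREA_HECTARES\nIMA un   ,308.2\nESSFmmp  ,197.8\n"
rows = list(csv.DictReader(io.StringIO(sample)))
cleaned = strip_label_padding(rows)
print([row["BGC_LABEL"] for row in cleaned])
```

Only trailing spaces are removed, so internal padding such as the space in `IMA un` is kept.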

accept excel files as inputs

elevation and becmaster can be loaded from excel if we add a worksheet param or similar to the config

unexpected small "high" areas in output

Three cells of ESSFwc 3 are introduced along rule polygon edge in the high elevation filter step as shown in image below.

This is because of two issues:

1. the majority filter expands the "high" zone in the adjacent rule polygon very slightly, crossing rule polygon boundaries
2. when applying the high elevation filter, we aggregate all "high" areas and assign these in the output image based on the particular "high" value in the rule polygon (we don't just fill the holes)

Fixing 2 (only fill holes, use the intermediate noisefilter array for everything else) will fix the problem. We might also want to restrict the majority filter to only making changes within a given rule polygon.

area threshold config units in ha

rather than m2
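A threshold given in hectares still has to become a cell count for the raster filters. The conversion can be sketched as below; `hectares_to_cells` is a hypothetical helper and square cells are assumed.

```python
def hectares_to_cells(area_ha, cell_size_metres):
    """Convert an area threshold in hectares to a whole number of raster cells."""
    cell_area_m2 = cell_size_metres ** 2
    return int(area_ha * 10_000 / cell_area_m2)  # 1 ha = 10,000 m2


# a 15 ha threshold on a 50 m raster
print(hectares_to_cells(15, 50))
```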

tweak processing of high elevation labels

Exclude alpine/parkland/woodland from the two noise filters to prevent incorrect outcomes:

I ran a hypothetical test where one rule polygon has BAFA and the adjacent rule polygon is CMA. Sometimes the outcome is correct, but there are two outcomes that are incorrect. One is that the entire alpine area becomes either BAFA or CMA. The other is that if one side of the alpine is less than the noise threshold, it is deleted and not considered part of the shared BAFA/CMA alpine area (hence my suggestion to ignore alpine/parkland/woodland during the noise removal step). Small polygons of these types should be removed during the high elevation clean-up step. At least this is what I think is going on. We might need to run more tests of these ‘hypothetical’ situations now.

speed

For larger areas the processing can be sluggish (up to ~1hr for a DEM with size 4350, 3984). Consider splitting arrays into chunks and multiprocessing, or investigating a tool like dask.
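The splitting step alone might look like the sketch below. Neighbourhood filters read across chunk edges, so each chunk is read with a halo of extra rows and only its core rows are written back. `row_chunks` is a hypothetical helper, and the chunk and halo sizes are illustrative.

```python
def row_chunks(n_rows, chunk_rows, halo):
    """Yield (read_start, read_stop, core_start, core_stop) row windows.

    The read window extends the core window by `halo` rows on each side,
    clamped to the array bounds.
    """
    for core_start in range(0, n_rows, chunk_rows):
        core_stop = min(core_start + chunk_rows, n_rows)
        read_start = max(core_start - halo, 0)
        read_stop = min(core_stop + halo, n_rows)
        yield read_start, read_stop, core_start, core_stop


for window in row_chunks(n_rows=10, chunk_rows=4, halo=1):
    print(window)
```

Each window could then be handed to a worker process, with the halo at least as large as the largest filter radius in cells.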

log config

On model run, write the script version and the config parameters that were provided to an ini format text file, e.g.:

[VERSION]
becmodel_version = 0.03

[USER]
rulepolys_file = path/to/input/becmodel.gdb
rulepolys_layer = myrulepolys
out_file = mybecmodel.gpkg
noise_removal_threshold_ha = 15

[DEFAULT]
elevation = elevation.xls
temp_folder = tempdata
out_layer = becmodel
cell_size_metres = 50
cell_connectivity = 1
dem_prefilter = False
high_elevation_removal_threshold_ha = 100
aspect_neutral_slope_threshold_percent = 15
aspect_midpoint_cool_degrees = 0
aspect_midpoint_neutral_east_degrees = 90
aspect_midpoint_warm_degrees = 200
aspect_midpoint_neutral_west_degrees = 290
majority_filter_steep_slope_threshold_percent = 25
majority_filter_size_slope_low_metres = 250
majority_filter_size_slope_steep_metres = 150
expand_bounds_metres = 2000
high_elevation_removal_threshold_alpine = AT,BAFA,CMA,IMA
high_elevation_removal_threshold_parkland = p,s
high_elevation_removal_threshold_woodland = w
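A log in this shape can be produced with the standard library `configparser`, sketched below. `write_config_log` and `_to_text` are hypothetical names, and the values are taken from the sample above. Note that `configparser` always writes the `[DEFAULT]` section first, so the section order differs from the sample.

```python
import configparser
import io


def _to_text(value):
    """Render a config value as ini text; lists become comma separated strings."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(item) for item in value)
    return str(value)


def write_config_log(version, user_config, default_config, stream):
    """Write version, user-supplied and default parameters to an ini stream."""
    # interpolation=None so that a literal "%" in a value is written unchanged
    parser = configparser.ConfigParser(interpolation=None)
    parser["VERSION"] = {"becmodel_version": version}
    parser["USER"] = {key: _to_text(value) for key, value in user_config.items()}
    parser["DEFAULT"] = {key: _to_text(value) for key, value in default_config.items()}
    parser.write(stream)


log = io.StringIO()
write_config_log(
    "0.03",
    {"out_file": "mybecmodel.gpkg", "noise_removal_threshold_ha": 15},
    {
        "cell_size_metres": 50,
        "high_elevation_removal_threshold_alpine": ["AT", "BAFA", "CMA", "IMA"],
    },
    log,
)
print(log.getvalue())
```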

note becmodel version commit in logfile

For iterative small changes, it would be useful to note exactly what commit generated a given output - something like this: https://stackoverflow.com/questions/14989858/get-the-current-git-hash-in-a-python-script

This is easy enough on my end but applying this in the GTS environment may not work. Maybe just check to see if git is available and if so, note commit as well as version.
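A sketch of the "check whether git is available" approach, using only the standard library. `get_git_commit` is a hypothetical helper. It returns `None` whenever git is missing or the directory is not a usable repository, so the log can fall back to the version number alone.

```python
import shutil
import subprocess


def get_git_commit(repo_dir="."):
    """Return the current commit hash, or None if it cannot be determined."""
    if shutil.which("git") is None:
        return None  # git is not installed or not on PATH
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None  # not a git repository, or the directory does not exist
    return result.stdout.strip() or None


commit = get_git_commit()
print(commit if commit else "commit unknown")
```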

Create smoother output by applying filters to DEM and/or aspect

define lists in config via comma separated strings

Currently we cannot supply parameters with list values (high_elevation_removal_threshold_alpine etc) via config file.
Handle these in config file with a comma delimited string: https://stackoverflow.com/questions/335695/lists-in-configparser
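The comma delimited approach can be sketched as follows. `parse_list` is a hypothetical helper and the `[CONFIG]` section name is an assumption for illustration.

```python
import configparser


def parse_list(value):
    """Split a comma delimited config string into a list of trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(
    """
[CONFIG]
high_elevation_removal_threshold_alpine = AT,BAFA,CMA,IMA
high_elevation_removal_threshold_parkland = p,s
high_elevation_removal_threshold_woodland = w
"""
)
alpine = parse_list(parser["CONFIG"]["high_elevation_removal_threshold_alpine"])
woodland = parse_list(parser["CONFIG"]["high_elevation_removal_threshold_woodland"])
print(alpine, woodland)
```

A single value such as `w` still comes back as a one-item list, so downstream code can treat all three parameters the same way.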

remove second majority filter

From Adrian:

The second majority filter was creating ‘islands’ of alpine/parkland along ridges that were then being removed by the subsequent noise filter. These deleted alpine/parkland areas were present in the AML map output. I commented out the second majority filter and the output looks more consistent with the AML method.

source rule polygon linework in output

Output should use source rule polygon lines rather than rasterized linework. This is likely a post-processing puzzle piece step.

aspect classification

With only 3 aspect classes (warm/cool/neutral), the results of aspect rule classification can be inconsistent depending on the orientation of valley features. Valleys oriented E-W may be less likely to have side-valleys reclassified due to aspect than subvalleys on larger valleys with diagonal orientation.

To apply aspect changes consistently, we could increase the number of classes and (internally) expand the elevation table.

To support this with maximum flexibility (coolest/warmest may not be 0/180°N, aspect zones may not all be the same size), allow the user to supply custom aspect classes and build an aspect class dictionary in the form {minimum_aspect: aspect_class}, starting at 0 degrees.

Define breaks by starting at 0°N, then moving around the compass, creating breaks at the minimum angle of each class. Define at least 3 aspect_class values, where aspect_class = 1 is the coolest and max(aspect_class) is the warmest.
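Looking up a class from such a dictionary can be sketched with `bisect`. The break values below are purely illustrative (they are not the model's current classes), and `classify_aspect` is a hypothetical helper.

```python
from bisect import bisect_right


def classify_aspect(aspect_degrees, breaks):
    """Return the aspect class for an aspect, given {minimum_aspect: aspect_class}."""
    if 0 not in breaks:
        raise ValueError("breaks must start at 0 degrees")
    minimums = sorted(breaks)
    # find the last break that is <= the aspect, after wrapping into [0, 360)
    index = bisect_right(minimums, aspect_degrees % 360) - 1
    return breaks[minimums[index]]


# illustrative breaks only: 1 = coolest, 3 = warmest
breaks = {0: 1, 45: 2, 135: 3, 270: 2, 315: 1}
print(classify_aspect(200, breaks))
```

Because a class may appear more than once around the compass, the coolest class here spans both sides of north (315° to 45°).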

Example 1 - current aspect classes

Example 2 - shifting current zones 10° west

Example 3 - 10deg classes, originating at 0° N

DEM resampling

If comparing the results of becmodel to legacy AML, we need to know how the DEM is downsampled when downloaded via bcdata. See smnorris/bcdata#63

use external terraincache code

https://github.com/smnorris/terraincache just needs a few tweaks and upload to pypi.

tidy and enhance documentation

Go through README and verify/update/improve.
Potentially add script to export to pdf or similar.

process rule polygons individually

Processing rule polygons one at a time will do a couple of things:

ensure that we only aggregate parkland / scrub / woodland areas of the same type when removing areas below parkland_removal_threshold
enable easy re-incorporation of source rule polygon linework into output polys (#8)

process parkland

Remove parkland areas below parkland_removal_threshold via a similar technique to the noise removal. Parkland is identified in the class_name column of the elevation table. Exact logic of aggregation/removal needs to be taken from the legacy aml file.

output attributes

add: AREA_HECTARES, to 1 decimal place
remove: becvalue
rename: beclabel to BGC_LABEL

test best config performance on GTS

what files to have in temp folder?

add dem_path to config

Sometimes the TRIM DEM via WCS is not good enough.

The WCS DEM is clipped to the BC boundary... but the original TRIM files contain elevations outside of BC (along borders that are not defined by lat/lon lines).

We want to use TRIM data that spans borders because CDEM (the terrain-tiles source) includes artifacts at provincial borders that throw off becmodel aspect/slope calculations.

fix missing elevation file / link in README

request DEM for areas outside of BC

We need to buffer the study area in all areas, not just within BC, otherwise features can be lost (eg, parkland / alpine areas below threshold in BC but above when considering adjacent areas).

If a study area is on the border, pull additional DEM data from terrain tiles on aws: https://registry.opendata.aws/terrain-tiles/.

elevation provides a Python interface to the elevation tiles.

remove or prevent artifacts introduced by filters

Fix these artifacts identified by Adrian in the Robson test area:

validate elevation table high elevation beclabels

Ensure:

one each alpine/parkland/woodland code per rule polygon
parkland and woodland columns are equivalent except for 7th char
for high, there is a distinct beclabel corresponding to the beclabel for parkland/woodland code minus 7th char, plus whatever variant
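These checks could be sketched for a single rule polygon as below. The sketch assumes labels are padded to 9 characters with the parkland/woodland modifier in the 7th character, and it takes the alpine zones and modifier codes from the sample config in the "log config" issue. `validate_high_elevation_labels` is a hypothetical function.

```python
ALPINE_ZONES = {"AT", "BAFA", "CMA", "IMA"}
PARKLAND_CODES = {"p", "s"}
WOODLAND_CODES = {"w"}


def validate_high_elevation_labels(beclabels):
    """Return a list of problems found in one rule polygon's beclabels."""
    padded = [label.ljust(9) for label in beclabels]
    alpine = [lab for lab in padded if lab[:4].strip() in ALPINE_ZONES]
    others = [lab for lab in padded if lab not in alpine]
    parkland = [lab for lab in others if lab[6] in PARKLAND_CODES]
    woodland = [lab for lab in others if lab[6] in WOODLAND_CODES]

    errors = []
    for name, found in (("alpine", alpine), ("parkland", parkland), ("woodland", woodland)):
        if len(found) != 1:
            errors.append(f"expected one {name} label, found {len(found)}")
    if len(parkland) == 1 and len(woodland) == 1 and parkland[0][:6] != woodland[0][:6]:
        errors.append("parkland and woodland labels differ before the 7th character")
    for lab in woodland:
        # the "high" label shares the first 6 characters and has a blank 7th character
        if not any(o[:6] == lab[:6] and o[6] == " " for o in others):
            errors.append(f"no high label found for {lab.rstrip()!r}")
    return errors


print(validate_high_elevation_labels(["AT  un", "ESSFwcp", "ESSFwcw", "ESSFwc 3", "ESSFwk 2"]))
```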

revert to processing entire study area

Because retaining rule polygon vector lines has too many trade-offs, do not clip rule polys, just process the entire study area.

Do we continue to support beclabel AT un ?

It is not in the catalogue.

becmodel.util.DataValueError: These beclabel(s) in elevation table are misformatted or do not exist in bec_biogeoclimatic_catalogue: AT  un   , BAFA     , IMA      , SBS dh1

dynamically create values for aggregating high elevation labels

currently:

        high_elevation_aggregates = {
            "alpine": 63000,
            "parkland": 64000,
            "woodland": 65000,
        }

this is why the majority filter is currently so slow

  /Users/snorris/miniconda3/envs/becmodel/lib/python3.7/site-packages/skimage/filters/rank/generic.py:119: UserWarning: Bad rank filter performance is expected due to a large number of bins (65001), equivalent to an approximate bitdepth of 16.0.
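One way to derive these values dynamically is to number them directly after the largest becvalue in use, which keeps the maximum image value (and so the number of bins the rank filter reports) small. This is a sketch only, and `build_high_elevation_aggregates` is a hypothetical helper.

```python
def build_high_elevation_aggregates(becvalues, names=("alpine", "parkland", "woodland")):
    """Assign aggregate ids immediately above the largest becvalue in use."""
    start = max(becvalues) + 1
    return {name: start + offset for offset, name in enumerate(names)}


# illustrative becvalues for a small study area
print(build_high_elevation_aggregates([1, 2, 7]))
```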

rulepolygons

a separate rule poly table may not be necessary, the rules can simply come from the input polygons

add connectivity as a config item

Required for morphology operations and export to polygon.

1 = 4 cell connectivity
2 = 8 cell connectivity

use becvalue from whse as internal becvalue ids

Just so symbolizing temp QA rasters is consistent and easy.

output BGC_LABEL type

9 characters maximum

clip outer boundary with vector rule polys

The outer boundary of the study area / rule polygons should be derived directly from the input rule poly vector linework (not a stepped line based on the rasterized rule polys).

To accomplish this, somewhere in the model process we could aggregate the rule polygons, buffer the result by a distance 2-3x raster cell size and expand the rule polys to fit in this area. Then process the model and clip the resulting output vectors by the original rule poly vectors.

zero / nodata values at rule polygon edges

Where small objects have been removed along rule polygon edges there can be remnant 0/nodata values.

Possible fixes:

buffer the rule polys before running the high elevation noise removal (this is required for #8 anyway)
prevent small objects at edges from being removed (as per legacy aml)

BECModel objects not isolated

Setting the config on one BECModel object affects the config on another, different instance.

This isn't an issue for general usage via command line because a BECModel is only created once, but something is not quite right.

from becmodel import BECModel

A = BECModel()
B = BECModel()
print(A.config["aspect_pre_filter"])
print(B.config["aspect_pre_filter"])
B.config.update({"aspect_pre_filter": False})
print(B.config["aspect_pre_filter"])
print(A.config["aspect_pre_filter"])

output:

True
True
False
False
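The issue does not identify the cause. One common way to get this symptom in Python is for every instance to hold a reference to the same module-level dict. The sketch below reproduces the behaviour with hypothetical classes (not becmodel code) and shows a copy-per-instance alternative.

```python
import copy

DEFAULT_CONFIG = {"aspect_pre_filter": True}


class SharedConfigModel:
    def __init__(self):
        # every instance points at the same dict object
        self.config = DEFAULT_CONFIG


class IsolatedConfigModel:
    def __init__(self):
        # each instance gets its own independent copy
        self.config = copy.deepcopy(DEFAULT_CONFIG)


c, d = IsolatedConfigModel(), IsolatedConfigModel()
d.config.update({"aspect_pre_filter": False})
print(c.config["aspect_pre_filter"])  # unaffected by the change to d

a, b = SharedConfigModel(), SharedConfigModel()
b.config.update({"aspect_pre_filter": False})
print(a.config["aspect_pre_filter"])  # changed along with b
```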

cell size for majority filter

config majority filter sizes should be metres, not cells (to be consistent with other measures in config, and be resolution independent):

"majority_filter_low_slope_radius": 5,
"majority_filter_steep_slope_radius": 3,
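With sizes given in metres, the size in cells would be derived from the cell size at run time. A minimal sketch, with `metres_to_cells` as a hypothetical helper:

```python
def metres_to_cells(size_metres, cell_size_metres):
    """Convert a filter size in metres to a whole number of cells (at least 1)."""
    return max(1, round(size_metres / cell_size_metres))


# 250 m and 150 m filters on a 50 m raster
print(metres_to_cells(250, 50), metres_to_cells(150, 50))
```

At a 50 m cell size this gives 5 and 3 cells, matching the current values, and the same config would produce 10 and 6 cells at 25 m.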

interactive processing

Restructure the process slightly to return images for plotting in matplotlib. This should enable interactive raster creation by manipulating the config in a notebook session, rather than requiring a dump to geotiff and opening it in a GIS.

majority filter results

Check scikit-image - what does majority filter return when there is no single majority within the analysis window?

output layer has no spatial reference / projection

Support projects with no woodland

A given area may not have woodland - for example, zones may progress ESSFmm -> ESSFmmp -> alpine. This is not currently supported.

add AREA_HECTARES after clipping to project boundary

expand rule polys by expand_bounds_metres rather than 3 pixels

align output to standard

100m Hectares BC raster is the standard alignment, with bounds of:

159587.5 173787.5 1881187.5 1748187.5

noise_threshold = int(
        config["noise_removal_threshold"] / 625
    )

simplify elevation table

Just supply either the low or the high elevation value for each aspect, we can build the ranges internally.

eg, using high value:

POLYGON_NUMBER,COOL_MAX ,NEUTRAL_MAX ,WARM_MAX ,BECLABEL
121           ,10000    ,10000       ,10000    ,AT  un  
121           ,2050     ,2100        ,2150     ,ESSFwcp 
121           ,1850     ,1900        ,1950     ,ESSFwcw 
121           ,1700     ,1750        ,1800     ,ESSFwc 3
121           ,1450     ,1500        ,1550     ,ESSFwk 2
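Building the ranges internally from the high values can be sketched as below for one rule polygon: within each aspect, the lower bound of a label is the upper bound of the label beneath it. `build_elevation_ranges` is a hypothetical helper, the lowest range is assumed to start at 0, and the rows are the sample table above.

```python
def build_elevation_ranges(rows, aspects=("cool", "neutral", "warm")):
    """Return (beclabel, aspect, lower, upper) tuples for one rule polygon."""
    ranges = []
    for aspect in aspects:
        lower = 0  # assumed floor for the lowest label
        for row in sorted(rows, key=lambda r: r[f"{aspect}_max"]):
            upper = row[f"{aspect}_max"]
            ranges.append((row["beclabel"], aspect, lower, upper))
            lower = upper  # the next label starts where this one ends
    return ranges


rows = [
    {"beclabel": "AT  un", "cool_max": 10000, "neutral_max": 10000, "warm_max": 10000},
    {"beclabel": "ESSFwcp", "cool_max": 2050, "neutral_max": 2100, "warm_max": 2150},
    {"beclabel": "ESSFwcw", "cool_max": 1850, "neutral_max": 1900, "warm_max": 1950},
    {"beclabel": "ESSFwc 3", "cool_max": 1700, "neutral_max": 1750, "warm_max": 1800},
    {"beclabel": "ESSFwk 2", "cool_max": 1450, "neutral_max": 1500, "warm_max": 1550},
]
for item in build_elevation_ranges(rows):
    print(item)
```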

automatically number temp layers for qa

just to save fiddling when adding / removing processing steps