Comments (12)
Thank you very much for these suggestions @Jo-Schie! I'll have a think about this and begin incorperating your feedback later this week.
from wdpar.
Sorry, I've been swamped and am only getting to this today.
from wdpar.
Thanks again for providing all these suggestions @Jo-Schie! I've created a PR with all the changes to incorporate your feedback (see #57). To help make it clear exactly how I've addressed each of your points, I've responded to each of them below (separately) and copied in the updated text from the PR. Since JOSS policies favor shorter publications [1], I've tried to address several of the comments in the same sentence or paragraph. I think I've responded to all your points, but please let me know if I have missed anything?
-
Overview of the applied geo-processing steps in wdpa_clean. -> maybe this could be also a graphical representation.
I have deliberately written the paper to avoid describing the internals of the package in a high level of detail. This is because the WDPA and WDOECM will likely be updated at some point in the future which, in turn, would require changes to the package. If I updated the manuscript to describe the data cleaning procedures -- as they are currently implemented -- in a high level of detail and the package had to be updated in the future, this would cause the published paper to be incorrect. Indeed, I fear that a discrepancy between the published paper and the package documentation would be very confusing for users. Although I appreciate your feedback, I don't think this suggestion would improve the manuscript.
-
which geo-processing steps from wdpa_clean affect the original geometries?
I have updated the manuscript to make it clear which steps involve geoprocessing. It now includes the following sentences.
"These procedures include excluding areas that have yet to be fully implemented, areas that are no longer designated, and UNESCO Biosphere Reserves [@r1]. They also include geoprocessing procedures, such as repairing invalid geometries in spatial boundaries, buffering areas represented by point localities [@r3], and removing spatial overlaps [@r2]."
-
How do they affect the geometries and what are consequences for the use (e.g. question of scale)?
I have updated the manuscript to describe the consequences of using different levels of precision at different scales. Specifically, it now contains the following sentences.
"For example, the precision of spatial data processing procedures can be increased so that they are suitable for local scale analyses. This is especially important because the default precision may remove smooth edges at fine scales."
-
How do default settings affect geometries (discuss that default is country scale analysis)?
I have updated the manuscript to describe how the default settings affect the geometries, and note how the default settings are for national-scale analyses. It now includes the following sentences.
"Although the default settings for the data cleaning procedures are well-suited for national scale reporting of protected area coverage, they can be customized for other applications. For example, the precision of spatial data processing procedures can be increased so that they are suitable for local scale analyses. This is especially important because the default precision may remove smooth edges at fine scales."
-
How can we fine-tune settings e.g. to work on different scales and what are the effects on the geometries?
I have updated the manuscript to mention that settings can be fine tuned to work on different scales (please see text mentioned in previous comment). Because JOSS policies state that the manuscript should not describe API functionality [2], I have not updated the manuscript to mention the specific function or parameter that should be used. Instead, I have provided this information in the vignette, with the following sentences.
"Please note that, by default, spatial data processing is performed at a scale suitable for national scale analyses. If you require data suitable for fine scale analysis, please a higher level of precision when cleaning the data (e.g., using the augment:
geometry_precision = 10000
)." -
How are overlapping polygon boundaries resolved and what are the consequences for analysis -> e.g. if there are two overlapping areas with different IUCN categories, how is the boundary resolved and what are the consequences for calculating area statistics.
I have updated the manuscript to provide this information. It now contains the following sentences.
"Specifically, overlapping geometries are erased such that areas associated with more effective management categories or have historical precedence are retained."
-
I would recommend that you use interactive maps in your vignette instead of the map that you plotted
Thank you for this suggestion! Although I appreciate that this would help with data visualization, I do not believe this would improve the overall user experience. This is because I would need to add the mapview (or leaflet) R package as a dependency which, in turn, would -- because the mapview R package has many, many dependencies (see https://cran.r-project.org/package=mapview) -- result in excessive installation times and a needlessly large installation footprint for users that wish to try out the package. I strongly believe that it is my responsibility to ensure that the software I write does not needlessly pollute my users' computers.
[1] JOSS policies state that the manuscript should be a "short paper" and less than 1000 words (https://joss.readthedocs.io/en/latest/submitting.html)
[2] JOSS policies state the "the paper should not include software documentation such as API (Application Programming Interface) functionality, as this should be outlined in the software documentation" (https://joss.readthedocs.io/en/latest/review_criteria.html).
from wdpar.
Hi @jeffreyhanson : Thanks for the edits.
Some quick remarks:
as to6)
you sate that "Specifically, overlapping geometries are erased such that areas associated with more effective management categories or have historical precedence are retained."
-> What if there is an overlap of "old area/soft protection" with "new area/strong protection"? Is there some decision tree? If so it would be useful to illustrate it or link to the relevant literature that suggests the decision hierarchy at this specific point in the manuscript.
As to 1).
I think a short description of the processing steps should be somewhere. I can follow your argument that you do not want to put it into the paper because of eventual changes to the data and the associated processing steps... but then I would put it in the package vignette and reference from the paper to it. By this you can keep it updated. Otherwise it is not clear to users what really happens to the data in the background, unless they are willing and able to understand your R-code inside the package. The transparency though is a necessary precondition to use the package for scientific analysis.
As to 7)
From my understanding you do not need mapview
as a dependency in your package if you create the vignette with it. the Vignette in the end is the rendered html version of your rmarkdown document... are you sure you would need mapview
as a package dependency if you only use it in the vignette?
I think an alternative could be that you include some graphics in your documentation (as in my comment above) that show how geometries are affected by different settings in the package. Nevertheless I think an interactive map is much better for the user to explore what happens then some graphics.
I am thinking of this as a user who is interested in precise area information and I myself used your package with the default settings for quite some time without really checking what happened to the areas (because R is also not very convenient for that). The package vignette might be a great place to make users aware of that since it is the most relevant document that users of your package reference to.
from wdpar.
@Jo-Schie - thanks for getting back to me so quickly, I really appreciate it!
I'll respond to each of your follow up comments below (seperately).
-
What if there is an overlap of "old area/soft protection" with "new area/strong protection"? Is there some decision tree? If so it would be useful to illustrate it or link to the relevant literature that suggests the decision hierarchy at this specific point in the manuscript.
Yeah, the overlaps are handled such that areas with "weaker protection" are erased, and in case of ties -- when there are two/multiple overlapping protected areas with the same protection level -- the younger areas are erased. I remember that Runge et al. (2015) handled overlaps in a similar manner, wherein overlapping parts of protected areas with a weaker protection were erased. I've re-written the sentence to make this clearer and cited Runge et al. (2015). It now reads as follows.
"Specifically, overlapping geometries are erased such that areas associated with more effective management categories are retained [r22] and -- in cases where geometries with the same management category overlap -- areas associated with historical precedence are retained."
-
I think a short description of the processing steps should be somewhere. I can follow your argument that you do not want to put it into the paper because of eventual changes to the data and the associated processing steps... but then I would put it in the package vignette and reference from the paper to it. By this you can keep it updated. Otherwise it is not clear to users what really happens to the data in the background, unless they are willing and able to understand your R-code inside the package. The transparency though is a necessary precondition to use the package for scientific analysis.
I completely agree that it is important to describe the data cleaning procedures in detail. I concur, it is indeed a necessary precondition to use the package for scientific analysis. Although the procedures are not fully described in the vignette, the documentation for the
wdpa_clean()
function describes the procedures in a high level of detail (see https://prioritizr.github.io/wdpar/reference/wdpa_clean.html).Specifically, the data cleaning procedures (currently) comprise 20 individual steps. For each of these steps, the documentation describes (where relevant) the underlying function(s) (e.g., step 6 uses the
sf::st_wrap_dateline()
function); parameters that alter the behavior of the procedures (e.g., step 3 involves removing UNESCO Biosphere Reserves, and the documentation says that theexclude_unesco
parameter can be used to avoid this); relevant columns (in attribute table) of the data (e.g., step 9 buffers points by their reported area, and the documentation says which column in the data contains the reported area values); and relevant citations (e.g., step 3 involves removing UNESCO Biosphere Reserves, and Coetzer et al. 2014 is cited to provide readers with information on why this is important).Since the data cleaning procedures are fully described in the function documentation, I do not think repeating this information in vignette would improve the package documentation. Indeed, I worry that providing such a large amount of dense information in the vignette would be confusing, distracting, and intimidating for new users. This is because the vignette aims to provide a brief introduction -- as indicated by the "Getting started" header in the package website -- rather than a comprehensive manual. How does that sound?
-
From my understanding you do not need mapview as a dependency in your package if you create the vignette with it. the Vignette in the end is the rendered html version of your rmarkdown document... are you sure you would need mapview as a package dependency if you only use it in the vignette?
Yeah, I'm fairly sure. Specifically, the mapview package would need to be listed as an optional dependency in the Suggests field of the DESCRIPTION file. This is based on the official "Writing R Extensions" document -- describing best practices and criteria for CRAN publication -- which states that "The 'Suggests' field uses the same syntax as 'Depends' and lists packages that are not necessarily needed. This includes packages used only in examples, tests or vignettes (see Writing package vignettes), and packages loaded in the body of functions". Even though the dependency is optional, users would still need to download the mapview package -- and all of its many, many dependencies -- to run the vignette locally. Although it would be technically possible to add interactive maps and not list the mapview package as a dependency, this would violate CRAN policies and mean that I can no longer make the package available on CRAN---which I believe is not a desirable outcome. As such, I maintain that adding interactive maps to the package documentation would needlessly pollute users' computers and degrade their overall experience. Is that reasonable?
-
I think an alternative could be that you include some graphics in your documentation (as in my comment above) that show how geometries are affected by different settings in the package. Nevertheless I think an interactive map is much better for the user to explore what happens then some graphics. I am thinking of this as a user who is interested in precise area information and I myself used your package with the default settings for quite some time without really checking what happened to the areas (because R is also not very convenient for that). The package vignette might be a great place to make users aware of that since it is the most relevant document that users of your package reference to.
Thanks, I really like that idea! I've updated the vignette to include a new section that examines the consequences of using the default parameters for a local scale analysis (titled "Recommended practices for local scale analyses", please see attached html file). The new section provides recommendations for local scale analyses (i.e., suggesting which parameter should be customized), shows how the default parameters can result in protected area boundaries that are not suitable for such analyses, and shows how customizing the parameters can address this issue. As you suggest, the vignette uses graphics -- specifically, static maps -- to help readers see clearly why it is important to customize the spatial precision of the data cleaning procedures for local scale analyses. What do you think?
References
Coetzer KL, Witkowski ET, & Erasmus BF (2014) Reviewing Biosphere Reserves globally: Effective conservation action or bureaucratic label? Biological Reviews, 89: 82--104.
Runge CA, Watson JEM, Butchart HM, Hanson JO, Possingham HP & Fuller RA (2015) Protected areas and global conservation of migratory birds. Science, 350: 1255--1258.
Updated vigntte html file is available here: wdpar.zip
from wdpar.
@Jo-Schie, I just wanted to follow up and ask if the PR address your concerns on this issue?
from wdpar.
Hi @jeffreyhanson . I did not really realize that you'd already worked on this. I put it on my todo list and will come back asap.
from wdpar.
No worries - thanks! Please let me know if anything's not clear and I'll get back to you as soon as I can.
from wdpar.
@Jo-Schie, I just wanted to follow up and ask if you'd a chance to go through the PR?
from wdpar.
@Jo-Schie, it's been over a month since I originally created the PR to address your feedback, is there anything I can do to help make it easier for you to review it?
from wdpar.
hi @jeffreyhanson. Documentation looks very good to me and I fully understand that you do not want to add the mapview package if it creates a dependency for the package. Also it makes sense to have the processing steps in the function documentation and I think you do explain it in a great level of detail. Sorry for the delays. Happy to close the issue now.
from wdpar.
Awesome - no worries - thank you very much!
from wdpar.
Related Issues (20)
- Request to download individual PAs HOT 2
- Feature Request: Add functionality to keep UNESCO sites and not yet implemented areas HOT 15
- HTTP error 404 in "global" query with wdpa_fetch HOT 3
- Fails to download country-level data HOT 2
- Fails package checks under noSuggests config
- Poor internet connection breaks wdpa_fetch HOT 9
- Port error HOT 10
- It takes forever to eraseoverlap for global dataset. HOT 22
- wdpa_fetch HOT 6
- I used the newest version of wdpar, still have this "side location conflict" HOT 7
- The package not run for previous download data, and do not start a new download data HOT 6
- GEOS version sensitivity HOT 7
- upcoming sf breaks wdpar HOT 6
- JOSS Review: Improve Statement of need / description of use-cases HOT 16
- JOSS Review - Add links to references cited in README HOT 3
- JOSS Review warning about out of date local data HOT 6
- JOSS Review: Add small POC about Performance claims HOT 7
- JOSS Review: Add statement about state of the field HOT 4
- wdpa_fetch() no longer works HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from wdpar.