issues-and-projects's Issues

IRODS ingestion of gantry data

Task to do

Set up and document an automated process for ingesting gantry data into CyVerse Data Store (IRODS)

Reason

  • Ease of access from VICE (for workbench/data exploration)
  • Close to CyVerse & UA computational resources

Result

  • Scripts and configuration necessary for setting this up
  • Document (as text file in the code repository where the automation & logic end up)
  • Monitoring & alerting (at least documenting what success/failure looks like)

Steps to take

  • Julian to talk to Tony, Edwin and Max to see what's involved in getting this done, and then add next steps below

Provide technical documentation for base template-lidar-plot

Task to do
Create technical documentation for template-lidar-plot and add it to the Organization web site

Reason
Documenting the technical approach of the repo, and what the code provides, is helpful for developers

Result
Technical documentation on the Lidar plot-level template is available

Note:

  1. Starting with the existing template-rgb-plot technical documentation would most likely shorten the timeframe needed to produce the documentation

Create base plot-level Lidar Docker image to be used with the template-lidar-plot derived repos

Task to do
Create the base image used with the template-lidar-plot repo. This is similar to the rgb-plot-base-transformer folder, except for Lidar [try the following link if the previous one doesn't work and browse to a branch that has the folder: https://github.com/AgPipeline/drone-pipeline-transformer]

Reason
As part of making writing Lidar plot-level algorithms easier, a base image is needed to provide the context the algorithm runs in

Result
Ability to create new Docker images that can be used as the basis for plot-level Lidar algorithms

Steps to take

  • Create a new folder next to common-image in the ua-gantry-transformer repository to hold the files
  • Add a configuration.py file and fill it in with the correct information (a sketch follows the notes below)
  • Add a transformer.py file that provides the support needed for the lidar plot-level template
  • Add the Dockerfile, requirements.txt, and packages.txt files used to build the image for Lidar
  • Create a README.md file and fill it in to document the code
  • Create a Docker image that can be used
  • Integrate TravisCI to test only the add_parameters(), check_continue(), and perform_process() functions (don't add tests for other methods since they'll probably be moved; what's not moved will be added to TravisCI later), and to test the Docker image

Note:

  1. Clone the rgb-plot-base-transformer repo to get a quick start on what's needed
  2. There should be quite a bit of overlap between the RGB and Lidar repos; moving the overlap into a library is a separate issue
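The configuration.py mentioned in the steps above would carry the image's metadata. A minimal sketch follows, assuming variable names modeled on the existing rgb-plot-base-transformer; the actual names and values need to be confirmed against that repo before use.

```python
# configuration.py - hypothetical sketch for the Lidar plot-level base image.
# All variable names and values below are assumptions modeled on the RGB base
# transformer and must be checked against the real template.

# Version of this base transformer
TRANSFORMER_VERSION = '1.0'

# Human-readable description of what the base image provides
TRANSFORMER_DESCRIPTION = 'Plot-level Lidar base transformer'

# Maintainer contact information
AUTHOR_NAME = 'AgPipeline maintainers'
AUTHOR_EMAIL = 'maintainers@example.org'

# Sensor and transformer type identifiers written into generated metadata
TRANSFORMER_SENSOR = 'scanner3DTop'   # assumed gantry Lidar sensor name
TRANSFORMER_TYPE = 'lidar_plot'
```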

Add ability for transformers to download files after check_continue() call

Task to do
Enable the download of data after the check_continue() call returns an indication that data can be downloaded.

Reason
For some environments it's better to delay downloading data until conditions are right. This issue is to enable that functionality

Result
The environment is able to download data when needed
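A minimal sketch of how the calling environment might act on such an indication is shown below; the return-code convention and the download helper are assumptions, not the actual entrypoint API.

```python
# Hypothetical sketch: defer downloading until check_continue() signals that
# data can be fetched. The positive-code convention and environment.download_data()
# helper are assumptions for illustration only.
import logging

CODE_DOWNLOAD_REQUESTED = 1   # assumed convention: "continue, and fetch the data now"

def run_with_deferred_download(transformer, environment, params):
    result = transformer.check_continue(environment, **params)
    code = result[0] if isinstance(result, tuple) else result

    if code == CODE_DOWNLOAD_REQUESTED:
        # Only now pay the cost of pulling down (possibly large) input files
        environment.download_data(params)          # hypothetical helper

    if code >= 0:
        return transformer.perform_process(environment, **params)

    logging.warning("check_continue() declined to run (code %s)", code)
    return {'code': code}
```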

Use ua-gantry-pipeline template-rgb-plot to develop new version of Canopy Cover

Task to do
Redevelop Canopy Cover algorithm to use plot-level template

Reason
Reduces the specialized code that needs to be maintained

Result

  • Focused canopy cover algorithm (a calculation sketch follows the steps below)
  • Testing out new template-rgb-plot code
  • Demonstrates common algorithm sourcing and a possible implementation strategy

Steps to take

  • Create new repo to hold derived algorithms
  • Submodule current plot-level canopy cover (https://github.com/Chris-Schnaufer/canopy-cover)
  • Create Dockerfile to create new image
  • Create any other scripts to assist Dockerfile
  • Test new Docker image
  • Make new image available
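For context on what the focused algorithm computes, here is a minimal sketch of a plot-level canopy cover calculation on a soil-masked image; it illustrates the shape of the computation, not the exact code being migrated.

```python
# Sketch: percent canopy cover of a plot as the fraction of non-soil pixels
# in a soil-masked RGB image. Illustrative only.
import numpy as np

def canopy_cover_percent(masked_pixels: np.ndarray) -> float:
    """masked_pixels: HxWx3 array in which soil pixels have been zeroed out."""
    plant = np.count_nonzero(masked_pixels.max(axis=2))   # any non-zero channel counts as plant
    total = masked_pixels.shape[0] * masked_pixels.shape[1]
    return (plant / total) * 100.0 if total else 0.0
```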

Proposal to evolve transformer architecture

Task to do
Write a proposal and plan to modify the transformer architecture

Reason

We want to balance reliability, reproducibility, and developer ergonomics of the pipeline transformers

Result

Steps to take

  • Summarize and justify requirements for transformers (David starts this) #152
  • Explain why the current solution doesn't meet all of these; generalize this into an explanation of our decisions and an analysis of the trade-offs
  • Write a draft proposal to evolve the current solution so it meets all the requirements
    • migrate this to the wiki; summarize, etc.
    • dig up other relevant material too
    • explain why DAG workflows (see gist) vs. a message queue
  • Create an iterative plan with steps to implement the proposal
  • Get approval from David for proposal and the plan to implement it
  • Create epic with issues for the plan

Merge plot-level canopy cover from Chris Schnaufer's repo to AgPipeline repo

Task to do
Due to class requirements, the current canopy cover code was kept intact and development happened in Chris Schnaufer's UA account. This needs to be merged back into the AgPipeline organization.

Reason
Common source for calculating canopy cover

Result
AgPipeline canopy cover will be plot-level based and common with Drone Pipeline

Upgrade jupyter notebook to jupyter lab

Jupyter Lab seems to be the future of Jupyter and code notebooks, so everything we currently do in a regular Jupyter Notebook should also work with Jupyter Lab.

This shouldn't really be a problem since both use the same file format; Jupyter Lab simply has more user-facing functionality, but it's good to keep this in mind.

A small RGB dataset to test pipeline

Task to do
See #46 (comment)

Would it be possible to provide a small dataset that could be used to run this? It could be made available in .tar format on Google Drive under the account that Jorge set up.

Reason

To allow easy testing of the pipeline.

Result

One or more sample data sets (different sizes) available from a URL (e.g. Google Drive or CyVerse DE public link)

Process captured RGB data as it arrives through to canopy cover & save data products

Task to do
Pre-production testing of the process through the final analysis step, storing the resulting data products

Reason
Ensure the environment is set up to correctly process data and that all data products are available. Allows follow-on testing of plot-level analysis with Urbana-generated data

Result
Able to access final analysis results to compare against the current system. Able to compare intermediate data products against Urbana-generated data.

Fix problems with travis test integration

  • ua-gantry-transformer needs some dependencies that are proving troublesome to install
  • base-docker-support needs a little more input for some tests
  • template-transformer-simple might be missing certain elements in the base repo

I'll have to go over this with Chris to figure out what needs to be done.

Changes to initial cut of tests for base-docker-support/base-image

Task to do
Change some tests for better coverage

Reason
Changes to help expand testing in the future and for better coverage

Result

  • Run the testing code through pylint (using the rc file at the root of the Organization-info repo)
  • Place testing files into a test folder where possible (e.g. base-image/test)

test_entrypoint.py:

  • test_handle_result(): change 'print' to 'file' and perform a file check

test_transformer.py:

  • transformer.check_continue() actually returns a tuple (the template needed updating); see the sketch below
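Hypothetical pytest sketches of the two changes above; the handle_result() and check_continue() call signatures are assumptions and need to be matched to the actual base-docker-support code.

```python
# Sketches only: the signatures below are assumed, not taken from the real modules.
import json

def test_handle_result_writes_file(tmp_path):
    import entrypoint                                              # module under test in base-image
    out_file = tmp_path / "result.json"
    entrypoint.handle_result('file', str(out_file), {'code': 0})   # assumed signature
    assert out_file.exists()
    assert json.loads(out_file.read_text())['code'] == 0

def test_check_continue_returns_tuple():
    import transformer                                             # template's transformer module
    result = transformer.check_continue(None, {}, {}, [])          # assumed signature
    assert isinstance(result, tuple)
```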

Port laz/ply extractors: las2height

Task to do
Port las2height and panicle_detection extractors to transformers (for Eric's class as well)

Reason
Giving these higher priority so they can be used by the class

Result
Dockerized transformers

Ensure all captured data at ua-mac field site are delivered to UA

Task to do
Ensure all data captured at the Maricopa field site is shipped to UA.

Reason
This is not only a necessary first step but also allows longer-term testing of transfers before entering production mode, determination of space requirements, and other logistics.

Result
Regular data captures are available for UA processing

Steps to take

  • Data is transferred to UA
  • Data is appropriately stored with no overwrites or deletions in a manner allowing easy discovery
  • Data is available for processing

Create test cases for the new extractor template

Task to do
Create unit and functional tests for the non-docker-image and docker-image template

Reason
Preparation for integrating with TravisCI

Result
A fully tested extractor template

Convert base image container in base-docker-support to use Python3.7

Task to do
To support Python 3.7, change the links in /usr/bin for python3 to point to python3.7. Also add a python link that runs Python 3.

Reason
Python 3.7 has additional features that are useful for timestamp conversions, among other benefits

Result
Scripts will run Python 3.7 by default (not 3.6)

Commands to add to the Dockerfile

  • ln -sfn /usr/bin/python3.7 /usr/bin/python3
  • ln -sfn /usr/bin/python3.7m /usr/bin/python3m
  • ln -sfn /usr/bin/python3 /usr/bin/python
  • Create Pull Request

Determine minimum required transfer rate from Cache Server

Based on data transfer rates to date, it is clear that we don't need the 1 Gigabit line that we currently have, which costs thousands of dollars per month. What is the minimum transfer rate from MAC to the internet that we need so that we can keep up with historical data generation rates?

Discuss with JD and Sean Stevens, then let Matt Rahr know the requirements
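A back-of-the-envelope sketch of the calculation; the daily volume below is a placeholder that must be replaced with the measured gantry generation rate.

```python
# Minimum sustained rate needed to keep up with data generation (placeholder numbers).
daily_volume_tb = 3.0                        # placeholder: TB of new data per day
bits_per_day = daily_volume_tb * 1e12 * 8
required_mbps = bits_per_day / 86400 / 1e6   # sustained megabits per second
print(f"minimum sustained rate: {required_mbps:.0f} Mbit/s")
# ~278 Mbit/s with the placeholder above; add headroom for retries, catch-up
# after outages, and protocol overhead before picking a line speed.
```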

Fix problem when no TIF file specified to process

The current behavior or issue
The code crashes if no TIF file is specified on the command line.

The steps taken to reproduce the behavior or issue, or specify a location where the steps were recorded
To reproduce, run the code with no TIF file specified on the command line

Expected behavior
A warning gets reported and the code doesn't crash

Add other supporting information that may be useful
https://github.com/AgPipeline/transformer-soilmask/blob/6ec902e26a2d0bc2b33a4ef8aeb2ae051b8bec86/transformer.py#L363

Completion criteria
The code doesn't crash
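A minimal sketch of the kind of guard that would satisfy the completion criteria, warning and returning instead of crashing; how the file list reaches the code is an assumption about the surrounding transformer.

```python
# Sketch: bail out with a warning instead of crashing when no TIF is supplied.
import logging

def find_tif(file_list):
    tif_files = [f for f in file_list if f.lower().endswith(('.tif', '.tiff'))]
    if not tif_files:
        logging.warning("No TIF file specified on the command line; nothing to process")
        return None
    return tif_files[0]
```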

Move code to retrieve file's EPSG code to Transformer class

Task to do
The current code calls directly into terrautils to get the EPSG code. Moving this to the Transformer class allows the removal of a dependency and introduces flexibility that benefits the Drone Pipeline through common code

Reason
Moving the functionality allows greater flexibility in providing a common code solution

Result
The dependent code no longer calls directly into terrautils, allowing the Transformer class to better provide for its environment
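A sketch of the kind of helper the Transformer class could expose, reading the EPSG code directly with GDAL instead of going through terrautils; the method name is an assumption.

```python
# Sketch: retrieve a GeoTIFF's EPSG code via GDAL/OSR directly.
from osgeo import gdal, osr

class Transformer:
    @staticmethod
    def get_image_file_epsg(source_path: str):
        """Returns the file's EPSG code as a string, or None if it can't be determined."""
        dataset = gdal.Open(source_path)
        if dataset is None:
            return None
        srs = osr.SpatialReference(wkt=dataset.GetProjection())
        return srs.GetAttrValue('AUTHORITY', 1)
```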

Move common code for rgb-plot-base-transformer and lidar-plot-base-transformer to library

Task to do
Move common code to a library that's published

Reason
Common sourcing plot-level code allows easier updating of dependent applications and Docker images

Result
A published library containing common code for plot-level algorithm bases

Steps to take

  • Move common code to a separate library repo
  • Create the appropriate build environment (adding setup.py, etc.; see the sketch below) and build the library
  • Publish the library
  • Create a README.md that clearly defines the goals and details of the library (such as which online package repositories are supported, etc.)
  • Integrate with TravisCI to run pylint on code
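A hypothetical setup.py sketch for the shared library; the package name and metadata are placeholders, not the published names.

```python
# setup.py sketch for the common plot-level base library (placeholder metadata).
from setuptools import setup, find_packages

setup(
    name='plot-base-common',      # placeholder package name
    version='0.1.0',
    description='Common code shared by the RGB and Lidar plot-level base transformers',
    packages=find_packages(),
    install_requires=[
        # runtime dependencies shared by both base images go here
    ],
    python_requires='>=3.6',
)
```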

Makeflow pipeline steps should not call out to external databases

The current behavior or issue

Currently the agpipeline/cleanmetadata and agpipeline/canopycover steps call out to an external database (BETYdb at Illinois).

This will cause reliability, scalability and reproducibility problems.

Expected behavior

Every run of a pipeline (same code and input) should be deterministic and idempotent.

There can be a 'stateful' wrapper around a deterministic core which talks to external systems.

Completion criteria

(DRAFT SOLUTION)

  • An initialization step (pre-workflow-workflow?) to generate text (JSON) files or databases (sqlite or pglite); see the sketch below
  • Use these generated files as one of the inputs to the pipeline (along with current input like image files, etc.)
  • Modify steps which call out to external systems for input: Take the local files created above as their input
  • Modify steps which push results out to external systems: Produce local files as output
  • A finalization step (post-workflow-workflow?) to push the local generated result files to one or more external systems.

See https://github.com/terraref/workflow-pilot for inspiration.
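An illustrative sketch of the initialization step in the draft solution: snapshot whatever the pipeline needs from the external system into a local JSON file once, and have the workflow steps read only that file. The fetch_plot_boundaries() call and the URL are placeholders for whatever query the real step performs against BETYdb.

```python
# Sketch of the pre-workflow / post-workflow split (placeholder helpers).
import json
import os

def prepare_local_inputs(output_path: str) -> None:
    """Pre-workflow step: snapshot external state into a local input file."""
    betydb_url = os.environ.get('BETYDB_URL', 'https://betydb.example.org')
    plots = fetch_plot_boundaries(betydb_url)        # placeholder for the real query
    with open(output_path, 'w') as out_file:
        json.dump(plots, out_file)

def load_local_inputs(input_path: str) -> dict:
    """Workflow steps call this instead of talking to the external database."""
    with open(input_path) as in_file:
        return json.load(in_file)
```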

Finish makeflow workflow for drone processing pipeline

Task to do
Finish makeflow workflow for Drone Processing Pipeline

Reason
Migration from Clowder-only workflow

Result
Able to leverage the CyVerse environment

Steps to take

  • Finish workflow .jx file
  • Test workflow
  • Commit workflow to repo

Integrate extractor template with TravisCI

Task to do
After tests are built, integrate with TravisCI

Reason
Able to automatically test changes to extractor template

Result
Full integration with TravisCI and ability to make Pull Requests dependent upon successful CI runs

Create ua-gantry-pipeline template-rgb-plot repo

Take the drone pipeline template-rgb-plot code and develop the ua-gantry-pipeline equivalent.

Tasks:

  • Create a ua-gantry-compatible version of the drone pipeline code
  • Common-source the shared code (as a library or submodule) in a way that separates the RGB-specific functions from the (future) Lidar-, FLIR-, etc.-specific code
  • Update/provide documentation for ua-gantry-pipeline implementation

Come up with plan to move TERRA REF data archives from NCSA to UA GDrive

Need to archive data from the Storage Condo and Nearline tape at UIUC on GDrive using Globus transfer.

Coordinate with Sean Stevens to set up a plan. The allocation on the Storage Condo officially ends in December (?) and the tape in March of next year, so we need to transfer ~1 PB of the zipped archive files by then.

Create ua-gantry-pipeline template-lidar-plot repo

Task to do
Create the code structure and repository needed for simple plot-level lidar algorithms

Reason
Provides simplified interface for implementing plot-level lidar transformers

Result

  • algorithm developers will have minimal work to do to create plot-level lidar transformers
  • lower maintenance costs associated with lidar analysis algorithms through common code
  • able to quickly prove out new algorithms

Steps to take

  • create common code to base template-lidar-plot algorithms on (similar to rgb-plot-base-transformer)
  • create template repo for template-lidar-plot

Modify DPP workflow to add metadata to Clowder instance

Task to do
Enable loading metadata to Clowder

Reason
Leverage Clowder search capabilities

Result
Metadata is available in Clowder

Steps to take

  • Determine what data to store in Clowder (work with David)
  • Add the ability to create Clowder datasets (in spaces and collections) to hold metadata to the workflow environment
  • Add metadata upload to workflow

Add a base docker image override to configuration.py for template

Task to do
Add another variable to configuration.py that the generate_docker.py script can use as the base image

Reason
Currently there's only a command line parameter override that allows the base image to be changed when generating the Dockerfile. This will allow a more permanent change that doesn't rely on the command line parameter being remembered

Result
The variable will allow the base image for the Dockerfile to be overridden, while still allowing the command line override to take precedence

Steps to take

  • Update the template-transformer-simple repo to add and use the new variable
  • Document this variable and how it interacts with the command line parameter
  • Propagate update to ua-gantry-transformer and other template repos
  • Create Pull Request
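A sketch of the precedence described above: the command line override, if given, wins over a configuration.py value, which in turn wins over the built-in default. The BASE_DOCKER_IMAGE variable name is an assumption.

```python
# Sketch of base-image resolution for generate_docker.py (variable name assumed).
import configuration

DEFAULT_BASE_IMAGE = 'ubuntu:18.04'   # placeholder default

def determine_base_image(cmdline_override: str = None) -> str:
    if cmdline_override:                                             # command line wins
        return cmdline_override
    configured = getattr(configuration, 'BASE_DOCKER_IMAGE', None)   # new variable
    if configured:
        return configured
    return DEFAULT_BASE_IMAGE
```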

Move canopy height transformer to template-lidar-plot based transformer

Task to do
Use the new transformer-lidar-plot repo as the basis for canopy height analysis

Reason

  • reduces the overhead of maintaining canopy-height transformer
  • proves out the base image
  • provides demonstration of specialized templated transformers

Result
Specialized image for calculating canopy height

Create a template-lidar-plot repo and populate it

Task to do
Create a Lidar plot-level template similar to the template-rgb-plot template, except dealing with Lidar data.

Reason
Enables plot-level Lidar algorithm developers to easily create working workflow components

Result
Have a repository that can be used as a template to create new workflow algorithms

Steps to take

  • Create the algorithm_lidar.py template file, filled in (a skeleton sketch follows the notes below)
  • Create a generate.py executable script that's used to create the Dockerfile and supporting files
  • Create a testing.py executable script to be used when testing the algorithm
  • Obtain plot-level lidar files, ZIP them together, and place them on Google Drive alongside the RGB testing files
  • Create a HOW_TO.md file with detailed instructions on how to use the new template file
  • Let the AgPipeline owner know when the new repository is created so the correct permissions can be applied to the repo
  • Integrate with TravisCI: use the testing.py script to validate the default algorithm using the ZIP of Lidar data (include pylint, etc. as well)

Notes:

  1. cloning the template-rgb-plot repo to use as a starting point is probably the fastest approach to finishing this
  2. the generated Dockerfile should reference the Lidar specific base image (created through different ticket(s))
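A hypothetical skeleton of the algorithm_lidar.py template file; the calculate() entry point mirrors the shape of the RGB plot-level template, but the exact signature is an assumption to confirm against template-rgb-plot.

```python
# algorithm_lidar.py skeleton (illustrative; names and signature are assumptions).

ALGORITHM_NAME = 'my lidar algorithm'   # placeholder
VERSION = '1.0'

def calculate(plot_las_path: str) -> dict:
    """Receives the path to a plot-level LAS/LAZ file and returns the computed
    values as a dictionary of trait name to value."""
    # Algorithm developers replace this body with their plot-level calculation
    raise NotImplementedError("fill in the plot-level Lidar calculation here")
```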

Create Jupyter Notebook on how to develop a transformer

  • Review how other groups have done this (see slack message)
  • Show how to develop a transformer using a Jupyter notebook so that the result can be copied and pasted into an actual transformer.
  • Create a Docker image for a CyVerse DE app with the resulting notebook
