climate_indices's People

Contributors

benjimin, dawiedotcom, deepsource-autofix[bot], deepsourcebot, dependabot[bot], laura-guillory, monocongo, nnayda, oshin94, snyk-bot, weathergod

climate_indices's Issues

Make processing codes more generic to timeseries type (divisions, grid, stations)

Since there's nothing nClimGrid- or nClimDiv-specific in the processing codes, we should make them generic to the type of time series being processed: grid, division, or station. Once we've written the station processor code (process_stations.py), there will likely be quite a bit of code that can reasonably be factored out to a parent class and/or a core processing module (perhaps in indices.py, which may be preferable so as not to introduce object-oriented features into the code that could make numba optimization more difficult). Once this is complete it will be easier to create additional time series type processors in the future using the approach established in the new/refactored processor codes.

Resolution of this issue should be three new files:

process_divisions.py
process_grid.py
process_stations.py

...as well as an updated indices.py
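As a rough sketch of the factored-out core (all names here are hypothetical, not part of the existing codebase), the shared piece could be a layout-agnostic loop over 1-D timeseries, leaving each of the three processor scripts responsible only for its own I/O:

```python
import numpy as np

def process_timeseries(values, index_func, **kwargs):
    """Apply an index computation to each timeseries in a 2-D array.

    values has shape (series, time), where a "series" may be a grid
    cell, climate division, or station -- the core loop is agnostic
    to which. index_func computes the index for a single 1-D series.
    """
    results = np.full(values.shape, np.nan)
    for i in range(values.shape[0]):
        results[i] = index_func(values[i], **kwargs)
    return results

def standardize(timeseries):
    # toy stand-in for a real index function (standardized anomaly)
    return (timeseries - np.nanmean(timeseries)) / np.nanstd(timeseries)
```

Keeping the core a plain function (rather than a class hierarchy) would also sidestep the numba concern mentioned above.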

Migrate from Markdown to reStructuredText for documentation

  • use reStructuredText for updated README
  • establish documentation on readthedocs.io
  • additional use cases such as SPI-only scripts added to the README, perhaps migrated to separate repositories once the scripts are moved out of the main indices_python project into separate non-core, task-specific projects
  • modify the section with Anaconda info to specifically reference Anaconda3 and Miniconda3 to avoid confusion (thanks to Qing Yang for making this evident)
  • update the section describing how tests should be run, relevant to the new approach made possible as a result of #64
  • remove the instructions for installing conda modules piecemeal, as this is probably more confusing than helpful

https://docs.readthedocs.io/en/latest/getting_started.html
http://ericholscher.com/blog/2016/mar/15/dont-use-markdown-for-technical-docs/

Coverage/Coveralls configuration

Code coverage information is not being generated and reported to Coveralls, as evidenced by the project's Coveralls page continually reporting that no data is available. The coverage configuration appears to contain errors or be incomplete (perhaps focus on the .coveragerc file?); update it to fix this.

This is complete once we see the coverage percentage on the README's Coveralls badge move; it is currently stuck at 12% and unaffected by settings that should be taking effect, such as pragma: no cover comments.

Fix 'Complex Method' issue in palmer.py, _pdsi_from_zindex() function

Investigate whether this code's logic closely follows what is described in the relevant literature (Palmer 1965, Wells 2004).

If so, then look into how the code can be simplified and/or modularized further. Use vectorization where possible; reducing cyclomatic complexity is encouraged where reasonable.

If not, then address the divergence from the accepted/established algorithm, remeasure, and repeat.

CodeFactor found an issue: Complex Method

It's currently on:
palmer.py:989-1169
Commit fdd08e5

Publication with JORS?

I'm an associate editor with the Journal of Open Research Software and just wanted to reach out to ask if you'd considered publishing a software article so that you can get academic credit (i.e. citations) for all the hard work involved in releasing and maintaining your software?

Replace Palmers in process_grid.py

Palmer-specific code was commented out in the last commit of this file; restore it so we can again compute PDSI etc. with this processing script.

License change

The current license is GPLv3.

A more liberal license (MIT, BSD, ...) might be a better option to encourage adoption of the package.

Create develop branch

Create a develop branch in order to have a staging area for the latest code under development. A to-be-determined practice can then be put in place for promoting code into the master branch, with master perhaps serving as the more or less static home of the latest released/tagged version.

Fix 'Complex Method' issue in palmer.py, _z_sum() function

Investigate whether this code's logic closely follows what is described in the relevant literature (Palmer 1965, Wells 2004).

If so, then look into how the code can be simplified and/or modularized further. Use vectorization where possible; reducing cyclomatic complexity is encouraged where reasonable.

If not, then address the divergence from the accepted/established algorithm, remeasure, and repeat.

CodeFactor found an issue: Complex Method

It's currently on:
palmer.py:1501-1634
Commit fdd08e5

Update .travis.yml

I think with the new changes in setup.py the following can be done:

  • Remove requirements.txt
  • Remove environment.yml
  • Update .travis.yml. Instead of this:
  # environment.yml contains the dependencies, for an environment named 'indices_python' 
  - conda env create -q -f environment.yml
  - source activate indices_python
  - python setup.py install

script: 
  # run all tests with coverage 
  - export NUMBA_DISABLE_JIT=1  # disable numba JIT
  - coverage run --source=indices_python -m unittest tests/test_*.py

use this:

  - conda env create -n indices_python
  - source activate indices_python
  - pip install .

script: 
  # run all tests with coverage 
  - export NUMBA_DISABLE_JIT=1  # disable numba JIT
  - coverage run --source=indices_python setup.py test

I'm not sending a PR because this may need some trial and error, and it may be easier for you to make direct commits to check whether it works.

Create separate code to perform comparison analysis of results

Compare results of the new code against operational results for NCEI climate divisions, computing the following:

For each division:

  1. differences by calendar month, with line plots (expected, actual, diffs) (use matplotlib)
  2. differences per month, with diff maps (use either WCT or Fenimore's process)
  3. RMSE, by calendar month, and total
  4. % with change of sign, by calendar month, and total
  5. % with positive bias, % with negative bias

Write the code in a fashion that makes it easy to refactor out a base class for reuse with code that will do this for grids, so we can do a similar comparison using WRCC/WWDT PRISM datasets.
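Metrics 3–5 above could be sketched roughly as below (function and key names hypothetical; the real code would also group these by calendar month per item 1):

```python
import numpy as np

def comparison_stats(expected, actual):
    """Summary statistics comparing new results against operational
    (expected) values: RMSE, % of values with a sign change, and
    % with positive/negative bias."""
    diffs = actual - expected
    rmse = float(np.sqrt(np.mean(diffs ** 2)))
    # elements where the sign flipped between expected and actual
    pct_sign_change = float(np.mean(np.sign(expected) != np.sign(actual)) * 100)
    pct_positive_bias = float(np.mean(diffs > 0) * 100)
    pct_negative_bias = float(np.mean(diffs < 0) * 100)
    return {"rmse": rmse,
            "pct_sign_change": pct_sign_change,
            "pct_positive_bias": pct_positive_bias,
            "pct_negative_bias": pct_negative_bias}
```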

Add SPI/gamma for a sliding daily index

We need an SPI that can be computed on a daily basis, using a sliding X-day scale rather than the X-month scale with calendar-month granularity currently in place for all the scaled indices (SPI, SPEI, PNP, etc.). Once this is complete for SPI it should be straightforward to flesh out for the other scaled indices as well.
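A minimal sketch of the sliding-window step only, assuming a gap-free daily series (the distribution fit and normal-quantile transform would then be applied to these accumulations, typically grouped by day of year):

```python
import numpy as np

def sliding_sums(daily_values, scale_days):
    """Sliding X-day accumulations for a 1-D daily timeseries.

    Positions before a full window is available are NaN, mirroring
    how the monthly-scale code handles the initial fill period.
    """
    kernel = np.ones(scale_days)
    sums = np.convolve(daily_values, kernel, mode="valid")
    return np.concatenate([np.full(scale_days - 1, np.nan), sums])
```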

Missing data in SPI results

SPI processing results show missing values in locations where we expect data. See the attached images from a recent CMORPH SPI processing run (gamma, showing missing data, vs. Pearson Type III).

(attachments: cmorph_spi_gamma_01_0000-0359, cmorph_spi_pearson_01_0000-0359)

AppVeyor integration

AppVeyor was included in the webhooks for this project, and an attempt was made to remove the integration once it appeared that AppVeyor was primarily for .NET projects. That removal looks incomplete: an AppVeyor error still appears in the checks on the project, causing a red X next to the repo name. A further look at other AppVeyor Python projects shows that AppVeyor is also useful for Python, so we should instead fully integrate it in order to evaluate whether it is as good a service to have tied to the project as it appears to be.

  • appveyor.yml file
  • project config on AppVeyor site for project

Create a base test case class containing fixture members

Extend unit tests from a base class that'll contain commonly used fixtures, primarily numpy arrays we should match when computing indices and intermediates in the unit tests. This will eliminate code duplication and allow for keeping fixture data in a single class.
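A minimal sketch of the shape this could take (fixture names and values here are placeholders, not the project's actual fixture data):

```python
import unittest
import numpy as np

class FixturesTestCase(unittest.TestCase):
    """Base class holding fixture arrays shared across index unit tests."""

    @classmethod
    def setUpClass(cls):
        # placeholder fixture; in practice these would be the expected
        # inputs and index outputs currently duplicated across test modules
        cls.fixture_precips_mm = np.array([10.0, 0.0, 25.5, 3.2])

class TestExampleIndex(FixturesTestCase):
    def test_fixture_available(self):
        # subclasses inherit the shared fixture data
        self.assertEqual(self.fixture_precips_mm.shape, (4,))
```

Using setUpClass keeps the (potentially large) fixture arrays loaded once per test class rather than once per test method.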

Define python versions supported

It is not entirely clear which Python versions are supported right now.

As this is a new package I think you should support most modern Python versions, Python >= 3.5, but it should be clarified somewhere (docs, README, trove classifiers in setup.py, ...).

I've added Python 3.5 and 3.6 in #64, but it should be amended if you have other plans.

Add Tweedie distribution fitting

A paper describes how Tweedie distribution fitting can be used for drought monitoring with streamflow data: Statistical distributions for monthly aggregations of precipitation and streamflow in drought indicator applications; Svensson, Hannaford, and Prosdocimi, 2017

This is probably best tackled by creating a function in compute.py named transform_fitted_tweedie(), along the same lines as the existing transform_fitted_gamma() and transform_fitted_pearson() functions. Then these can be used to create spi_tweedie() in the indices.py module.

Credit to Curtis Riganti for highlighting the utility of this additional fitting. Thanks!
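For illustration, the fit/CDF/normal-quantile pattern that transform_fitted_tweedie() would follow can be sketched as below. SciPy has no built-in Tweedie distribution, so a Tweedie fit would need a third-party package; gamma is used here purely as a stand-in, and the function name is hypothetical:

```python
import numpy as np
from scipy import stats

def transform_fitted(values):
    """Fit a distribution to the values, map each value through the
    fitted CDF, then convert to standard-normal quantiles -- the common
    pattern of the existing transform_fitted_* functions.  A Tweedie
    version would swap in a Tweedie fit/CDF for the gamma stand-in."""
    shape, loc, scale = stats.gamma.fit(values, floc=0)
    probabilities = stats.gamma.cdf(values, shape, loc=loc, scale=scale)
    return stats.norm.ppf(probabilities)
```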

Remove pycurl dependency

You are using pycurl to perform file downloading. I think you could easily remove this dependency using urllib included in the stdlib so users don't have to install a third party dependency.

You could even remove the retrieve_file function from the utils module and use the following directly:

Now you have:

from indices_python.utils import retrieve_file

retrieve_file(url, local_file)

You could remove the retrieve_file function and the pycurl dependency and do the same using:

from urllib import request

request.urlretrieve(url, local_file)

Fewer dependencies, less code, fewer tests, using battle-tested stdlib functionality. Win-win.

Speed up CI

Travis CI builds currently take 4 minutes or so, and this can perhaps be reduced significantly by adopting new build practices such as dependency caching.

Name change for project

Change name in order to more uniquely/accurately reflect the nature of the project. Suggestions are very welcome!

repo reorganization

The repository is a little bit messy right now.

Folders like misc, notebooks, example_inputs, and scripts should be rethought.

Do you want to provide examples of library usage? Consider consolidating misc, notebooks, example_inputs, and scripts as notebooks in the notebooks folder.

Do you have other plans for these folders? Just try to clarify this, and separate what is the library itself from what is documentation/examples, tests, CI, ...
