gutils's Issues

Segment_id

I just realized this is the wrong place to ask this question, so I am closing the issue.

echometrics improvements

Continued from PR cf-convention/vocabularies#186

  • Use pytest test_slocum.py to exercise echometrics/pseudogram code to produce desired netCDF results
  • Always assign echometrics variables with extras dimension even if pseudogram is missing
  • Wire acoustic sensor configuration through deployment.json using extra_kwargs
  • Apply extra_kwargs to ascii to nc conversion
  • tests/test_slocum.py::TestEcoMetricsThree::test_pseudogram produces three netCDF files that need to be consistent
  • Manual running of ascii/netCDF produces one file; pytest produces three files [differences in json config files]

Deferred:

  • Apply extra_kwargs to dbd to ascii conversion (no current hooks to grab extra_kwargs from deployment.json)
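
A minimal sketch of what wiring extra_kwargs from deployment.json into the ascii to nc step might look like. The JSON layout (a top-level "extra_kwargs" key), the option names inside it, and both function names are assumptions made for illustration; they are not taken from the GUTILS code.

```python
import json


def load_extra_kwargs(deployment_json_text):
    """Return the extra_kwargs mapping from a deployment.json document, or {} if absent.

    Assumes extra_kwargs sits at the top level of the JSON document.
    """
    config = json.loads(deployment_json_text)
    return dict(config.get("extra_kwargs", {}))


def convert_ascii_to_netcdf(ascii_path, **extra_kwargs):
    """Hypothetical stand-in for the conversion step; it only echoes its options."""
    return {"source": ascii_path, "options": extra_kwargs}


# Illustrative deployment.json content (keys are made up for this sketch).
example = '{"glider": "unit_000", "extra_kwargs": {"pseudograms": {"enable": true}}}'

result = convert_ascii_to_netcdf("profile.dat", **load_extra_kwargs(example))
missing = load_extra_kwargs('{"glider": "unit_000"}')
```

Defaulting to an empty mapping keeps the conversion call identical whether or not a deployment defines acoustic sensor options.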

Assign_profiles() has errors

I found two bugs in the GUTILS assign_profiles() function.

The first is at line 77 of assign_profiles:

inflections = np.where(np.diff(delta_depth) != 0)[0]

This line treats every extreme point as a new profile. For example, if delta_depth = [..., 1.0, 0.0, -1.0, -1.0, ...] at indices 128 to 131, then inflections will be [0, 128, 129, 200, ...]. The span from 128 to 129 becomes a tiny profile at the extreme point, which looks like the graph below.
[Screenshot: 2018-03-12, 9:47:13 AM]

I also suspect the sampling algorithm: between lines 57 and 66 the data are resampled according to the time interval (tsint), which causes the offset seen in the graph below.
[Screenshot: 2018-03-12, 9:51:25 AM]
I think the resampled data can locate extreme points, but not very precisely.
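
A small sketch of the first bug, assuming delta_depth holds the sign of successive depth differences (the issue does not show how GUTILS computes it). The helper names and the forward-fill workaround are illustrative only, not the project's fix.

```python
import numpy as np


def naive_inflections(delta_depth):
    """Same expression as the line quoted in the issue."""
    return np.where(np.diff(delta_depth) != 0)[0]


def plateau_safe_inflections(delta_depth):
    """Carry the last non-zero direction across flat samples before diffing."""
    filled = np.asarray(delta_depth, dtype=float).copy()
    for i in range(1, filled.size):
        if filled[i] == 0:
            filled[i] = filled[i - 1]
    return np.where(np.diff(filled) != 0)[0]


# Dive to 3 m, hold one sample at the bottom, then climb.
depth = np.array([0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0])
delta_depth = np.sign(np.diff(depth))  # [1, 1, 1, 0, -1, -1, -1]

print(naive_inflections(delta_depth))         # [2 3]
print(plateau_safe_inflections(delta_depth))  # [3]
```

The naive expression reports two boundaries one sample apart, which is the tiny profile described above; filling the flat sample leaves a single turning point.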

Allow users to specify a CAC file if metadata is not contained in the binary files

For practical purposes, this sensor list is very rarely included in each file due to its large size relative to the actual sensor data (as an example, the binary file I'm working with is 78680 bytes and the ascii header takes up 78481 bytes). The operator can choose to 1) include it in every file, 2) include it in no files if there is already a copy of the sensor list on shore, or 3) include it in just the first file of the mission.

For our real-time operations, we generate this file (called a cache or cac file) on shore and then set up the glider to NEVER include it in the data files.

Once the files are on shore, the dbd2asc executable takes one or more binary data files and converts the binary data to ascii, using the sensor list either contained in the file or found at the location specified via dbd2asc -c /PATH/TO/SENSORLIST.cac. As far as I can tell, GUTILS provides no option to specify an external cac file. My guess is that the vast majority of users who would potentially use this repo will also NOT transmit this sensor list in every binary data file, but I can't say for sure.
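
One way the requested option could be exposed is to build the dbd2asc command line with an optional -c argument. This is only a sketch: build_dbd2asc_command, its parameters, and the file names are hypothetical and not part of GUTILS; the -c flag is used exactly as described above.

```python
import shlex


def build_dbd2asc_command(binary_files, cac_path=None, executable="dbd2asc"):
    """Assemble a dbd2asc argument list, adding -c only when a cac location is given."""
    command = [executable]
    if cac_path is not None:
        command += ["-c", str(cac_path)]
    command += [str(path) for path in binary_files]
    return command


# Hypothetical file names, for illustration only.
with_cac = build_dbd2asc_command(["segment.sbd"], cac_path="/PATH/TO/SENSORLIST.cac")
without_cac = build_dbd2asc_command(["segment.sbd"])

print(shlex.join(with_cac))
```

The resulting list can be handed to subprocess.run; keeping cac_path optional preserves the current behaviour for files that embed their own sensor list.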

utilization of parquet for intermediate storage

There are several steps involved in migration to parquet for intermediate processing of slocum glider data.

  • ensure dbdreader reproduces similar results to slocum binaries (smerckel/dbdreader#18)
  • replacement of convertDbds.sh with dbdreader/parquet
  • desired storage pattern for parquet (just using tables for now)

Is there a particular storage pattern or design desired for the parquet data structures?
REF: https://arrow.apache.org/docs/python/parquet.html#parquet-file-writing-options

  • Enforce version 2.4? 2.6?
  • Ensure the structure is queryable by time for speedy subsetting?
  • If enforcing 2.6, timestamp units become less of an issue
  • Partitioning? Glider ID, Deployment ID, Process method (rt vs. delayed), QC'd (Level 0, 1, ...)

Have to nail down a potential dbdreader issue first.

get_decimal_degrees() has errors near equator

From Kerfoot:

I found a bug in the GUTILS get_decimal_degrees() when trying to convert
small (near the equator) GPS positions from NMEA coordinates to decimal
degrees:

https://github.com/SECOORA/GUTILS/blob/master/gutils/gbdr/methods.py#L281

I rewrote the function to do a straight mathematical conversion instead
of converting to a string, parsing, etc. Here's the code:

def get_decimal_degrees(lat_lon):
    """Converts glider gps coordinate ddmm.mmm to decimal degrees dd.ddd

    Arguments:
    lat_lon - A floating point latitude or longitude in the format ddmm.mmm
        where dd's are degrees and mm.mmm is decimal minutes.

    Returns decimal degrees float, or None if lat_lon is not numeric
    """
    # Absolute value of the coordinate
    try:
        pos_lat_lon = abs(lat_lon)
    except (TypeError, ValueError):
        return None

    # Calculate NMEA degrees as an integer
    nmea_degrees = int(pos_lat_lon / 100) * 100

    # Subtract the NMEA degrees from the absolute value of lat_lon and divide by 60
    # to get the minutes in decimal format
    gps_decimal_minutes = (pos_lat_lon - nmea_degrees) / 60.0

    # Divide NMEA degrees by 100 and add the decimal minutes
    decimal_degrees = (nmea_degrees / 100) + gps_decimal_minutes

    if lat_lon < 0:
        return -decimal_degrees

    return decimal_degrees
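
For a quick check of the arithmetic near the equator, the same conversion can be written with divmod. This is an equivalent formulation for illustration, not Kerfoot's code; the function name is hypothetical and it omits the non-numeric guard.

```python
def nmea_to_decimal_degrees(lat_lon):
    """Convert ddmm.mmm to decimal degrees using divmod (numeric input only)."""
    sign = -1.0 if lat_lon < 0 else 1.0
    # Split into whole degrees (dd) and decimal minutes (mm.mmm).
    degrees, minutes = divmod(abs(lat_lon), 100.0)
    return sign * (degrees + minutes / 60.0)


print(nmea_to_decimal_degrees(30.0))    # 0.5   (0 degrees, 30 minutes north)
print(nmea_to_decimal_degrees(-30.0))   # -0.5  (0 degrees, 30 minutes south)
print(nmea_to_decimal_degrees(7530.0))  # 75.5
```

Values below 100 have zero whole degrees, which is the case the string-parsing approach is reported to mishandle.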

Unstable `axiom/gutils:latest` Docker image

I saw in the push.yml workflow (which runs on push and PRs) that the image is built and uploaded to axiom/gutils:latest as part of testing. This might cause some confusion since latest in other projects usually refers to the latest stable version, rather than testing.

- name: Build and push
  uses: docker/build-push-action@v2
  with:
    push: false
    tags: axiom/gutils:latest
    cache-from: type=local,src=${ BUILDX_CACHE }
    cache-to: type=local,dest=${ BUILDX_CACHE }
    outputs: type=docker

- name: Push latest image to Docker Hub if on master or main branch of the repo
  uses: docker/build-push-action@v2
  with:
    push: true
    tags: axiom/gutils:latest
    cache-from: type=local,src=${ BUILDX_CACHE }
    cache-to: type=local,dest=${ BUILDX_CACHE }

I'd recommend building locally for testing, or using the git SHA as the tag.

The publish workflow (tagging with the version name in publish.yml) is correct, though, so it should be used for production instead.


Not a big issue, just an FYI for the community.

pyupgrade

Suggestions

  • Wait on #22 (to avoid merge conflicts)
  • Add pre-commit.ci to the repo. Pre-commit is already in the push.yml GHA workflow.
    • Pros: Pre-commit hooks run in CI for pull requests (independent of dev install). No associated cost for public repos.
  • Modify .pre-commit-config.yaml to add pyupgrade
    • Pros: Automatically removes outdated Python syntax and converts code to take advantage of new features (e.g., f-strings, removing encoding comments)
    • Cons: Requires explicitly dropping support for older Python versions, as set by the args value of the pyupgrade hook

Addition to .pre-commit-config.yaml

- repo: https://github.com/asottile/pyupgrade
  rev: v3.15.0
  hooks:
    - id: pyupgrade
      args: [--py36-plus]
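
As a rough illustration of the kind of rewrite pyupgrade documents for --py36-plus, the two classes below behave identically; the second uses the newer syntax. The class names are made up for this example, and the actual changes depend on the code base and the pyupgrade version.

```python
# Before: older style that still runs on Python 3.
class LegacyProfile(object):
    def __init__(self, name, depth):
        super(LegacyProfile, self).__init__()
        self.name = name
        self.depth = depth

    def describe(self):
        return "{} at {} m".format(self.name, self.depth)


# After: implicit object base class, zero-argument super(), and an f-string.
class ModernProfile:
    def __init__(self, name, depth):
        super().__init__()
        self.name = name
        self.depth = depth

    def describe(self):
        return f"{self.name} at {self.depth} m"


print(LegacyProfile("unit_a", 5).describe())  # unit_a at 5 m
print(ModernProfile("unit_a", 5).describe())  # unit_a at 5 m
```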

`gutils` not available on Anaconda, and other questions

Hey there. I was very happy to find a FOSS package to interact with data output from Teledyne Webb Slocum Gliders.

Following the setup instructions in the readme, I found that gutils isn't on conda-forge. I assume this method of installation doesn't work (yet?).

It would be great to know:

  • how feature complete this package is
  • whether this package is in active development/what the current focusses are

If contributions are welcome, I'm happy to open some PRs tackling different areas (assuming my supervisor gives me the go-ahead).
