Code emerging from the 2024 IMOS Hackathon
aodn / imos-hackathon
Code emerging from the 2024 AODN Hackathon
License: GNU General Public License v3.0
see: #26 (comment)
In moving from NetCDF to more cloud optimised datasets like parquet, we need to address the changes in how "meta-data" is handled. The bottom line is that without the global attributes available in NetCDF, we'll need to carry over some of this "meta-data" as "data" columns for each spatial and time point record in the dataset.
The assumption is that, while this duplicates bytes in the file and is "wasteful" of storage, the real-world impact of the extra size on resource costs or access time won't matter. (??)
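A minimal sketch of the idea, using only the standard library so that it runs anywhere. The attribute names, values and record fields are hypothetical placeholders, not taken from any real IMOS file; in practice the same step would be applied to a data frame before writing parquet.

```python
# Hypothetical global attributes, standing in for those of a NetCDF file.
global_attrs = {
    "site_code": "EXAMPLE01",
    "platform": "mooring",
    "institution": "example institution",
}

# Hypothetical per-observation records (one per spatial and time point).
records = [
    {"time": "2024-01-01T00:00:00Z", "depth_m": 10.0, "temp_c": 21.4},
    {"time": "2024-01-01T01:00:00Z", "depth_m": 10.0, "temp_c": 21.3},
]


def carry_attrs(rows, attrs, keep):
    """Copy the selected global attributes onto every row as ordinary columns."""
    selected = {key: attrs[key] for key in keep}
    return [{**row, **selected} for row in rows]


# Only the attributes listed in `keep` are duplicated onto each record.
flat = carry_attrs(records, global_attrs, keep=["site_code", "platform"])
```

Parquet writers commonly dictionary-encode and compress repeated values, which is one reason the size penalty of such constant columns may be small, but that would need to be measured on real collections.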
To be calculated per individual/tag
Divide summaries based on shelf and off-shelf (200 m bathy line). @fjaine to confirm how to identify on and off shelf locations from summary data.
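How to identify shelf and off-shelf locations is still to be confirmed, so the following is only a sketch under one assumption: that each position already carries a seabed depth in positive metres. The field names `tag_id` and `bottom_depth_m` are hypothetical.

```python
from collections import Counter

SHELF_BREAK_M = 200.0  # the 200 m bathymetry line mentioned in the issue


def shelf_zone(bottom_depth_m):
    """Label a location by seabed depth (positive metres, an assumed convention)."""
    return "shelf" if bottom_depth_m <= SHELF_BREAK_M else "off-shelf"


def zone_counts_per_tag(positions):
    """Count the positions in each zone, separately for every tag."""
    counts = {}
    for pos in positions:
        zone = shelf_zone(pos["bottom_depth_m"])
        counts.setdefault(pos["tag_id"], Counter())[zone] += 1
    return counts


# Hypothetical positions for two tags.
positions = [
    {"tag_id": "A", "bottom_depth_m": 35.0},
    {"tag_id": "A", "bottom_depth_m": 1200.0},
    {"tag_id": "B", "bottom_depth_m": 180.0},
]
summary = zone_counts_per_tag(positions)
```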
GFW data download has been sorted in #29
Documentation needed with instructions on how to use it.
Variables: frontal index, depth, SST, chl-a, salinity, currents, SSH, climatologies?
Start simple... just a dictionary.
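One possible shape for such a dictionary, as a sketch only. The keys and the helper `describe` are hypothetical; the values are plain expansions of the abbreviations in the list, and sources, units and file locations are left out because the issue does not settle them.

```python
# Hypothetical starting point: one entry per variable named in the issue.
# "climatologies" is omitted because the issue lists it with a question mark.
ENV_VARIABLES = {
    "frontal_index": "frontal index",
    "depth": "depth",
    "sst": "sea surface temperature",
    "chl_a": "chlorophyll-a",
    "salinity": "salinity",
    "currents": "currents",
    "ssh": "sea surface height",
}


def describe(name):
    """Look up a variable by its short name, ignoring case."""
    return ENV_VARIABLES.get(name.lower(), "unknown variable")
```

Each value could later grow into a nested dictionary (source, units, access path) without changing how the variables are looked up.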
Comparing the outputs from the manual selection of the MLD using our GUI to different automated methods.
Some work is needed here: we need a way to load the same profiles that were selected at random.
Ultimate plan: send it to people as if it were a game, so we can build a good-sized random dataset for comparison with automated methods.
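One way to make the random selection repeatable is to seed the generator, so that every participant (and the later automated comparison) draws the same profiles. This is a sketch, not the GUI's actual code; `pick_profiles` and the integer profile IDs are hypothetical.

```python
import random


def pick_profiles(profile_ids, n, seed):
    """Return the same pseudo-random selection whenever the IDs and seed match."""
    rng = random.Random(seed)  # private generator, independent of global state
    # Sorting first makes the result independent of the input order.
    return rng.sample(sorted(profile_ids), n)


# Hypothetical usage: 100 profile IDs, 5 picked with a fixed seed.
first = pick_profiles(range(100), 5, seed=42)
again = pick_profiles(reversed(range(100)), 5, seed=42)
```

Storing the selected IDs alongside the manual picks would be a more robust alternative, since it does not depend on the random generator at all.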
GetAodn class?
GetAodn: provide a lazy data frame object - dask dataframe? Access parquet with dask?
Global attributes from netcdf files don't get carried through to parquet format.
What are the implications for the user with loss of metadata with parquet formats?
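For the GetAodn idea raised above, a rough sketch of what a lazy accessor could look like. Only the class name comes from the issue; the methods, the `root` location and the file layout are hypothetical. `dask.dataframe.read_parquet` is a real dask function, imported inside the method so the class can be created without dask installed; reading from S3 would additionally need a filesystem backend such as s3fs.

```python
from dataclasses import dataclass


@dataclass
class GetAodn:
    """Hypothetical entry point for the cloud-optimised collections."""

    # Placeholder root, not a real AODN location.
    root: str = "s3://example-bucket/cloud_optimised"

    def uri(self, dataset):
        """Build the (assumed) location of a parquet collection."""
        return f"{self.root.rstrip('/')}/{dataset}.parquet"

    def lazy_frame(self, dataset, columns=None, filters=None):
        """Return a dask DataFrame; row data is only read on .compute()."""
        import dask.dataframe as dd  # deferred import, needs dask installed

        return dd.read_parquet(self.uri(dataset), columns=columns, filters=filters)


aodn = GetAodn()
location = aodn.uri("example_dataset")
```

Passing `columns` and `filters` through lets dask skip columns and row groups that are not needed, which is most of the benefit of reading parquet lazily.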
Depends on issue #32
Plots to be done per species and per individual/tag
Michael has some code that does this that we can modify or use to close this issue.
To be calculated per tag
Calculations per tag/individual per day
YAML registry files - first steps
In AODN reloaded there will be a YAML registry file for each parquet or zarr collection. The ideas discussed by @lbesnard point at the opportunity to make use of the YAML registry files for data discovery and cataloging.
Explore the possibility of creating a catalogue of existing datasets (from the cloud-optimised products) and accessing them via Python.
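A small sketch of how such a catalogue might be built and searched from Python. The registry schema is not described in the issue, so the keys `dataset_name`, `format` and `description` are assumptions, and the entries are placeholders shaped like the dictionaries a YAML parser (for example PyYAML's `yaml.safe_load`) would return.

```python
def build_catalogue(registry_entries):
    """Index registry entries by dataset name."""
    return {entry["dataset_name"]: entry for entry in registry_entries}


def search(catalogue, text):
    """Return the names of datasets whose name or description contains `text`."""
    needle = text.lower()
    return sorted(
        name
        for name, entry in catalogue.items()
        if needle in name.lower() or needle in entry.get("description", "").lower()
    )


# Hypothetical entries; real ones would be loaded from the registry files.
entries = [
    {"dataset_name": "example_mooring_hourly", "format": "parquet",
     "description": "Hourly mooring time series (placeholder)"},
    {"dataset_name": "example_sst_gridded", "format": "zarr",
     "description": "Gridded sea surface temperature (placeholder)"},
]
catalogue = build_catalogue(entries)
matches = search(catalogue, "mooring")
```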
Create a Parquet version of the IMOS - Moorings - Hourly time-series product
Looks like the DMS have also done this, but probably only for sites on the GBR: https://github.com/aodn/rimrep-examples-private/blob/main/examples/poc/python/anmn-ltsp-hourly.ipynb
Figures TBA