xarray-ceos-alos2's People

Contributors

dependabot[bot], keewis, pre-commit-ci[bot]

Forkers

keewis, agrouaze

xarray-ceos-alos2's Issues

organize the metadata from the sar leader file

part of #17

The SAR leader file contains a huge amount of metadata, with lots of sections that on their own are often bigger than the volume leader file. As such, I think it makes sense to have a separate organization issue.

These are the sections that need to be transformed:

  • file descriptor (ignored, since it doesn't contain any new metadata)
  • dataset summary (#30)
  • map projection (#31)
  • platform position (#32)
  • attitude (#33)
  • radiometric data (#34)
  • data quality summary (#35)
  • facility-related data, 1-4 (ignored for now, see #38)
  • facility-related data, 5 (#36)

And finally, they need to be integrated into a single function.
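A minimal sketch of what such an integrating function could look like. The section names mirror the list above, but the function names, the placeholder transformer, and the assumed input layout (a mapping of section name to raw metadata) are all hypothetical and not taken from the library:

```python
# Hypothetical sketch: the per-section transformers are placeholders and
# the input is assumed to be a mapping of section name -> raw metadata.

def passthrough(section):
    """Placeholder transformer: return a shallow copy of the raw section."""
    return dict(section)


# Sections that are kept, each mapped to its (placeholder) transformer.
TRANSFORMERS = {
    "dataset_summary": passthrough,
    "map_projection": passthrough,
    "platform_position": passthrough,
    "attitude": passthrough,
    "radiometric_data": passthrough,
    "data_quality_summary": passthrough,
    "facility_related_data_5": passthrough,
}

# Sections that are dropped on purpose (file descriptor, facility data 1-4).
IGNORED = {"file_descriptor"} | {f"facility_related_data_{n}" for n in range(1, 5)}


def transform_sar_leader(raw):
    """Apply the per-section transformers and collect the results."""
    result = {}
    for name, section in raw.items():
        if name in IGNORED:
            continue
        if name not in TRANSFORMERS:
            raise ValueError(f"unknown SAR leader section: {name!r}")
        result[name] = TRANSFORMERS[name](section)
    return result
```

Keeping the ignored sections in an explicit set means an unexpected section raises an error instead of being dropped silently.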

`DataTree` backend

Once #10 is complete, we should be able to expose a DataTree reader function (open_alos2) and, once backend support for DataTree exists, a DataTree backend (xr.open_datatree).

expose a high-level object for I/O

Now that we can read the files, the next task is to create an I/O object that pulls all the different steps together and exposes the data that was read (like h5py.File, netCDF4.Dataset, or the zarr store). This involves:

  • reading all the metadata (volume directory, sar leader, sar image metadata, sar trailer, summary) on open and organizing the metadata and data into groups, variables, coordinates, and attributes (#17)
  • optionally creating cache files for the image files (#20)
  • providing lazy access to the data, including chunked reads (#14)

In particular, lazy data access is necessary for reading level 1.1 ScanSAR imagery efficiently, and it allows the use of dask.
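To illustrate the idea of lazy, chunked access, here is a standard-library-only sketch. It assumes a simplified layout (one fixed-size file header followed by fixed-size records); the real image files are more involved, and the class name and parameters are hypothetical rather than part of the library:

```python
import io


class LazyRecords:
    """Read fixed-size records from a binary file only when indexed.

    Assumed layout (a simplification): `offset` bytes of header,
    followed by `n_records` records of `record_size` bytes each.
    """

    def __init__(self, fileobj, *, offset, record_size, n_records):
        self._f = fileobj
        self._offset = offset
        self._record_size = record_size
        self._n_records = n_records

    def __len__(self):
        return self._n_records

    def __getitem__(self, key):
        size = self._record_size
        if isinstance(key, slice):
            start, stop, step = key.indices(self._n_records)
            if step != 1:
                raise NotImplementedError("only contiguous slices are supported")
            count = max(stop - start, 0)
            # Seek to the first requested record and read only what is needed.
            self._f.seek(self._offset + start * size)
            data = self._f.read(count * size)
            return [data[i * size:(i + 1) * size] for i in range(count)]

        index = key + self._n_records if key < 0 else key
        if not 0 <= index < self._n_records:
            raise IndexError("record index out of range")
        return self[index:index + 1][0]

    def chunks(self, chunk_size):
        """Yield lists of at most `chunk_size` records, one read per chunk."""
        for start in range(0, self._n_records, chunk_size):
            yield self[start:start + chunk_size]


# Usage with an in-memory file: a 4-byte header and five 2-byte records.
buffer = io.BytesIO(b"HDR!" + b"aabbccddee")
records = LazyRecords(buffer, offset=4, record_size=2, n_records=5)
first_chunk = next(records.chunks(2))
```

Nothing is read on construction. Each index or chunk triggers a single seek and read, and this per-chunk access pattern is what makes it possible to hand the data to a chunk-based scheduler such as dask.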

test coverage

While 100% test coverage is not realistic, coverage should be above 90%.

In the case of this library, the structure definitions themselves are not really testable (we can run them anyway, but that will only reveal bugs in the helper functions).

different file formats

The CEOS structure contains files with several different (but mostly similar) formats. To make use of all the data available, we need to read

  • the summary file (ascii lines with key / value pairs, general metadata): #4
  • the volume directory file (binary, mostly equivalent to a manifest): #5
  • the sar leader file (binary, mostly metadata): #6
  • several image files (binary, mostly data): #7
  • the sar trailer file (binary, mostly metadata): #9
  • several thumbnail images in JPEG format (ignored)
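As an illustration of the simplest of these formats, the sketch below parses key / value lines like those in the summary file. It assumes each line has the form `key="value"` (quotes optional); that separator, the function name, and the keys in the usage example are assumptions, not taken from the format specification:

```python
def parse_summary(text):
    """Parse ASCII lines of the assumed form key="value" into a dict."""
    entries = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            # Skip blank lines.
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {lineno}: expected key=value, got {line!r}")
        # Values are kept as strings; surrounding double quotes are removed.
        entries[key.strip()] = value.strip().strip('"')
    return entries


# Usage with made-up keys:
summary = parse_summary('scene_id="EXAMPLE"\npixel_spacing="2.5"\n')
```

All values stay strings here. Converting them to numbers or timestamps would be a separate step.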

publish releases

Start publishing the package to both PyPI and conda-forge.

  • PyPI
  • conda-forge

investigate the contents of the auxiliary files

The "facility related" data records in the SAR leader file contain several file dumps. In the absence of information about the content / format, these were skipped in #29, but we should still investigate to see whether they contain useful information.

documentation

With the reader being close to completion, it is time to add rendered user documentation (using Read the Docs).

organize the metadata

Metadata can come from 4 different kinds of files:

  • the summary (#23)
  • the volume directory (#28)
  • the sar leader file (#29)
  • the image files (#18, #22)

All of these need to be organized into a consistent structure.
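One possible shape for such a structure, sketched with plain dictionaries. The group names, the `attrs` / `groups` layout, and the function name are hypothetical; the actual organization is decided in the linked issues:

```python
def make_group(attrs=None, groups=None):
    """Create a node with its own attributes and named child groups."""
    return {"attrs": dict(attrs or {}), "groups": dict(groups or {})}


def organize_metadata(summary, volume_directory, sar_leader, images):
    """Combine metadata from the four kinds of files into one tree.

    `images` is assumed to map an image name to that image's metadata.
    """
    imagery = make_group(
        groups={name: make_group(attrs=meta) for name, meta in images.items()}
    )
    return make_group(
        attrs=summary,
        groups={
            "volume_directory": make_group(attrs=volume_directory),
            "sar_leader": make_group(attrs=sar_leader),
            "imagery": imagery,
        },
    )
```

Each kind of file gets a fixed place in the tree, so callers can find metadata without knowing which file it came from.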
