georust / netcdf
High-level netCDF bindings for Rust
License: Apache License 2.0
Hi, I think both netcdf and hdf5 support parallel reads. Glancing at the source of this project, there seem to be some global locks. Is parallel netCDF supported?
I am interested in the raw XDR-encoded data for a DAP server. Is it possible to access the raw bytes through this library?
Can someone clarify whether the thread safety applies to reading the same file in parallel, or to multiple reads from different files? Also, can the README please be updated? I did not know how to contact the organisation, so I made an issue instead. I apologize if this was wrong.
The crates.io version of this package (0.1.0) is from 2015, and the current version is still 0.1.0.
Maybe the version in Cargo.toml should be increased? At least to 0.2.0?
There are new methods such as Variable::values_at, etc., compared to the 2015 version.
Look to hdf5-rust for what this could look like, where the user can specify arbitrary types and derive NcType for serialisation and deserialisation to netCDF files.
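A hedged sketch of what such a derive could provide. All names here (NcType, Observation, type_descriptor) are invented for illustration, mirroring hdf5-rust's H5Type, and are not this crate's API:

```rust
// Hypothetical sketch: a trait like hdf5-rust's H5Type, here called NcType.
// All names are invented for illustration; this is not the crate's API.
trait NcType {
    /// (field name, netCDF type name) pairs describing a compound type.
    fn type_descriptor() -> Vec<(&'static str, &'static str)>;
}

#[allow(dead_code)]
struct Observation {
    lat: f64,
    lon: f64,
    temp: f32,
}

// What a `#[derive(NcType)]` macro might expand to:
impl NcType for Observation {
    fn type_descriptor() -> Vec<(&'static str, &'static str)> {
        vec![("lat", "f64"), ("lon", "f64"), ("temp", "f32")]
    }
}

fn main() {
    // The descriptor could drive creation of a netCDF compound type.
    assert_eq!(Observation::type_descriptor().len(), 3);
    assert_eq!(Observation::type_descriptor()[0], ("lat", "f64"));
}
```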
Opening a larger file double-caches all metadata, both on the libnetcdf side and on the Rust side. We could consider slimming the caching on our side and calling into libnetcdf more often. This would also help with the initial loading of the file.
Hello, thank you for the work on this project!
By any chance, is there a plan to add CF time attribute reading/parsing to handle the datetime type?
I can't find anything in the docs nor on crates.io.
I am thinking of something like Julia's CFTime or Python's cftime.
I believe it would be a great feature for the geophysical field.
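For reference, a minimal std-only sketch of parsing CF "units since epoch" strings. This deliberately skips calendar handling (leap years, "360_day", etc.), which a real CFTime port would need; parse_cf_units is an invented name:

```rust
// Minimal sketch of CF "units since epoch" parsing. No calendar handling
// (leap years, "360_day", etc.), which a real CFTime port would need.
fn parse_cf_units(units: &str) -> Option<(f64, String)> {
    // e.g. "hours since 1970-01-01 00:00:00" -> (3600.0, "1970-01-01 00:00:00")
    let (unit, epoch) = units.split_once(" since ")?;
    let seconds_per_unit = match unit.trim() {
        "seconds" | "second" | "s" => 1.0,
        "minutes" | "minute" | "min" => 60.0,
        "hours" | "hour" | "h" => 3600.0,
        "days" | "day" | "d" => 86400.0,
        _ => return None,
    };
    Some((seconds_per_unit, epoch.trim().to_string()))
}

fn main() {
    let (scale, epoch) = parse_cf_units("hours since 1970-01-01 00:00:00").unwrap();
    // A stored time value of 48.0 means 48 * 3600 seconds after the epoch.
    assert_eq!(scale * 48.0, 172_800.0);
    assert_eq!(epoch, "1970-01-01 00:00:00");
    assert!(parse_cf_units("fortnights since 1970-01-01").is_none());
}
```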
I'm considering using this library for machine learning purposes, as I need to speed up loading random samples from a large netCDF file, which in Python is prohibitively slow. However, for the sake of speed, it would be nice if array_at, values_at, and all of the other interfaces also had methods to which you can provide an already preallocated array, and they would fill in the values there rather than allocating a new one.
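The requested pattern, sketched against an in-memory mock rather than the real Variable type (MockVariable and values_into are invented names):

```rust
// Sketch of a fill-into-preallocated-buffer API, using an in-memory mock
// in place of a real netCDF variable. Names are invented for illustration.
struct MockVariable {
    data: Vec<f32>,
}

impl MockVariable {
    /// Fill a caller-provided buffer instead of allocating a new Vec,
    /// so repeated reads can reuse the same allocation.
    fn values_into(&self, buffer: &mut [f32]) -> Result<(), String> {
        if buffer.len() != self.data.len() {
            return Err(format!(
                "buffer has {} elements, need {}",
                buffer.len(),
                self.data.len()
            ));
        }
        buffer.copy_from_slice(&self.data);
        Ok(())
    }
}

fn main() {
    let var = MockVariable { data: vec![1.0, 2.0, 3.0] };
    let mut buf = vec![0.0f32; 3]; // allocated once, reused across reads
    var.values_into(&mut buf).unwrap();
    assert_eq!(buf, [1.0, 2.0, 3.0]);
}
```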
It seems the netcdf-c developers write files outside of the build directory when building their release. These writes should be located, and patches should be sent upstream to fix this.
Alternatively, a PR can be opened on docs.rs for inclusion of libnetcdf-dev.
Add two new functions to Variable for putting and getting raw bytes.
Hi guys,
I am trying to fetch the variables from a .nc file, but I get totally different values from the expected ones. Do you have any idea how I can solve this?
let nc_file = netcdf::open(file).unwrap();
let longitude = nc_file.variable("longitude").unwrap();
let latitude = nc_file.variable("latitude").unwrap();
let time = nc_file.variable("time").unwrap();
let value = nc_file.variables().last().unwrap();
let longitude_values = longitude.values::<f32, _>(..).unwrap();
let latitude_values = latitude.values::<f32, _>(..).unwrap();
let time_values = time.values::<i32, _>(..).unwrap();
// Note: values() returns the raw stored numbers; scale_factor/add_offset
// attributes are not applied automatically.
let value_values = value.values::<f32, _>(..).unwrap();
This is the data structure of the netcdf file:
It would be nice to see a more ergonomic way for the user to set the size of a variable or the slice size. We could borrow the implementation from the hdf5 crate.
This would enable the following patterns:
var.get((.., 2..4, 5));
var.get(..); // everything
var.put((4, 5, ..));
var.put(4); // scalar put
varbuilder.shape((.., 50, 80)); // one unlimited dimension
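The core trick behind such an API is a trait that unifies the different index types into (start, count) hyperslab parts per dimension. A self-contained sketch (the Extent trait and bounds method are invented names):

```rust
use std::ops::{Range, RangeFull};

// Sketch: unify index expressions into (start, count) hyperslab parts,
// the trick behind APIs like var.get((.., 2..4, 5)). Names are illustrative.
trait Extent {
    /// Map an index expression onto a (start, count) pair for one dimension.
    fn bounds(&self, dim_len: usize) -> (usize, usize);
}

impl Extent for RangeFull {
    fn bounds(&self, dim_len: usize) -> (usize, usize) {
        (0, dim_len) // `..` selects the whole dimension
    }
}

impl Extent for Range<usize> {
    fn bounds(&self, _dim_len: usize) -> (usize, usize) {
        (self.start, self.end - self.start)
    }
}

impl Extent for usize {
    fn bounds(&self, _dim_len: usize) -> (usize, usize) {
        (*self, 1) // a bare index selects a single element
    }
}

fn main() {
    assert_eq!((..).bounds(10), (0, 10));
    assert_eq!((2..4).bounds(10), (2, 2));
    assert_eq!(5usize.bounds(10), (5, 1));
}
```

Tuples of such extents can then implement a combined trait, one element per dimension, which is essentially what the hdf5 crate does.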
Hey there! Not sure if this is an issue per se, but I'd like some advice on serializing large files to disk. Right now, I'm constructing a file on the order of 8 GB, which takes ~20 s to write to disk. Given the NVMe drive in this machine, I'd expect that to take more like 8 s. I know the standard tricks in Rust IO land of using a buffered writer don't apply, since the netcdf library is handling IO, so I'm wondering what I could do to improve performance.
Here is the code where I do this:
Thanks!
Everything’s in the title!
It’d be nice to be able to chat in real time about dev and support, imho.
When reading values from a variable, it should be possible to get a lazy-loading iterator over chunks.
Implementation details:
new method: fn values_chunked<T>(&self, start: usize, chunklen: usize) -> ChunkIterator<'_, T>

struct ChunkIterator<'a, T> {
    variable: &'a Variable,
    start: usize,
    chunklen: usize,
    buffer: Vec<T>,
}
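A self-contained sketch of the iterator logic over an in-memory slice, where the real implementation would instead issue one read into the buffer per chunk:

```rust
// Sketch of the proposed chunked reader over an in-memory slice; a real
// implementation would issue one netCDF read per chunk instead.
struct ChunkIterator<'a, T> {
    data: &'a [T],
    pos: usize,
    chunklen: usize,
}

impl<'a, T> Iterator for ChunkIterator<'a, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<Self::Item> {
        if self.pos >= self.data.len() {
            return None;
        }
        // The final chunk may be shorter than chunklen.
        let end = (self.pos + self.chunklen).min(self.data.len());
        let chunk = &self.data[self.pos..end];
        self.pos = end;
        Some(chunk)
    }
}

fn main() {
    let values = [1, 2, 3, 4, 5];
    let it = ChunkIterator { data: &values, pos: 0, chunklen: 2 };
    let chunks: Vec<_> = it.collect();
    assert_eq!(chunks, [&[1, 2][..], &[3, 4], &[5]]);
}
```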
The global mutex is in no way ideal when reading/writing from multiple threads, and could be split into several (global/per file), or replaced by RwLocks.
This requires investigating where netCDF does something thread-unsafe, and limiting the locking to those parts. We should also investigate where HDF5 might be problematic.
One could also integrate bindings to https://github.com/Parallel-NetCDF/PnetCDF, but this limits the formats to CLASSIC.
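For illustration, the std RwLock pattern that would allow concurrent readers while still serializing writers (not tied to this crate's types):

```rust
use std::sync::RwLock;
use std::thread;

fn main() {
    // A per-file RwLock: many readers may hold the lock at once,
    // while a writer gets exclusive access.
    let file_lock = RwLock::new(vec![1.0_f64, 2.0, 3.0]);

    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                // Readers do not block each other, unlike with a Mutex.
                let data = file_lock.read().unwrap();
                assert_eq!(data.len(), 3);
            });
        }
    });

    // Writes still serialize against all readers.
    file_lock.write().unwrap().push(4.0);
    assert_eq!(file_lock.read().unwrap().len(), 4);
}
```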
Hi folks! Thank you for your hard work on developing and maintaining this crate.
I've just encountered a problem on macOS 12.5.1 where the netcdf-sys crate is not correctly linking with libnetcdf. I have tried building with libnetcdf installed both via Homebrew and conda. The result is that the directory added to the linker has an extra /lib on the end. I have tracked this back to this line in the netcdf-sys build.rs.
Line 255 in bec73a9
By my intuition, /lib should already be added either by NcInfo::from_path() or directly from nc-config. Is this necessary for other platforms? Have I come across an edge case? Your input would be appreciated. :)
We can use a more up-to-date version of netcdf-c to enable later bindings from #90.
Hello, I've been having this error thrown by rust-analyzer since trying to use this crate. I'm not overly familiar with netCDF files, so I'm not exactly sure what I need to install or where to install it from. (Honestly, when I look up HDF5 I'm not sure which site to even open up, haha.)
I am trying to compile v0.8.0 on a linux docker image running on an M1 processor.
There are lots of errors like:
--> /root/.cargo/registry/src/github.com-1ecc6299db9ec823/netcdf-0.8.0/src/variable.rs:682:15
|
682 | unsafe fn from_ptr(ptr: *mut i8) -> Self {
I have tried lots of things, including updating the library to 0.8.0, but there seems to be a separate problem here.
Currently, I am setting up the dependencies as follows:
ENV CONDA_DIR=/opt/conda
ENV PATH=${CONDA_DIR}/bin:$PATH
ENV HDF5_DIR=${CONDA_DIR}
ENV NETCDF_DIR=${CONDA_DIR}
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-$(arch).sh -O ~/miniconda.sh && /bin/bash ~/miniconda.sh -b -p ${CONDA_DIR}
RUN conda install -y -c conda-forge libnetcdf=4.8.1 hdf5=1.12.1
libnetcdf=4.9.2 brings hdf5=1.14.0 which isn't supported, so I am forcing 4.8.1.
On the host machine I just installed dependencies with brew install nco
and everything just works. Linux x86_64 also works fine.
Cross-compiling on CI also brings in a different set of problems, but I'm not sure if I should open a separate issue on those.
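For context, the `*mut i8` errors usually stem from `c_char` being `i8` on x86_64 but `u8` on AArch64, which is exactly what bites when compiling on an M1. Spelling the signatures with `std::os::raw::c_char` is portable; a small sketch (this `from_ptr` is illustrative, not the crate's exact function):

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// c_char is i8 on x86_64 but u8 on aarch64; using c_char in FFI signatures
// keeps the code portable across both.
unsafe fn from_ptr(ptr: *const c_char) -> String {
    CStr::from_ptr(ptr).to_string_lossy().into_owned()
}

fn main() {
    let name = CString::new("latitude").unwrap();
    // CString::as_ptr() returns *const c_char, so this compiles on both targets.
    let s = unsafe { from_ptr(name.as_ptr()) };
    assert_eq!(s, "latitude");
}
```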
Hello again! Another quick issue/question -
Is there an equivalent to nc_sync / flushing to disk in this Rust implementation? I'm occasionally getting errors that look like
HDF5.API.H5Error: Error reading dataset /voltages
libhdf5 Stacktrace:
[1] H5FD_read: Invalid arguments to routine/Address overflowed
addr overflow, addr = 16803176, size = 3656, eoa = 16803176
when trying to read from other programs. Looking around this seems to be the result of an incomplete write.
Hi,
Since it's possible to pull data out of netCDF files as an ndarray, I wanted to ask whether there is any possibility to add/append data as an ndarray?
I use an ndarray to calculate things with my data, and it seems I have no way to put the multidimensional array back into the netCDF file. I'm pretty new to the Rust world and ndarray seems a bit complex, but I don't see any way to do that.
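For what it's worth, a standard (row-major) ndarray is one contiguous buffer, so its data can be handed to a flat put-values style API as a slice. A crate-free sketch of the indexing rule behind that:

```rust
// Sketch without the ndarray crate: a row-major n-dimensional array is a
// single contiguous buffer, so multidimensional data can be written through
// a flat put-values style API. This computes the flat offset of an index.
fn flatten_index(shape: &[usize], idx: &[usize]) -> usize {
    idx.iter().zip(shape).fold(0, |acc, (&i, &dim)| acc * dim + i)
}

fn main() {
    // In a 2x3 array, element [1, 2] sits at flat offset 1 * 3 + 2 = 5.
    assert_eq!(flatten_index(&[2, 3], &[1, 2]), 5);
    assert_eq!(flatten_index(&[4, 5, 6], &[0, 0, 0]), 0);
}
```

ndarray's `.as_slice()` exposes exactly this layout for standard-layout arrays, which is the shape a flat write call expects.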
One can set the endianness per variable in the dataset. This should be exposed to the user of this wrapper.
Hi,
I was wondering if you would be open to pull requests if I were to help write up some documentation. I figure it would be generated using rustdoc and hosted using GitHub Pages.
I'm new to Rust, but I thought I could at least do this much to help out. 😄
Hello!
I'm trying the examples/ncdump.rs bin to read files generated from Python by xarray.Dataset.to_netcdf:
https://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html
Line 71 in ca0970e
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Netcdf(-45)', netcdf-0.5.2/src/variable.rs:138:56
The -45 being NC_EBADTYPE from https://github.com/georust/netcdf/blob/71b41fe/netcdf-sys/src/netcdf_const.rs#L109, I suppose.
I added a catch_unwind to get more details: it seems to fail on types coming from numpy: fixed-length sequences of chars (dtype('S1'), dtype('S4'), etc.).
Any idea on where to go from here?
Thanks a lot!
It would be nice to have the option to include a static version of netcdf with an application. This would be a rather complicated task, as it would also require either linking to a static libhdf5 or building that from source too.
It seems that in 0.8.0 we need to pass a type parameter which implements a new trait that is only implemented for numbers.
How can I load variables with strings as their values?
Let us prepare for a new release; this one will introduce breaking changes.
Since the last release we have done:
- netCDF
- indexmap as a direct dependency
Are there any more breaking changes we would like to perform before releasing?
Platform: macOS 14.2 apple silicon
I am trying to use netcdf in a Tauri application. I'm starting with the basic print-file example, but it's running into issues at the build step with the following error:
Compiling tauri v1.5.4
error: failed to run custom build command for `hdf5-sys v0.8.1`
Caused by:
process didn't exit successfully: `/Users/nakaj/Documents/sandbox/nc_analyzer/metro_nc/src-tauri/target/debug/build/hdf5-sys-58f0def845f04fb4/build-script-build` (exit status: 101)
--- stdout
Attempting to find HDF5 via Homebrew (any version)...
--- stderr
thread 'main' panicked at /Users/nakaj/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hdf5-sys-0.8.1/build.rs:548:13:
Unable to locate HDF5 root directory and/or headers.
I checked home-brew cellar and verified I already have hdf5 v1.14.3 installed.
Even after commenting out all the example code and only importing the library, the build error persists.
Resolved the issue by enabling the "static" feature in Cargo.toml:
netcdf = { version = "0.8.3", features = ["static"] }
That said, it would be nice to have some detailed instructions or a link to them for those who don't want to use a binary blob.
Hey everyone,
we've been using this library for a while now at GEOMAR and we've noticed that apparently in some cases the (SHA1-)hash of the file we've opened with the library changes, even though we are just reading from it.
Is this due to some file-locking-flag in the file itself or is it some other problem?
We've also had our application crash for unrelated reasons a few times and ended up with corrupted netCDF-files we were reading from, which refused to be opened for one launch, but worked fine again on the next launch.
Sadly, most of the files we've been using are way too big to ever attach here, but for example this happened with a checkout of the GEBCO 2019 dataset.
Any idea why this could be happening?
Thanks in advance!
The current release of netcdf-src will fail, since it requires creating some configure files in a directory which we have excluded. A short-term solution might be to create a fork/patch of netcdf-c which fixes this; the long-term solution will be to get this included upstream and released in the next version of netcdf-c.
Thanks @fstaebler for reporting this!
The docs are not very clear on how to use the scale and offset factors. Can you please add these to the docs? Or a simple example? Thank you!
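For reference, the CF packing convention is unpacked = packed * scale_factor + add_offset, applied by the reader after fetching the raw integers (the crate does not appear to apply it automatically). A minimal sketch:

```rust
// CF packing convention: unpacked = packed * scale_factor + add_offset.
// A reader applies this after fetching the raw integers; a writer inverts it.
fn unpack(packed: i16, scale_factor: f64, add_offset: f64) -> f64 {
    f64::from(packed) * scale_factor + add_offset
}

fn main() {
    // e.g. temperature stored as i16 with scale_factor 0.01, add_offset 273.15
    let kelvin = unpack(1500, 0.01, 273.15);
    assert!((kelvin - 288.15).abs() < 1e-9);
}
```

scale_factor and add_offset are ordinary variable attributes, so they can be read with the attribute API and applied as above.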
Axect/Peroxide#16 reports a problem where the linker could not find the correct library to link on Windows 10 for an official install.
Tarpaulin can be used for measuring test coverage. This could be integrated into the Travis CI.
Paths can contain sequences which are not UTF-8 compatible, which the current implementation won't allow.
netcdf-c includes some useful functionality (nc_initialize, working with mmap-ed files) which would be useful to expose. This should ideally be behind a feature flag. We should consider which version we are linking against for dynamic linking, and only expose the available functionality.
The nc_def_var_deflate() function has an argument to enable the HDF5 shuffle filter, which sometimes helps improve compression ratios.
This function is called by the compression() method on a VariableMut, and currently the shuffle value is hard-coded to false. Could this setting be exposed? For example:
pub fn compression(&mut self, deflate_level: nc_type, shuffle: bool) -> Result<()> {}
Certain netCDF features are only available in certain versions of netCDF. netcdf-sys should parse the output of nc-config to determine which features are active, and conditionally compile bindings for them.
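A hedged sketch of the version-parsing half of this; the function name and cfg flag are invented, and a real build.rs would run nc-config via std::process::Command:

```rust
// Sketch: parse `nc-config --version` output like "netCDF 4.8.1" into a
// comparable tuple, so a build script can gate bindings on the version.
fn parse_version(output: &str) -> Option<(u32, u32, u32)> {
    let ver = output.split_whitespace().last()?;
    let mut parts = ver.split('.').map(|p| p.parse::<u32>().ok());
    Some((parts.next()??, parts.next()??, parts.next()??))
}

fn main() {
    assert_eq!(parse_version("netCDF 4.8.1"), Some((4, 8, 1)));
    // In a build script, one might then emit a cfg flag (illustrative name):
    if parse_version("netCDF 4.8.1") >= Some((4, 6, 0)) {
        println!("cargo:rustc-cfg=nc_4_6_0");
    }
}
```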
rustc 1.42.0-nightly (859764425 2020-01-07)
$ cargo test
Updating crates.io index
Downloaded libc v0.2.66
Downloaded rand v0.7.3
Downloaded structopt v0.3.7
Downloaded ndarray v0.13.0
Downloaded getrandom v0.1.14
Downloaded atty v0.2.14
Downloaded num-complex v0.2.4
Downloaded unicode-width v0.1.7
Downloaded num-traits v0.2.11
Downloaded rawpointer v0.2.1
Downloaded matrixmultiply v0.2.3
Downloaded itertools v0.8.2
Downloaded structopt-derive v0.4.0
Downloaded num-integer v0.1.42
Downloaded autocfg v1.0.0
Downloaded proc-macro2 v1.0.7
Downloaded syn v1.0.13
Downloaded unicode-segmentation v1.6.0
Downloaded proc-macro-error v0.4.4
Downloaded rustversion v1.0.1
Downloaded proc-macro-error-attr v0.4.3
Downloaded syn-mid v0.4.0
Compiling proc-macro2 v1.0.7
Compiling autocfg v1.0.0
Compiling unicode-xid v0.2.0
Compiling libc v0.2.66
Compiling syn v1.0.13
Compiling getrandom v0.1.14
Compiling cfg-if v0.1.10
Compiling ppv-lite86 v0.2.6
Compiling bitflags v1.2.1
Compiling netcdf-sys v0.2.1 (/home/norru/Projects/3rdParty/rust-netcdf/netcdf-sys)
Compiling unicode-segmentation v1.6.0
Compiling unicode-width v0.1.7
Compiling ndarray v0.13.0
Compiling either v1.5.3
Compiling rawpointer v0.2.1
Compiling vec_map v0.8.1
Compiling strsim v0.8.0
Compiling ansi_term v0.11.0
Compiling remove_dir_all v0.5.2
Compiling lazy_static v1.4.0
Compiling matrixmultiply v0.2.3
Compiling textwrap v0.11.0
Compiling itertools v0.8.2
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/sgemm_kernel.rs:223:40
|
223 | let mut ab = [_mm256_setzero_ps(); MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/sgemm_kernel.rs:418:40
|
418 | let mut cv = [_mm256_setzero_ps(); MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/sgemm_kernel.rs:473:44
|
473 | let mut ab: [[T; NR]; MR] = [[0.; NR]; MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/sgemm_kernel.rs:473:39
|
473 | let mut ab: [[T; NR]; MR] = [[0.; NR]; MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/dgemm_kernel.rs:235:40
|
235 | let mut ab = [_mm256_setzero_pd(); MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/dgemm_kernel.rs:711:40
|
711 | let mut cv = [_mm256_setzero_pd(); MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/dgemm_kernel.rs:786:44
|
786 | let mut ab: [[T; NR]; MR] = [[0.; NR]; MR];
| ^^
error: array lengths can't depend on generic parameters
--> /home/norru/.cargo/registry/src/github.com-1ecc6299db9ec823/matrixmultiply-0.2.3/src/dgemm_kernel.rs:786:39
|
786 | let mut ab: [[T; NR]; MR] = [[0.; NR]; MR];
| ^^
error: aborting due to 8 previous errors
error: could not compile `matrixmultiply`.
warning: build failed, waiting for other jobs to finish...
error: build failed
Hello, and thank you for creating this lib! 💯
As there is no documentation to refer to: I see that I can (and do) open a file using netcdf::open, but is there a way to open from a buffer or from various compressed formats? Right now I would like to have the files saved in gzip format, but I'm unsure how to load them directly into netcdf.
Could you make a release with the ndarray dependency updated to 0.14.0?
Without pinning my crate's ndarray dependency to ^0.13.0, it looks like mixing and matching versions doesn't work:
error[E0308]: try expression alternatives have incompatible types
--> io/src/lib.rs:125:9
|
125 | / $file
126 | | .variable($varname)
127 | | .ok_or(FileError::BadFormat(format!(
128 | | "'{}' variable not present",
... |
132 | | .into_dimensionality::<$dim_type>()
133 | | .map_err(|_e| FileError::BadFormat(format!("Cannot reshape '{}' array", $varname)))?
| |_______________________________________________________________________________________________^ expected struct `ArrayBase`, found struct `ndarray::ArrayBase`
|
::: io/src/file.rs:217:22
|
217 | Some(nc_read_variable!(input_file, "angle", f32, Ix1))
| --------------------------------------------- in this macro invocation
|
= note: expected struct `ArrayBase<OwnedRepr<f32>, Dim<[usize; 1]>>`
found struct `ndarray::ArrayBase<ndarray::data_repr::OwnedRepr<f32>, Dim<[usize; 1]>>`
= note: perhaps two different versions of crate `ndarray` are being used?
= note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
Hi,
The coverage badges are broken due to the org change. Could someone from georust take a look? (Logging in to codecov and coveralls and adding the repo should be enough.)
Thanks,
This issue was automatically generated. Feel free to close without ceremony if
you do not agree with re-licensing or if it is not possible for other reasons.
Respond to @cmr with any questions or concerns, or pop over to
#rust-offtopic
on IRC to discuss.
You're receiving this because someone (perhaps the project maintainer)
published a crates.io package with the license as "MIT" xor "Apache-2.0" and
the repository field pointing here.
TL;DR the Rust ecosystem is largely Apache-2.0. Being available under that
license is good for interoperation. The MIT license as an add-on can be nice
for GPLv2 projects to use your code.
The MIT license requires reproducing countless copies of the same copyright
header with different names in the copyright field, for every MIT library in
use. The Apache license does not have this drawback. However, this is not the
primary motivation for me creating these issues. The Apache license also has
protections from patent trolls and an explicit contribution licensing clause.
However, the Apache license is incompatible with GPLv2. This is why Rust is
dual-licensed as MIT/Apache (the "primary" license being Apache, MIT only for
GPLv2 compat), and doing so would be wise for this project. This also makes
this crate suitable for inclusion and unrestricted sharing in the Rust
standard distribution and other projects using dual MIT/Apache, such as my
personal ulterior motive, the Robigalia project.
Some ask, "Does this really apply to binary redistributions? Does MIT really
require reproducing the whole thing?" I'm not a lawyer, and I can't give legal
advice, but some Google Android apps include open source attributions using
this interpretation. Others also agree with
it.
But, again, the copyright notice redistribution is not the primary motivation
for the dual-licensing. It's stronger protections to licensees and better
interoperation with the wider Rust ecosystem.
To do this, get explicit approval from each contributor of copyrightable work
(as not all contributions qualify for copyright, due to not being a "creative
work", e.g. a typo fix) and then add the following to your README:
## License
Licensed under either of
* Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
* MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
### Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
additional terms or conditions.
and in your license headers, if you have them, use the following boilerplate
(based on that used in Rust):
// Copyright 2016 rust-netcdf developers
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
It's commonly asked whether license headers are required. I'm not comfortable
making an official recommendation either way, but the Apache license
recommends it in their appendix on how to use the license.
Be sure to add the relevant LICENSE-{MIT,APACHE} files. You can copy these from the Rust repo for a plain-text version.
And don't forget to update the license
metadata in your Cargo.toml
to:
license = "MIT/Apache-2.0"
I'll be going through projects which agree to be relicensed and have approval
by the necessary contributors and doing this changes, so feel free to leave
the heavy lifting to me!
To agree to relicensing, comment with:
I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to chose either at their option.
Or, if you're a contributor, you can check the box in this repo next to your
name. My scripts will pick this exact phrase up and check your checkbox, but
I'll come through and manually review this issue later as well.
@milesgranger suggested in #22 that we create an org for this repo or find another org willing to take ownership of it. I'm in support of this - my time is limited and I haven't been keeping up with Rust since I wrote the original version of this crate in 2015.
Any suggestions or opinions on the details here?
The error code is Error: Netcdf(2).
I tried opening with options but had no luck, and I can read the file from Python.
Below is the nc file(zipped to upload to github) I want to read.
2009.01-04.nc.zip
Hi,
First of all, thanks for the nice crate. It's very helpful for my task of efficiently destaggering Arakawa C-grids.
My problem is to destagger some variables in a netCDF file, write them out, and copy the rest of the input file into the output.
Since you use HashMap, all the variables, attributes, and dimensions end up in random order if I iterate over them. Although it's not a problem that's technically relevant, the files, especially the dimensions, look very confusing if you inspect the file with ncdump, for example.
So I thought of implementing this using a BTreeMap, but I'm not sure if you would accept a pull request on this. I read there are some performance issues associated with it.
What are your thoughts?