noaa-ocs-hydrography / kluster
A distributed multibeam processing system built using the Pangeo ecosystem
License: Creative Commons Zero v1.0 Universal
See dask progressbar
Should support both GUI and console apps. Should not lock up the GUI; threaded progress reporting only, I think.
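A minimal sketch of the console side, assuming the local threaded scheduler: dask's diagnostics ProgressBar prints a text bar to stdout without touching any GUI event loop. A GUI bar for the distributed client would need something custom built on dask's callback machinery; the array and computation here are illustrative only.

```python
import dask.array as da
from dask.diagnostics import ProgressBar

# console case: ProgressBar hooks the local scheduler and writes a
# text progress bar to stdout, safe for non-GUI runs
arr = da.random.random((10000, 10000), chunks=(1000, 1000))
with ProgressBar():
    total = arr.sum().compute()
print(total)
```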
Allow the user to set the chunksize for a new run; reload_data would then read the chunksize and set the attribute within Fqpr
I am not able to process any multibeam file. The software was installed carefully following the provided instructions and the data could be imported without any problem, but it fails to process. I have tried on 2 different machines (Windows) and multiple Kongsberg .all files (from EM2040, EM3002, EM302, and EM122). See the attached log for more details
Hi @ericgyounkin ,
I have noticed some unexpected behaviour when using the filter plugin functionality. It has to do with running in 'Points View' mode when we have data selected over multiple lines. Our filter tool works by creating 3d 'chips' of data; you would expect that a larger area results in more chips, however as you see from the example below, when I select similar sized subsets I can get very different numbers of chips when the subset spans multiple lines:
Subset of single line - results in 12 chips in our filter
Similar sized subset crossing 2 lines - 195 chips for our model!
I continue to try and debug but was wondering if you had any insights on why this could be happening?
Many thanks.
Look at replacing with vgrid
Should support chunking of soundings, or some kind of parallelized input with a single output object
Examine cloud-friendly formats, e.g. Cloud Optimized GeoTIFF
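A minimal sketch of writing a Cloud Optimized GeoTIFF, assuming rioxarray is installed and GDAL is new enough (>= 3.1) to have the COG driver; the grid contents, CRS, and output path here are illustrative only.

```python
import numpy as np
import rioxarray  # noqa: F401, registers the .rio accessor on xarray objects
import xarray as xr

# illustrative gridded surface with projected x/y coordinates
grid = xr.DataArray(
    np.random.rand(100, 100).astype("float32"),
    dims=("y", "x"),
    coords={"y": np.arange(100.0), "x": np.arange(100.0)},
)
grid = grid.rio.write_crs("EPSG:26910")
# GDAL's COG driver handles tiling and overviews internally
grid.rio.to_raster("surface_cog.tif", driver="COG")
```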
currently have test_fqpr_generation with tests
need to configure Travis to use this and build with tests on push
Kongsberg uses a 16bit counter field. As you log data and the counter number gets to the 16bit limit, it resets and starts over. Kluster converts multiple multibeam files into one dataset, concatenating along the time dimension. You can end up with duplicate ping counter numbers this way, if the lines happen to include one of these counter resets.
Need to fix this for reform_vars related methods to work (where we use counter and time to reform pings)
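A sketch of one way to handle the wrap, assuming a numpy array of raw counter values; unwrap_counter is a hypothetical helper, not an existing Kluster function.

```python
import numpy as np

def unwrap_counter(counter: np.ndarray, bits: int = 16) -> np.ndarray:
    """Hypothetical helper: make a wrapping counter monotonic by adding
    a full period (2**bits) after every point where the raw value drops."""
    period = 2 ** bits
    values = counter.astype(np.int64)
    wraps = np.cumsum(np.diff(values, prepend=values[0]) < 0)
    return values + wraps * period

# e.g. [65534, 65535, 0, 1] -> [65534, 65535, 65536, 65537]
print(unwrap_counter(np.array([65534, 65535, 0, 1], dtype=np.uint16)))
```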
Hi, I'm getting the following error when trying to import/convert several .all files:
Using existing local cluster client...
<Client: 'tcp://127.0.0.1:55569' processes=8 threads=8, memory=63.87 GiB>
****Running Kongsberg .all converter****
1 file(s), Using 1 chunk(s) in parallel
[ ] | 0% Completed | 1.4s
Error running action multibeam
Traceback (most recent call last):
File "HSTB\kluster\gui\kluster_worker.py", line 38, in run
File "HSTB\kluster\fqpr_actions.py", line 270, in execute_action
File "HSTB\kluster\fqpr_actions.py", line 52, in execute
File "HSTB\kluster\fqpr_convenience.py", line 112, in convert_multibeam
File "HSTB\kluster\fqpr_generation.py", line 532, in read_from_source
File "HSTB\kluster\xarray_conversion.py", line 1000, in read
File "HSTB\kluster\xarray_conversion.py", line 1552, in batch_read
File "HSTB\kluster\xarray_conversion.py", line 1314, in _batch_read_sequential
File "distributed\client.py", line 1946, in gather
return self.sync(
File "distributed\utils.py", line 310, in sync
return sync(
File "distributed\utils.py", line 364, in sync
raise exc.with_traceback(tb)
File "distributed\utils.py", line 349, in f
result[0] = yield future
File "tornado\gen.py", line 762, in run
File "distributed\client.py", line 1811, in _gather
raise exception.with_traceback(traceback)
File "HSTB\kluster\xarray_conversion.py", line 110, in _run_sequential_read
File "HSTB\kluster\fqpr_drivers.py", line 173, in sequential_read_multibeam
File "HSTB\drivers\par3.py", line 879, in sequential_read_records
File "HSTB\drivers\par3.py", line 801, in _finalize_records
IndexError: index 1681 is out of bounds for axis 0 with size 1681
OS: Windows 10
Version: 0.8.8 (same error with v0.8.4)
See Fqpr.return_cast_idx_nearestintime
Should probably include a nearest-in-time/distance type method, which has been successful in operational hydro.
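A minimal sketch of the nearest-in-time idea, assuming cast times as a numpy array of epoch seconds; nearest_cast_index is a hypothetical helper, not the existing Fqpr method, and a nearest-in-distance variant would compare ping positions against cast positions the same way.

```python
import numpy as np

def nearest_cast_index(ping_time: float, cast_times: np.ndarray) -> int:
    """Hypothetical helper: index of the sound velocity cast closest
    in time to the given ping time."""
    return int(np.argmin(np.abs(cast_times - ping_time)))

casts = np.array([1616079000.0, 1616082600.0, 1616086200.0])
print(nearest_cast_index(1616083000.0, casts))  # -> 1
```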
There might be a better option out there, but GSF would probably be better than exporting to CSV
This might already support reading/writing
https://github.com/schwehr/generic-sensor-format
Spec can be found here
https://www.leidos.com/products/ocean-marine
Integrate some of the filtering/interpolation tools in scipy/numpy (see the sketch after this list)
use the basic plots type widget to select data variables/time periods
save results to disk
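A minimal sketch of the kind of integration meant in the first item above, assuming a 1d sounding variable; medfilt and interp1d are real scipy functions, the variable names are illustrative only.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.signal import medfilt

# illustrative data: a noisy depth trace over time
times = np.linspace(0.0, 10.0, 101)
depths = np.sin(times) + np.random.normal(0.0, 0.05, times.size)

smoothed = medfilt(depths, kernel_size=5)           # despike with a median filter
resample = interp1d(times, smoothed, kind="linear")
new_times = np.linspace(0.0, 10.0, 501)
new_depths = resample(new_times)                    # resample onto a denser time base
```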
Hi, is it possible to run the UI in Linux (Ubuntu)?
I cannot build the docker image using the Dockerfile located in the root folder. It seems that the last 2 lines, besides uncommenting, need "conda run" changed to "RUN conda". But after that change, when trying to build the docker image from the Dockerfile, I am getting this error
executor failed running [conda run -n kluster_test /bin/bash -c conda -n kluster_test pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.kluster]: exit code: 1
See the log
(base) PS C:\Users\monoc\Downloads\kluster-kluster_0_8_9> docker build -t kluster089 .
[+] Building 1.4s (19/20)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.89kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/ubuntu:18.04 0.8s
=> [ 1/17] FROM docker.io/library/ubuntu:18.04@sha256:c2aa13782650aa7ade424b12008128b60034c795f25456e8eb552d0a0f447cad 0.0s
=> CACHED [ 2/17] RUN apt-get update 0.0s
=> CACHED [ 3/17] RUN apt-get install -y git 0.0s
=> CACHED [ 4/17] RUN apt-get install -y wget 0.0s
=> CACHED [ 5/17] RUN apt install libgl1-mesa-glx -y 0.0s
=> CACHED [ 6/17] RUN apt-get install ffmpeg libsm6 libxext6 -y 0.0s
=> CACHED [ 7/17] RUN adduser --disabled-password --gecos "Non-root user" --uid 1000 --gid 100 --home /ho 0.0s
=> CACHED [ 8/17] RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-py38_4.10.3-Linux-x86_64.sh -O ~/minico 0.0s
=> CACHED [ 9/17] RUN echo ". /home/eyou102/miniconda3/etc/profile.d/conda.sh" >> ~/.profile 0.0s
=> CACHED [10/17] RUN conda init bash 0.0s
=> CACHED [11/17] RUN mkdir /home/eyou102/kluster 0.0s
=> CACHED [12/17] WORKDIR /home/eyou102/kluster 0.0s
=> CACHED [13/17] RUN conda update --name base --channel defaults conda 0.0s
=> CACHED [14/17] RUN conda create -n kluster_test python=3.8.12 0.0s
=> CACHED [15/17] RUN conda install -c conda-forge qgis=3.18.3 vispy=0.9.4 pyside2=5.13.2 gdal=3.3.1 h5py python-geohash 0.0s
=> ERROR [16/17] RUN conda -n kluster_test pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.k 0.6s
[16/17] RUN conda -n kluster_test pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.kluster:
#19 0.540 ERROR conda.cli.main_run:execute(33): Subprocess for 'conda run ['/bin/bash', '-c', 'conda -n kluster_test pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.kluster']' command failed. (See above for error)
#19 0.540
#19 0.540 CommandNotFoundError: No command 'conda kluster_test'.
#19 0.540
#19 0.540
executor failed running [conda run -n kluster_test /bin/bash -c conda -n kluster_test pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.kluster]: exit code: 1
This is kind of in progress; should allow the inclusion of Kongsberg/Applanix RMS error sources.
Build a dictionary attribute that has an integer key with the value being a description of the last process run.
This would let you query the whole dataset to ensure that each sounding is up to date, and also query the processing history of a single sounding.
Important if we allow selection of soundings for processing (and not just whole lines)
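A minimal sketch of the idea, assuming a plain dict attribute; the keys, descriptions, and last_process helper are illustrative, not existing Kluster API.

```python
# illustrative processing-history attribute: integer keys in run order,
# values describing each process applied to the dataset
processing_history = {
    0: "converted from raw .all files",
    1: "orientation vectors built",
    2: "georeferenced with EPSG:26910",
}

def last_process(history: dict) -> str:
    """Hypothetical helper: description of the most recent process run."""
    return history[max(history)]

print(last_process(processing_history))  # -> georeferenced with EPSG:26910
```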
Similar to Charlene, use knowledge of things like the project EPSG to ensure a user or automated process does not process with a different EPSG
Getting this error when trying to process a .all file. Unlike #79, I can import/convert the file, but it is the next processing/georeferencing stage that fails:
****Building tx/rx vectors at time of transmit/receive****
Operating on system serial number = 275
using installation params 1616079030
Traceback (most recent call last):
File "/snap/pycharm-community/267/plugins/python-ce/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<input>", line 1, in <module>
File "/home/david/kluster/HSTB/kluster/fqpr_convenience.py", line 308, in process_multibeam
fqpr_inst.get_orientation_vectors(initial_interp=orientation_initial_interpolation, subset_time=subset_time)
File "/home/david/kluster/HSTB/kluster/fqpr_generation.py", line 2208, in get_orientation_vectors
self.generate_starter_orientation_vectors(prefixes, timestmp)
File "/home/david/kluster/HSTB/kluster/fqpr_generation.py", line 623, in generate_starter_orientation_vectors
rx_heading = abs(float(self.multibeam.xyzrph[txrx[1] + '_h'][tstmp]))
KeyError: 'rx_h'
File info:
FQPR: Fully Qualified Ping Record built by Kluster Processing
-------------------------------------------------------------
Contains:
2 sonar heads, 18000 pings, version 0.8.9
Start: Thu Mar 18 14:50:30 2021 UTC
End: Thu Mar 18 14:55:30 2021 UTC
Minimum Latitude: <omitted> Maximum Latitude: <omitted>
Minimum Longitude: <omitted> Maximum Longitude: <omitted>
Minimum Northing: Unknown Maximum Northing: Unknown
Minimum Easting: Unknown Maximum Easting: Unknown
Minimum Depth: Unknown Maximum Depth: Unknown
Current Status: converted complete
Sonar Model Number: em2040_dual_rx
Primary/Secondary System Serial Number: 275/281
Horizontal Datum: 32630
Vertical Datum: waterline
Navigation Source: Unknown
Contains SBETs: False
Sound Velocity Profiles: 1
Kongsberg has launched their new MBES for shallow water, the EM 2042. Could Kluster support it? I am attaching a sample zipped file.
There is also the just-released Kmall rev J (https://www.kongsbergdiscovery.online/sis/kmall/html/index.html), although that EM2042 file was logged with the previous datagram revision.
Replace with a progress bar that increments once for each chunk to reduce text overload
WobbleTest will give you an indication of what might be wrong with your data using cross-correlation plots
Need to provide clear guidance (e.g. "you have a 9ms latency value") instead of just showing plots
"Sonar model not understood" issue with the "em124" data
It would be useful to provide the user a way to manually select the port number on which to start a dask cluster, so that we can expose that port in the docker environment.
I had a look into dask_helpers.py but am not sure which method I should modify. It seems the code needs to split the address string into IP:PORT; the IP:PORT values can then be stored in, and retrieved from, kluster_variables.py
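A minimal sketch of the split and a fixed-port startup, assuming dask's LocalCluster; host and scheduler_port are real LocalCluster keywords, while the address string and its storage location are the assumptions from the comment above.

```python
from dask.distributed import Client, LocalCluster

# split a user-supplied "IP:PORT" string, then pin the scheduler to
# that port so it can be exposed through docker
address = "127.0.0.1:8786"  # illustrative value, e.g. kept in kluster_variables.py
ip, port = address.rsplit(":", 1)

cluster = LocalCluster(host=ip, scheduler_port=int(port))
client = Client(cluster)
print(client)  # scheduler now listens on tcp://127.0.0.1:8786
```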
Support for reading .all seabed image 89 datagram (sample amplitudes)
Support for reading .kmall MRZ reflectivity, either
backscatter calibration values - different scalar values for freq/mode
order of inputs in Transformer.transform appears to depend on the epsg provided. Needs more testing.
Need to rework beam index in soundings to be based on the ping-wise beam number. Should either:
or
All kluster modules should probably use the same logger instance, I think, since we write logs to file. Fix that across all modules.
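A minimal sketch of the shared-logger idea using the standard logging module; the logger name and file path are illustrative, not what Kluster currently uses.

```python
import logging

def get_kluster_logger(logfile: str = "kluster.log") -> logging.Logger:
    """Hypothetical helper: every module calls this and receives the same
    named logger, so a single file handler collects all messages."""
    logger = logging.getLogger("kluster")
    if not logger.handlers:  # only attach the handler once
        handler = logging.FileHandler(logfile)
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(name)s - %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

log = get_kluster_logger()
log.info("same logger instance from every module")
```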
Hello,
I have been practising importing em122 files (.all), and it was going well. So I decided to import and process a larger batch (approximately 28 files) and then kept getting this error:
C:... \anaconda3\envs\kluster_test\lib\site-packages\zarr\util.py", line 526, in check_array_shape
raise ValueError('parameter {!r}: expected array with shape {!r}, got {!r}'
ValueError: parameter 'value': expected array with shape (7530,), got (7752,)
2023-10-02 16:42:26,818 - INFO - kluster_action: no data returned from action execution
Any ideas? Many thanks!
PS the software is super cool :)
Hi @ericgyounkin,
As you're aware, we are developing algorithms/models for cleaning soundings from multibeam data. I noticed #23, and we have been mainly using GSF as a format to share data, so being able to read GSF files would be great. We are always exploring tools that might help us iterate our algorithms faster. The inputs to the models are essentially dask dataframes of x,y,z data. The two things I would be looking to try first would be:
My feeling is that it shouldn't be too hard as you are using xarray datasets, which are easily converted to dask dataframes. Any pointers you can give to help us try this out, such as code structure/architecture, how to interact with the data, or how to create some basic gui elements, would be much appreciated. One specific question I have:
Thanks
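A minimal sketch of the conversion mentioned above, assuming a chunked xarray Dataset of per-sounding variables; to_dask_dataframe is a real xarray method, the variable names are illustrative.

```python
import numpy as np
import xarray as xr

# illustrative dataset: one value per sounding for x, y, z
n = 1000
ds = xr.Dataset(
    {
        "x": ("sounding", np.random.rand(n)),
        "y": ("sounding", np.random.rand(n)),
        "z": ("sounding", -30.0 * np.random.rand(n)),
    }
).chunk({"sounding": 250})

ddf = ds.to_dask_dataframe()  # dask dataframe with columns x, y, z
print(ddf.head())
```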
Found that the order of coordinates in Transformer.transform depends on whether the CRS was built from a proj4 string or an EPSG code:
```python
from pyproj import Transformer, CRS

# CRS built from a proj4 string: expects (lon, lat) input order
manual_crs = CRS.from_proj4('+proj=utm +zone=10 +ellps=GRS80 +datum=NAD83')
georef_transformer = Transformer.from_crs(manual_crs.geodetic_crs, manual_crs)
georef_transformer.transform(40, -120)
# Out[5]: (inf, inf)
georef_transformer.transform(-120, 40)
# Out[6]: (756099.6479720183, 4432069.056784666)

# the same CRS built from an EPSG code: expects (lat, lon) input order
epsg_crs = CRS.from_epsg(26910)
georef_transformer = Transformer.from_crs(epsg_crs.geodetic_crs, epsg_crs)
georef_transformer.transform(40, -120)
# Out[9]: (756099.6479720183, 4432069.056784666)
georef_transformer.transform(-120, 40)
# Out[10]: (inf, inf)

manual_crs.to_epsg()
# Out[16]: 26910
```
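One possible workaround, assuming the goal is consistent (lon, lat) input regardless of where the CRS came from: the always_xy flag on Transformer.from_crs, which is a real pyproj option.

```python
from pyproj import Transformer, CRS

# always_xy=True forces (lon, lat) in / (easting, northing) out for
# both CRS, whether built from a proj4 string or an EPSG code
epsg_crs = CRS.from_epsg(26910)
transformer = Transformer.from_crs(epsg_crs.geodetic_crs, epsg_crs, always_xy=True)
print(transformer.transform(-120, 40))
# (756099.6479720183, 4432069.056784666)
```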
kmall MRZ datagrams contain the Ifremer quality factor, which is left at 0.0 when the computation fails. Currently that drives sonar uncertainty to zero for that sounding, which is probably the opposite of what we want. Should we artificially drive up uncertainty as a way of flagging the sounding? Should we just flag it rejected in the detection type? Are they already rejected?
incorporates:
should probably upload a test data file (I have a 9MB file that we can use)
Look at using dask + binder to compose the notebook
Still need to actually test Caris SVP based profile with svcorrect. The SoundSpeedProfile will work, but there might be some underlying issues that I haven't anticipated.
see this paper for one example I've found
Hi @ericgyounkin,
This happens intermittently. I am trying to pin down the conditions under which it happens:
I apply a filter, in this case the angle filter -25+25. When the filter finishes, the points are redrawn like so:
However if I reselect the subset, the data is correct, indicating that the data written to disk is correct.
As I said, it does not happen all the time. It happens with other filters too, so it is not related to the angle filter specifically. It does seem to be related to having multiple lines selected, as I don't think it happens if all the data is from one line. I will keep trying to get insight into the problem. Thanks.
involves working on open source CUBE + Numba, integration with Bathygrid, and expanding gridding in Kluster GUI/convenience for CUBE option
If the processed SBET is a NAD83 export, we need to save that datum to the Fqpr object to then feed georeferencing.
Currently have fqpr_visualizations FqprVisualizations that can provide plots and animations
Can animate beam vectors and vessel orientation here
gui.dialog_vesselview can display a vessel as a 3d model
Combining all these things, you basically get a vessel + multibeam animation. Look at using vispy to compose these elements, as well as the sv corrected offsets, to get the full view.
Can currently use subset_time with the main processing steps in fqpr_generation to only process a subset of the data. See fqpr_convenience reprocess_sounding_selection for an example of this. This method will not allow you to write back to disk, however. The zarr write is currently based on the data index, not the time, so a write of 100 pings will write to the first 100 pings of the zarr store. Not good; writes need to be based on time for this to work.
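A minimal sketch of a time-based write, assuming the full store's time array is available in memory; xarray's to_zarr region keyword is real, everything else (names, the searchsorted lookup) is illustrative.

```python
import numpy as np
import xarray as xr

def write_subset_by_time(store_path: str, full_times: np.ndarray, subset: xr.Dataset):
    """Hypothetical helper: locate the subset's times within the full
    zarr store and write only to that region, instead of index 0."""
    start = int(np.searchsorted(full_times, subset.time.values[0]))
    end = start + subset.time.size
    subset.to_zarr(store_path, region={"time": slice(start, end)})
```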
Need a way to generate a report describing the dataset(s) at the Fqpr level and the project level