
pylidar's People

Contributors

armstonj, danclewley, gillins, petebunting


pylidar's Issues

Plans on creating a SPDV4 driver in C/C++?

Original report by Caio Hamamura (Bitbucket: caiohamamura, GitHub: caiohamamura).


Is there an SPDV4 driver for C/C++? I'd like to use it in my algorithms to convert to and from the SPDV4 format, as I'm currently working with a custom binary implementation and would like to switch to a more standard format. SPDV4 could become the standard format if it had a solid C/C++ library, and, in the future, support for compression algorithms.
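For what it's worth, SPD V4 is an HDF5 container, so a C/C++ driver could be built directly on the HDF5 C library even before a dedicated SPDV4 API exists. As a quick illustration (a sketch, not pylidar code), a file can be checked for the HDF5 format signature by its first eight bytes:

```python
# Sketch: detect whether a file is an HDF5 container (as SPD V4 files are)
# by checking the 8-byte HDF5 format signature at the start of the file.
HDF5_SIGNATURE = b'\x89HDF\r\n\x1a\n'

def is_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, 'rb') as f:
        return f.read(8) == HDF5_SIGNATURE
```

From there, h5py in Python, or the HDF5 C API in C/C++, can open the file and walk its groups and datasets without any SPDV4-specific library.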

pylidar_index changes the number of points in the file

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Running pylidar_index on an input SPD V4 file creates an output file which has a different number of points. This does not seem good. It may be harmless, but it could also be a sign of something more sinister. Example counts for a single TLS file show the number of points change from 28363782 to 28362382, a loss of 1400 points. It is not yet clear whether the points are missing from the file, or simply that the count is incorrectly recorded, but neither seems like a good thing.

spdv4 NUMBER_OF_POINTS as float64

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Somewhere in the conversion from a Riegl rxp file to an SPD V4 file, NUMBER_OF_POINTS gets converted to a float64. This is likely to be problematic: the format defines it as uint64, so at some point in the process an implicit type conversion is going on. This is unlikely to handle precision correctly, and I suspect it is responsible for a loss of precision when the number of points is large. The same type conversion appears to occur in the conversion from .las to SPD V4, and NUMBER_OF_PULSES appears to undergo it as well.
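The precision concern is concrete: float64 has a 53-bit significand, so not every integer above 2**53 is representable, and round-tripping a large uint64 count through float64 can silently change it. Realistic point counts in the tens of millions do still fit exactly, but the type mismatch violates the format definition and leaves no safety margin. A minimal stdlib demonstration (not pylidar code):

```python
import struct

def roundtrip_float64(n):
    """Store an integer count as a float64 and read it back,
    mimicking an implicit uint64 -> float64 -> uint64 conversion."""
    return int(struct.unpack('d', struct.pack('d', float(n)))[0])

exact = 2**53      # every integer up to here survives the round trip
lossy = 2**53 + 1  # the first integer float64 cannot represent exactly
```

roundtrip_float64(lossy) comes back as 2**53, i.e. the count has silently changed by one.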

Fix exporting of extra columns to LAS

Original report by Sam Gillingham (Bitbucket: gillins, GitHub: gillins).


Currently when exporting to LAS, columns that aren't part of the LAS spec get saved as extra columns or 'attributes'. However, reading the file back in with PyLidar then gives a segmentation fault. Changing the version to LAS 1.4 seems to help a bit, but other errors emerge.

It seems that using LAStools for this isn't easy. I've tried a number of things, but we need someone with experience of the library to give us some assistance.

pylidar_index seems to double the number of points

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


Running pylidar_index on a TLS file seems to double the number of points in the resulting file. The number is almost exactly double (e.g. from 28363782 to 56726164), as reported by pylidar_info (NUMBER_OF_POINTS field). Seems weird.

When run on a tile of airborne lidar, the number of points changes only a little (e.g. from 665022 to 690044), which is a much smaller increase, but still weird.

Possible memory leak in interpolation

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


This probably should be an issue under pynninterp, but here will do for now.

Nick has some evidence to suggest a memory leak when running interpolations. He was running something Jaz set up to do interpolations across a large number of tiles (with a suitably small window size), and it grew and grew, to many Gb. This suggests that there is a memory leak, with something not being freed up between windows.

I should try to set up a suitable test loop over the guts of pynninterp, but wanted to log this here, so others might be able to look into it, too.

Cannot open same SPDV4 file twice in one program

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


It seems there is some mysterious state information being preserved between calls to lidarprocessor.doProcessing(), within the SPDV4 driver and/or the h5py libraries.

The attached program pylidarFileCloseBug.py demonstrates the problem. It runs two successive processes which operate on the same file. The intention is situations where a first pass through a file is required to calculate some numbers, then a second pass through to update a field with the resulting values. In this small example, we just read some points and do nothing.

The first pass opens the file with READ mode, and works fine. The second pass opens with UPDATE, and fails. If both passes use the same mode (either READ or UPDATE), they both succeed.

I do not know whether the problem is in h5py or pylidar, but it is not purely an h5py problem, because the attached h5py fragment pureH5py.py works fine.

It does not modify the file (as shown by the file modification times), so it is internal to the program, not in the file itself.

I was testing this on the latest tip in the repository (fbea23379b02 - dated Sun Nov 27 15:14:44 2016 +1000).

Any thoughts?
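One hypothetical mechanism that would produce exactly this symptom (purely an illustration, not pylidar's or h5py's actual code) is a driver-level cache of open handles keyed on filename but not on mode, so the second open silently reuses the first handle:

```python
class FileHandleCache:
    """Hypothetical cache keyed on filename only -- the bug is that the
    requested mode is ignored once a cached handle already exists."""
    def __init__(self):
        self._open = {}

    def open(self, fname, mode):
        if fname in self._open:
            return self._open[fname]  # stale handle with the old mode
        handle = {'name': fname, 'mode': mode}
        self._open[fname] = handle
        return handle

cache = FileHandleCache()
first = cache.open('file.spd', 'READ')
second = cache.open('file.spd', 'UPDATE')
# second['mode'] is still 'READ': the UPDATE pass got a read-only handle
```

If something along these lines is happening inside the SPDV4 driver or in h5py's open-file registry, it would explain why READ-then-UPDATE fails while two passes with the same mode succeed.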

Riegl coordinate transformation must be done as float64

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


The current code for handling the Riegl coordinate transformation does the actual calculation as float32. When the world coordinates are UTM, the numbers are large enough that there is a loss of precision at the sub-metre level. For TLS scans such as the Riegl gives us, we need precision closer to the millimetre, so this calculation should be done as float64.

I have put a hack in place to get around it for now, as simply changing the type had too many implications elsewhere in the file. However, float64 is what ought to be used, ultimately. I have had to circumvent the use of the pylmatrix code, which is a shame.
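The scale of the problem is easy to show with the standard library: float32 has a 24-bit significand, so at UTM eastings around 500,000 m its resolution is roughly 0.03 m, centimetre-level at best and nowhere near the millimetre precision a TLS scan needs. A quick check (illustrative values, not pylidar code):

```python
import struct

def to_float32(x):
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack('f', struct.pack('f', x))[0]

easting = 509123.4567  # a typical UTM easting in metres
error = abs(to_float32(easting) - easting)
# error comes out around 0.01 m: far too coarse for millimetre work
```

Doing the same round trip through 'd' (double precision) leaves the coordinate unchanged at this magnitude, which is why the transformation needs to stay in float64 end to end.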

No values in output when applying processor directly to a LAS file

Original report by Daniel Clewley (Bitbucket: danclewley, GitHub: danclewley).


I think there is a problem applying the LiDAR processor directly to LAS files: the processor example from http://pylidar.org/en/latest/processorexamples.html produces a blank image when applied directly to a LAS file, but gives the expected result when the file is converted to SPD format first.

I've tested on a couple of LAS files of my own and on apl1dr_x509000ys6945000z56_2009_ba1m6_pbrisba.las from the pylidar test data, and get the same result. For all files I generated a spatial index first using lasindex. I'm using pylidar 0.3.0 installed via conda on Python 3.5 under Linux.

It looks like the problem is that getPointsByBins always returns 0 points for the LAS file but not for the SPD version. The code below reproduces the problem by running the same function on a LAS file and on an SPD V4 file created from it. Adding a line to print zValues.shape shows no points being found for the LAS file.

import os
import numpy
from rios import cuiprogress
from pylidar import lidarprocessor
from pylidar.toolbox.translate.las2spdv4 import translate
from pylidar.lidarformats import generic

def writeImageFunc(data):
    zValues = data.input_points.getPointsByBins(colNames='Z')
    (maxPts, nRows, nCols) = zValues.shape
    nullval = 0
    if maxPts > 0:
        minZ = zValues.min(axis=0)
        stack = numpy.ma.expand_dims(minZ, axis=0)
    else:
        # no points in this block - output a block of nulls
        stack = numpy.empty((1, nRows, nCols), dtype=zValues.dtype)
        stack.fill(nullval)
    data.output_image.setData(stack)


if __name__ == '__main__':

    input_las = 'testdata/apl1dr_x509000ys6945000z56_2009_ba1m6_pbrisba.las'
    imported_spd = os.path.splitext(input_las)[0] + '.spd'

    info = generic.getLidarFileInfo(input_las)

    translate(info, input_las, imported_spd, epsg=27700,
              pulseIndex='FIRST_RETURN', spatial=True, binSize=2.0,
              buildPulses=True)

    for in_points in [input_las, imported_spd]:
        out_img = in_points + '.img'

        dataFiles = lidarprocessor.DataFiles()
        dataFiles.input_points = lidarprocessor.LidarFile(in_points,
                                                          lidarprocessor.READ)
        dataFiles.output_image = lidarprocessor.ImageFile(out_img,
                                                          lidarprocessor.CREATE)

        # LAS needs an explicit bin size; SPD V4 carries one in its header
        if os.path.splitext(in_points)[-1].lower() != '.spd':
            dataFiles.input_points.setLiDARDriverOption('BIN_SIZE', 2.0)

        controls = lidarprocessor.Controls()
        controls.setProgress(cuiprogress.CUIProgressBar())

        lidarprocessor.doProcessing(writeImageFunc, dataFiles, controls=controls)

Limits of pylidar spatial processing mode

Original report by Neil Flood (Bitbucket: neilflood, GitHub: neilflood).


(Note - also posted this to the discussion group, but I don't know if anyone sees that)

Now that a few users have gained some experience using pylidar for real tasks (it has taken a while), it is becoming apparent that there are some real limitations with the spatial processing mode of pylidar. In this mode, each block of data is constructed as an array of binned data, with the first two dimensions representing X and Y position, and the third dimension being a ragged axis holding the points which fall within each bin. The ragged axis is implemented using a numpy masked array, so that for each bin, the third dimension only extends to as many points as actually occur in that bin, and the remaining elements are masked.

The main problems arise with the time taken to construct this array, and the size it can occupy in memory. Both seem to be tied to the fact that point density can be quite variable across a scan. This means that a particular block can contain a single bin with many points, and that count is used as the size of the third dimension. If it is considerably larger than the average across the block, the whole array becomes very large in that direction, with considerable "empty" space through the rest of it. Since the block size (in terms of X and Y) is fixed across a whole call to doProcessing(), it must be small enough that the block with the largest single-bin point count can be accommodated in memory; for some scans (TLS in particular), this can force a very small block size.
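The waste can be quantified: the block array is sized by the fullest bin, so its useful fraction is only (mean points per bin) / (max points per bin). With plausible numbers (assumed here purely for illustration), almost all of the allocation is masked padding:

```python
# Hypothetical but realistic block: 200 x 200 bins, a mean of 5 points
# per bin, and one dense bin of 1000 points (as happens near the centre
# of a TLS scan).
n_rows = n_cols = 200
mean_pts, max_pts = 5, 1000

allocated = max_pts * n_rows * n_cols   # elements the ragged array reserves
used = mean_pts * n_rows * n_cols       # elements actually holding points
wasted_fraction = 1 - used / allocated  # 99.5% of the array is masked padding
```

At 8 bytes per float64 element, that single block reserves around 320 MB to hold about 1.6 MB of real data.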

I am not sure, but I think this also ties in to the run time, as the construction of this array takes considerable time, and with a small block size, I think that much of the data would have to be handled multiple times to select which points apply to a given block. Not sure about that part.

In any case, on TLS scans (where point density is highest in the centre), the memory requirements can go into many Gb, for files which are not that big in the first place. The run times can be many hours to process across such a file.

Jasmine and I put together a non-spatial processing framework, in which the non-spatial mode is used, and the user function does what it requires in spatial terms, just on-the-fly. In this case, the operation was a simple min(Z) operation, and was implemented using a numba jit function to do the per-point section. The memory requirement was tiny, and the run time went from on the order of 12 hours down to around 5 minutes.

Obviously this is a vast improvement.
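The core of that non-spatial min(Z) pass is only a few lines. The real version used numba's JIT over flat numpy arrays, but the logic is just one streaming pass keeping a running minimum per bin (the function and variable names here are illustrative):

```python
def binned_min_z(xs, ys, zs, bin_size):
    """One streaming pass over the points: running minimum Z per
    (col, row) bin. Memory is proportional to the number of occupied
    bins, not to the fullest bin, so no masked padding is allocated."""
    mins = {}
    for x, y, z in zip(xs, ys, zs):
        key = (int(x // bin_size), int(y // bin_size))
        if key not in mins or z < mins[key]:
            mins[key] = z
    return mins
```

For example, binned_min_z([0.5, 0.6, 2.5], [0.5, 0.4, 0.5], [3.0, 1.0, 2.0], 1.0) yields {(0, 0): 1.0, (2, 0): 2.0}: two occupied bins, each holding only its minimum.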

My feeling is that the spatial processing mode, with the masked arrays, was worth testing out, but I suspect it is not a feasible model for doing this sort of thing, and we are better off using the non-spatial mode and handling the spatial aspects in the particular case, for a given task. Using numba will help greatly in this.

With this in mind, I would like to suggest that we change the default control setting to non-spatial, and shelve all further plans for development on the spatial-mode parts of the code.

It also raises the question (which Nick has raised before) about what purpose the indexing of a file is actually serving for us. The TLS example mentioned above made no use of the index, and runs vastly faster, so it is not clear that indexing files is a useful thing to do.

These are just my impressions, and I recognise that there could be a lot else I don't know about. As always, I emphasize that I am just helping out here, and don't really know anything about lidar, but these results suggest that there are some important questions to consider.

Any thoughts are welcome.

Neil

Create Testsuite

Original report by Sam Gillingham (Bitbucket: gillins, GitHub: gillins).


We need data and test scripts to test the following functionality:

  • Importing LAS [see testsuite1]
  • Import LAS/LAZ with spatial index [see testsuite3]
  • Importing SPDv3 [see testsuite10]
  • Updating SPDv3 [see testsuite11]
  • Importing ASCII [see testsuite13 and testsuite14]
  • Exporting to LAS [see testsuite5]; need to add a test exporting data from e.g. Riegl to properly exercise this (non-standard columns as attributes etc.)
  • Updating an SPD file [see testsuite4]
  • adding a column [see testsuite4]
  • Interpolation [see testsuite6]
  • Reading at different bin sizes [see testsuite2]
  • Reading multiple SPD files at once [see testsuite2]
  • Building a spatial index [see testsuite1 and others]
  • overlap [see testsuite6]
  • setting pixel grid [see testsuite6]
  • read un-aligned from the spatial index [see testsuite7]
  • updating a SPD file with info from a raster [see testsuite12]
  • LAS with extra columns
  • canopy metrics [see testsuite9]
  • Importing Riegl .rxp [see testsuite8]
  • Reading and writing waveforms
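Whatever the individual tests cover, each one needs a way to compare a freshly generated file against a stored reference. A byte-level checksum (sketched below with the standard library) is the simplest option, though it only works where outputs are bit-for-bit deterministic; for HDF5 outputs, comparing header attributes and point arrays individually is likely to be more robust:

```python
import hashlib

def files_match(path_a, path_b):
    """True if two files have byte-identical contents (SHA-256 digests)."""
    def digest(path):
        h = hashlib.sha256()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(65536), b''):
                h.update(chunk)
        return h.hexdigest()
    return digest(path_a) == digest(path_b)
```

A testsuite script would then translate or update a file and assert files_match(output, reference) against a checked-in known-good copy.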

LiDARFormatDriverNotFound error

Hi,

I am trying to open a lidar file (LAZ and LAS formats), but the function spatial.readLidarPoints cannot read it. The file is not corrupted, because it opens correctly in ArcMap and QGIS.
Do you have any advice or suggestions? Thanks in advance, JL

/usr/local/lib/python3.7/dist-packages/pylidar/lidarformats/generic.py in getReaderForLiDARFile(fname, mode, controls, userClass, verbose)
    625     # none worked
    626     msg = 'Cannot open LiDAR file %s' % fname
--> 627     raise LiDARFormatDriverNotFound(msg)
    628
    629 def getLidarFileInfo(fname, verbose=False):

LiDARFormatDriverNotFound: Cannot open LiDAR file ./data/lidar_data.las
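The error itself just means that every registered driver declined the file; the selection logic is roughly the loop sketched below (a simplification, not the actual generic.py). The usual cause is a missing optional dependency, e.g. a pylidar build without the LAS driver, so that .las/.laz files have no driver that can claim them, rather than a corrupt file:

```python
class LiDARFormatDriverNotFound(Exception):
    """Raised when no registered driver can open the file."""

def get_reader(fname, drivers):
    """Try each driver factory in turn; mirrors the fall-through
    behaviour of getReaderForLiDARFile in generic.py."""
    for driver in drivers:
        try:
            return driver(fname)
        except Exception:
            continue  # this driver can't handle the file, try the next
    # none worked
    raise LiDARFormatDriverNotFound('Cannot open LiDAR file %s' % fname)
```

So the first thing to check is whether your pylidar installation was built with LAS support; generic.getLidarFileInfo on the same path should report which drivers are available.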
