rasterio / rasterio Goto Github PK

View Code? Open in Web Editor NEW

2.1K 147.0 523.0 26.35 MB

Rasterio reads and writes geospatial raster datasets

Home Page: https://rasterio.readthedocs.io/

License: Other

Python 74.34% Shell 0.38% Makefile 0.17% Cython 25.05% Dockerfile 0.06%

python gis raster gdal cli cython mapbox-satellite-oss

rasterio's Issues

Remove rasterio.insp

Deletion of the file from scripts. Before 1.0.

Iterate over blocks

It would be a very cool feature if rasterio could transparently handle iteration over blocks. Almost all my project involves raster files which are too large to load into memory.

It could maybe be something like:

with rasterio.open('input.tif') as src:
  kwargs = src.meta
  with rasterio.open('output.tif', 'w', **kwargs) as dst:
    # Iterate over raster blocks from band 1
    for block, data in src.band_blocks(1):
      # Block holds xoffset, yoffset, width, height
      data = data + 1
      dst.write_band(1, data, block)

Add band shortcut to write_band()

Like the one in reproject(). It'll boost rio_cp.

Harmonize indexing throughout the API

When working with NumPy dimensions are in y,x order but rasterio methods expect pixel coordinates in x,y order.

Imho this an unnecessary complication.

Better meta data handling in basic features examples

The current rasterio basic features example expects an proper GeoTiff file as input. The example could be easily used with e.g. JPEG files by changing:

    kwargs = src.meta
    kwargs.update(
        dtype=rasterio.uint8,
        count=1,
        compress='lzw')

to something like:

    try:
        kwargs = src.meta
    except ValueError:
        kwargs ={'driver': 'GTiff', 
                 'height': src.height,
                 'width': src.width }
    kwargs.update(
        dtype=rasterio.uint8,
        count=1,
        compress='lzw')

Add shortcut to image bands

In a nutshell: let methods accept (dataset, bidx) tuples as well as ndarrays.

This would let the polygonization example

    with rasterio.drivers():
        with rasterio.open(raster_file) as src:
            image = src.read_band(1)
        results = (
            {'properties': {'raster_val': v}, 'geometry': s}
            for i, (s, v) 
            in enumerate(
                shapes(image, transform=src.transform)))

be condensed to

    with rasterio.drivers():
        with rasterio.open(raster_file) as src:
            results = (
                {'properties': {'raster_val': v}, 'geometry': s}
                for i, (s, v) 
                in enumerate(
                    shapes((src, 1), transform=src.transform)))

This cuts out 2 copies: source band to ndarray in read_band() and ndarray to GDAL in-memory dataset within shapes().

Driver loading and nested withs

On your blog you describe using with for driver management. What happens in the following case?

with rasterio.drivers():
    with rasterio.drivers():
        pass
    with rasterio.open('rasterio/tests/data/RGB.byte.tif') as f:
        pass

Here, the code is obviously 'wrong', but withs are designed to be nested.

Maybe add a reference count to DriverManager?

Add support for GDAL band masks

See http://trac.osgeo.org/gdal/wiki/rfc15_nodatabitmask. In rasterio, the usage would be this:

# To read
with rasterio.open(file) as src:
    band = src.read_band(1)
    mask = src.read_mask(1)

# To write
with rasterio.open(file, 'w', ...) as dst:
    dst.write_mask(1, mask)

The GDAL functions to use are GDALGetMaskBand and GDALCreateMaskBand.

Allow setting of nodata

Via a creation keyword: nodata=value, applied equally to all bands.

Write a switching from osgeo.gdal doc

Nothing fancy, just an description of what rasterio offers re the GDAL data model and how to go from C style programming with osgeo.gdal to Python style with rasterio.

GDALCreate() is unsupported for many formats

And I don't see how to access a Driver's pfnCreate attribute from the C API, therefore rasterio will for now support opening only GTiffs in 'w' mode.

Write docs for tags and colormaps

GDAL's low-level warp kernel supports warping between image buffers. For rasterio, this means that we can have an interface for warping directly from one Numpy array (or object that we can get a Cython memoryview on) to another backed by existing functions in the GDAL lib.

Such a function exists in scipy.ndimage: http://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.interpolation.geometric_transform.html and it looks like it's mainly in C but one in rasterio could avoid a dependency on Scipy for a process that programmers are well used to getting from GDAL.

I'm thinking a similar interface would be good, something like

def rasterio.warp(input, output_shape=None, output=None, ...):
    ...

which is what rasterio's read_band and write_band follow.

References:

cc @hobu

v0.6 not compatible with Python 2.6: dict comprehension

When using rasterio with Python 2.6, I get the following error:

$ /usr/local/bin/python -c "import rasterio"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.6/site-packages/rasterio/__init__.py", line 8, in <module>
    from rasterio._io import RasterReader, RasterUpdater
  File "_io.pyx", line 12, in init rasterio._io (rasterio/_io.c:20093)
  File "/usr/local/lib/python2.6/site-packages/rasterio/dtypes.py", line 26
    dtype_rev = {v: k for k, v in dtype_fwd.items()}
                        ^
SyntaxError: invalid syntax

A solution could be to change this:

dtype_rev = {v: k for k, v in dtype_fwd.items()}

to:

dtype_rev = dict([(v, k) for k, v in dtype_fwd.items()])

0.5 release

Tag and upload sdist to PyPI.

Missing rasterio/_copy.pyx?

Isn't there supposed to be a rasterio/_copy.pyxin the repo?

Feature Request: Create Numpy Mask from Geometry

We would like to be able to create a numpy mask from a shapely geometry.

Here is an example that is roughly similar in intent to approach referenced in Issue #1 but different in implementation in that it uses gdal.RasterizeLayer instead of PIL ImageDraw (benchmarks comparing the two have not yet been performed):

def mask_from_geometry(ndarray_shape, geometry, projection_wkt, transform, all_touched=False):
    """
    Create a boolean numpy mask from a Shapely geometry.  Data must be projected to match prior to calling this function.
    Areas are coded as 1 inside the geometry, and 0 outside.  Invert this to use as a numpy.ma mask.

    :param ndarray_shape: (rows, cols)
    :param geometry: Shapely geometry object
    :param projection_wkt: the projection of the geometry and target numpy array, as WKT
    :param transform: the GDAL transform object representing the spatial domain of the target numpy array
    :param all_touched: if true, any pixel that touches geometry will be included in mask.
    If false, only those with centroids within or selected by Brezenhams line algorithm will be included.
    See http://www.gdal.org/gdal__alg_8h.html#adfe5e5d287d6c184aab03acbfa567cb1 for more information.
    """

    assert len(ndarray_shape) == 2

    sr = osr.SpatialReference()
    sr.ImportFromWkt(projection_wkt)
    target_ds = gdal.GetDriverByName("MEM").Create("", ndarray_shape[1], ndarray_shape[0], gdal.GDT_Byte)
    target_ds.SetProjection(sr.ExportToWkt())
    target_ds.SetGeoTransform(transform)
    temp_features = ogr.GetDriverByName("Memory").CreateDataSource("")
    lyr = temp_features.CreateLayer("poly", srs=sr)
    feature = ogr.Feature(lyr.GetLayerDefn())
    feature.SetGeometryDirectly(ogr.Geometry(wkb = geometry.wkb))
    lyr.CreateFeature(feature)
    kwargs = {}
    if all_touched:
        kwargs['options'] = ["ALL_TOUCHED=TRUE"]
    gdal.RasterizeLayer(target_ds, [1], lyr, burn_values=[1], **kwargs)
    return target_ds.GetRasterBand(1).ReadAsArray(0, 0, ndarray_shape[1], ndarray_shape[0]).astype(numpy.bool)

We would like to port this to Cython and create a pull request, but first wanted to agree on the parameters for this function. Having to pass in projection_wkt and transform parameters here feels heavy, but I cannot get GDAL to work without them here.

I see that you are not always passing transform in for _features.pyx, and see no reference to setting projection. Are you able to get by without these if you are calling GDAL from Cython instead?

0.7 release

TODO:

In drivers() log whether we're returning a dummy manager

shapes(): transform arg has no effect

Sharp-eyed people noticed this in my tech talk this morning.

Add a bounds attribute and a pixel position method

.bounds would be the extent of the entire raster in its own units (minx, miny, maxx, maxy).

.ul(x, y) proposed name would return the upper left corner coordinates of any x,y pixel.

Clip/Mask Functions

Sean,

First of all. Congrats! Second of all, I could have been using this library over a year ago! :) sigh You create wonderful Geo libraries for Python!

This Issue is mostly to remind myself (and you might be interested to) to contribute the clipping/masking functions of the CCAV Project for which I've been working on for the past year. See: https://bitbucket.org/ccaih/ccav and specifically https://bitbucket.org/ccaih/ccav/src/tip/src/ccav/utils.py

My intention is to contribute these functions to rasterio in some way as I believe they are meaningful and useful to this library and quite "fitting".

cheers
James

Remove six dependency

Currently there is a very limited used of the six module dependency.

In rasterio/__init__.py:

from six import string_types

Looking at the source code, this function from the module can be replaced by:

import sys
if sys.version_info[0] >= 3:
    string_types = str,
else:
    string_types = basestring,

Or, an alternative solution is to define:

if sys.version_info[0] >= 3:
    basestring = str

and replace instances of string_types with basestring.

0.7.1 release

merge #48
update change log, version, readme
tag
upload to pypi

Warn about rasterio.insp going away before 1.0

Put a blurb in the interpreter's banner.

Switch from nose to pytest

Mostly just a documentation issue. Pytest runs the existing tests just fine.

0.5.1 release

Tag 0.5.1
Upload to PyPI

Travis config

Pythons 2.7 and 3.3. Cython 0.20, Numpy 1.8. GDAL 1.9.2.

Add masking option to shapes()

When one is interested only in foreground feature shapes, masking out the background saves time that would be used calculating background shapes.

Check path given to rasterio.open

Make sure it exists before handing it over to GDALOpen.

Expose SRS as WKT

Via read-only crs_wkt attribute.

Allow read_band's out array and write_band's src array shapes to vary from dataset

GDALRasterIO will resample as needed.

open(): add options for block size in 'w' mode

And surface them in the meta attribute.

Color tables for greyscale image bands

Use cases: optimal display of NDVI and elevation. Should expose GDAL's own color ramp function but emphasize use of custom color tables -- mappings from pixel values to RGB (or RGBA) tuples -- and external tools for generating them.

Starting point: http://www.gdal.org/gdal_8h.html#ade5abc76e229097e6799ab58414aeaed.

Usage something like:

with rasterio.open(filename, 'w', ...) as dst:
    dst.write_band(1, array)
    dst.write_colormap(1, {0: (0, 0, 0), 1: (1, 1, 1), ...})

Add a polygonizing function that works with Fiona

GDAL's GDALPolygonize function is more like a program's main() than a library function. It reads and writes. But it can support a Python function that has no I/O and that works with numpy rasters and Fiona collections. Something like this:

import rasterio

with rasterio.open(raster_filename, 'r') as src:
    image = src.read_band(1) # read

features = rasterio.features.polygonize(image) # no I/O

with fiona.open(vector_filename, 'w', **meta) as dst:
    dst.writerecords(features) # write

Rough design: an ndarray is made into a GDAL MEM format raster. Extracted features are written to an OGR Memory format dataset and a Python iterator over them is returned.

Enhance rasterio.insp

Add rasterio module into local namespace
Allow shortcuts to show bands (by dataset and bidx)
Add some stats or histogram functions

Copy rasterio.insp to rio_insp

It's a better name.

Use a context manager to manage drivers

Instead of registering drivers on a module import and de-registering them at exit, I'm going to implement a context manager that will handle this. The canonical way to use it is like this:

import rasterio

with rasterio.drivers():
    # do stuff

but since a lot of users are going to expect to call rasterio.open() and have it just work, the rasterio reader and writer can also manager drivers in their own context if there is no global one. A global GDAL driver manager is detected by head count of registered drivers.

import rasterio

with rasterio.open(...) as f:
    # read or write

The above can be slower than the canonical example because it registers/deregisters every time rasterio.open() is called.

Expose GDALSieveFilter()

Something like

sieved = rasterio.features.sieve(image, size=5)

GDALSieveFilter() wants bands, so I'll use in-memory rasters again.

Reference: http://www.gdal.org/gdal__alg_8h.html#a33309c0a316b223bd33ae5753cc7f616

Allow handling of NoData values

I think rasterio should allow the user to get hold of NoData values.

If it can be done, it would be great to also have some way of assisting the user in testing for NoData values in NumPy arrays. This could be great because the most obvious way of doing the test (using simple equality) will fail to work on datasets using NaN as NoData value. And unfortunately NaN NoData values are quite common out there.

Change number of bands from 0.. to 1..

Rasterio counts its bands starting with 0. Even though this goes against image processing conventions (GDAL's in particular), Python indexes start with 0, and I thought it would be right to follow suit. But it looks like a future point of pain.

Recently, I read that Guido's preference for 0 is largely to make slicing neat and appealing to programmers: https://plus.google.com/115212051037621986145/posts/YTUxbXYZyfi. A rasterio file's bands aren't sliceable at all, and aren't exposed as a list of objects in any way. The only access to them is by read_band(idx) and write_band(idx, data) methods and those indexes (alone) could begin with 1.

The new pain point would appear here (we're just moving the pain around):

for i in range(src.count):
    data = src.read_band(i)

Bugs would arise from attempts to access a non-existent band 0. To help, rasterio will need raster files to expose their band indexes as an attribute. Something like this:

for i in src.band_indexes:
    data = src.read_band(i)

BTW, the reason why rasterio has no Band class is to avoid dangling bands with references to closed data sources as can happen in GDAL. GDAL gives programmers an interface that suggests they are operating directly on bands as they are stored on disk, but all rasterio operations are on image arrays in Python and then written to disk.

Implementing r+ mode

Hi @sgillies, I want to express interest in a read/write mode. I noticed that it's partially implemented. Is there something blocking this feature?

Move repo to Mapbox organization

No change to license or anything. Any objection to this, @AsgerPetersen?

Add GDAL metadata support

See http://www.gdal.org/classGDALMajorObject.html and http://www.gdal.org/classGDALMajorObject.html#a83e2cebbfa60a5e3f5cb4aa2e9b80d12.

Idea: have get_metadata(domain="foo") return a dict, have set_metadata(key, value, domain="foo") set one item. I want to avoid a rw metadata object for the same reason I'm avoiding rw band objects: dangling references.

Allow windowed reads/writes

In some cases it may be optimal to just read/write smaller parts of the band instead of the entire band.

Implement copying of raster datasets

Since we can only 'w' GTiffs (see #4), we need function to convert temp GTiffs to other formats (using CreateCopy).

Directly support multiple inputs and outputs

By lack of a mailing-list or anything i'm just going to write an idea here as an issue, i hope that's alright.

The issue (https://github.com/sgillies/rasterio/issues/6) by AsgerPeterson about iterating over the blocks provides a nice example of how to open and write block for block.

Would it be within the scope of RasterIO to extent this to support multiple input and output datasets with a similar interface? It introduces a lot of potential problems like projection differences, performance hits if the blocks aren't the same etc.

A really simple implementation i made a year ago demonstrates this idea, see for example:
http://nbviewer.ipython.org/gist/RutgerK/7941518

My implementation is full of assumptions of the datasets being identical in terms of properties, making it just a very simple wrapper to eliminate the gdal.Open() overhead.

So to modify Asger's example a bit:

with rasterio.open('source1.tif',
                   'source2.tif') as src:

    kwargs = src.meta
    with rasterio.open(('destination1.tif', 'w', **kwargs),
                       ('destination2.tif', 'w', **kwargs),
                       ('destination3.tif', 'w', **kwargs)) as dst:

        for window in src.get_chunks( 1 ):
            data = src.read_band( 1, window)
            dst['destination1'].write_band( 1, data['source1'] + data['source2'], window)
            etc.

In some cases you could achieve a similar effect by combining inputs in a VRT, but this doesn't work for outputs for example.

rasterio / rasterio Goto Github PK

rasterio's Issues

Recommend Projects

Recommend Topics

Recommend Org