Giter Club home page Giter Club logo

rasterio's Introduction

Rasterio

Rasterio reads and writes geospatial raster data.

image

image

image

image

Geographic information systems use GeoTIFF and other formats to organize and store gridded, or raster, datasets. Rasterio reads and writes these formats and provides a Python API based on N-D arrays.

Rasterio 1.4 works with Python 3.9+, Numpy 1.21+, and GDAL 3.3+. Official binary packages for Linux, macOS, and Windows with most built-in format drivers plus HDF5, netCDF, and OpenJPEG2000 are available on PyPI.

Read the documentation for more details: https://rasterio.readthedocs.io/.

Example

Here's an example of some basic features that Rasterio provides. Three bands are read from an image and averaged to produce something like a panchromatic band. This new band is then written to a new single band TIFF.

The output:

image

API Overview

Rasterio gives access to properties of a geospatial raster file.

A rasterio dataset also provides methods for getting read/write windows (like extended array slices) given georeferenced coordinates.

Rasterio CLI

Rasterio's command line interface, named "rio", is documented at cli.rst. Its rio insp command opens the hood of any raster dataset so you can poke around using Python.

$ rio insp tests/data/RGB.byte.tif
Rasterio 0.10 Interactive Inspector (Python 3.4.1)
Type "src.meta", "src.read(1)", or "help(src)" for more information.
>>> src.name
'tests/data/RGB.byte.tif'
>>> src.closed
False
>>> src.shape
(718, 791)
>>> src.crs
{'init': 'epsg:32618'}
>>> b, g, r = src.read()
>>> b
masked_array(data =
 [[-- -- -- ..., -- -- --]
 [-- -- -- ..., -- -- --]
 [-- -- -- ..., -- -- --]
 ...,
 [-- -- -- ..., -- -- --]
 [-- -- -- ..., -- -- --]
 [-- -- -- ..., -- -- --]],
             mask =
 [[ True  True  True ...,  True  True  True]
 [ True  True  True ...,  True  True  True]
 [ True  True  True ...,  True  True  True]
 ...,
 [ True  True  True ...,  True  True  True]
 [ True  True  True ...,  True  True  True]
 [ True  True  True ...,  True  True  True]],
       fill_value = 0)

>>> np.nanmin(b), np.nanmax(b), np.nanmean(b)
(0, 255, 29.94772668847656)

Rio Plugins

Rio provides the ability to create subcommands using plugins. See cli.rst for more information on building plugins.

See the plugin registry for a list of available plugins.

Installation

See docs/installation.rst

Support

The primary forum for questions about installation and usage of Rasterio is https://rasterio.groups.io/g/main. The authors and other users will answer questions when they have expertise to share and time to explain. Please take the time to craft a clear question and be patient about responses.

Please do not bring these questions to Rasterio's issue tracker, which we want to reserve for bug reports and other actionable issues.

Development and Testing

See CONTRIBUTING.rst.

Documentation

See docs/.

License

See LICENSE.txt.

Authors

The rasterio project was begun at Mapbox and was transferred to the rasterio Github organization in October 2021.

See AUTHORS.txt.

Changes

See CHANGES.txt.

Who is Using Rasterio?

See here.

rasterio's People

Contributors

adamjstewart avatar asgerpetersen avatar brendan-ward avatar cgohlke avatar colintalbert avatar dnomadb avatar drnextgis avatar eseglem avatar geowurster avatar glostis avatar groutr avatar idanmiara avatar j08lue avatar jameshiebert avatar jdmcbr avatar kapadia avatar kirill888 avatar kjordahl avatar mlinds avatar mojodna avatar mwtoews avatar perrygeo avatar rouault avatar sebastic avatar sgillies avatar simgislab avatar snorfalorpagus avatar snowman2 avatar underchemist avatar vincentsarago avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

rasterio's Issues

Directly support multiple inputs and outputs

By lack of a mailing-list or anything i'm just going to write an idea here as an issue, i hope that's alright.

The issue (https://github.com/sgillies/rasterio/issues/6) by AsgerPeterson about iterating over the blocks provides a nice example of how to open and write block for block.

Would it be within the scope of RasterIO to extent this to support multiple input and output datasets with a similar interface? It introduces a lot of potential problems like projection differences, performance hits if the blocks aren't the same etc.

A really simple implementation i made a year ago demonstrates this idea, see for example:
http://nbviewer.ipython.org/gist/RutgerK/7941518

My implementation is full of assumptions of the datasets being identical in terms of properties, making it just a very simple wrapper to eliminate the gdal.Open() overhead.

So to modify Asger's example a bit:

with rasterio.open('source1.tif',
                   'source2.tif') as src:

    kwargs = src.meta
    with rasterio.open(('destination1.tif', 'w', **kwargs),
                       ('destination2.tif', 'w', **kwargs),
                       ('destination3.tif', 'w', **kwargs)) as dst:

        for window in src.get_chunks( 1 ):
            data = src.read_band( 1, window)
            dst['destination1'].write_band( 1, data['source1'] + data['source2'], window)
            etc.

In some cases you could achieve a similar effect by combining inputs in a VRT, but this doesn't work for outputs for example.

Driver loading and nested withs

On your blog you describe using with for driver management. What happens in the following case?

with rasterio.drivers():
    with rasterio.drivers():
        pass
    with rasterio.open('rasterio/tests/data/RGB.byte.tif') as f:
        pass

Here, the code is obviously 'wrong', but withs are designed to be nested.

Maybe add a reference count to DriverManager?

Remove six dependency

Currently there is a very limited used of the six module dependency.

In rasterio/__init__.py:

from six import string_types

Looking at the source code, this function from the module can be replaced by:

import sys
if sys.version_info[0] >= 3:
    string_types = str,
else:
    string_types = basestring,

Or, an alternative solution is to define:

if sys.version_info[0] >= 3:
    basestring = str

and replace instances of string_types with basestring.

Add an image warping function

GDAL's low-level warp kernel supports warping between image buffers. For rasterio, this means that we can have an interface for warping directly from one Numpy array (or object that we can get a Cython memoryview on) to another backed by existing functions in the GDAL lib.

Such a function exists in scipy.ndimage: http://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.interpolation.geometric_transform.html and it looks like it's mainly in C but one in rasterio could avoid a dependency on Scipy for a process that programmers are well used to getting from GDAL.

I'm thinking a similar interface would be good, something like

def rasterio.warp(input, output_shape=None, output=None, ...):
    ...

which is what rasterio's read_band and write_band follow.

References:

cc @hobu

Add shortcut to image bands

In a nutshell: let methods accept (dataset, bidx) tuples as well as ndarrays.

This would let the polygonization example

    with rasterio.drivers():
        with rasterio.open(raster_file) as src:
            image = src.read_band(1)
        results = (
            {'properties': {'raster_val': v}, 'geometry': s}
            for i, (s, v) 
            in enumerate(
                shapes(image, transform=src.transform)))

be condensed to

    with rasterio.drivers():
        with rasterio.open(raster_file) as src:
            results = (
                {'properties': {'raster_val': v}, 'geometry': s}
                for i, (s, v) 
                in enumerate(
                    shapes((src, 1), transform=src.transform)))

This cuts out 2 copies: source band to ndarray in read_band() and ndarray to GDAL in-memory dataset within shapes().

Color tables for greyscale image bands

Use cases: optimal display of NDVI and elevation. Should expose GDAL's own color ramp function but emphasize use of custom color tables -- mappings from pixel values to RGB (or RGBA) tuples -- and external tools for generating them.

Starting point: http://www.gdal.org/gdal_8h.html#ade5abc76e229097e6799ab58414aeaed.

Usage something like:

with rasterio.open(filename, 'w', ...) as dst:
    dst.write_band(1, array)
    dst.write_colormap(1, {0: (0, 0, 0), 1: (1, 1, 1), ...})

Better meta data handling in basic features examples

The current rasterio basic features example expects an proper GeoTiff file as input. The example could be easily used with e.g. JPEG files by changing:

    kwargs = src.meta
    kwargs.update(
        dtype=rasterio.uint8,
        count=1,
        compress='lzw')

to something like:

    try:
        kwargs = src.meta
    except ValueError:
        kwargs ={'driver': 'GTiff', 
                 'height': src.height,
                 'width': src.width }
    kwargs.update(
        dtype=rasterio.uint8,
        count=1,
        compress='lzw')

Clip/Mask Functions

Sean,

First of all. Congrats! Second of all, I could have been using this library over a year ago! :) sigh You create wonderful Geo libraries for Python!

This Issue is mostly to remind myself (and you might be interested to) to contribute the clipping/masking functions of the CCAV Project for which I've been working on for the past year. See: https://bitbucket.org/ccaih/ccav and specifically https://bitbucket.org/ccaih/ccav/src/tip/src/ccav/utils.py

My intention is to contribute these functions to rasterio in some way as I believe they are meaningful and useful to this library and quite "fitting".

cheers
James

0.7 release

TODO:

  • Move _warp.reproject to warp.reproject.
  • Document above
  • Document optional kwargs of rasterio.drivers()
  • Tag
  • Upload to PyPI.

Add a polygonizing function that works with Fiona

GDAL's GDALPolygonize function is more like a program's main() than a library function. It reads and writes. But it can support a Python function that has no I/O and that works with numpy rasters and Fiona collections. Something like this:

import rasterio

with rasterio.open(raster_filename, 'r') as src:
    image = src.read_band(1) # read

features = rasterio.features.polygonize(image) # no I/O

with fiona.open(vector_filename, 'w', **meta) as dst:
    dst.writerecords(features) # write

Rough design: an ndarray is made into a GDAL MEM format raster. Extracted features are written to an OGR Memory format dataset and a Python iterator over them is returned.

Write a switching from osgeo.gdal doc

Nothing fancy, just an description of what rasterio offers re the GDAL data model and how to go from C style programming with osgeo.gdal to Python style with rasterio.

Iterate over blocks

It would be a very cool feature if rasterio could transparently handle iteration over blocks. Almost all my project involves raster files which are too large to load into memory.

It could maybe be something like:

with rasterio.open('input.tif') as src:
  kwargs = src.meta
  with rasterio.open('output.tif', 'w', **kwargs) as dst:
    # Iterate over raster blocks from band 1
    for block, data in src.band_blocks(1):
      # Block holds xoffset, yoffset, width, height
      data = data + 1
      dst.write_band(1, data, block)

Add masking option to shapes()

When one is interested only in foreground feature shapes, masking out the background saves time that would be used calculating background shapes.

Harmonize indexing throughout the API

When working with NumPy dimensions are in y,x order but rasterio methods expect pixel coordinates in x,y order.

Imho this an unnecessary complication.

Allow handling of NoData values

I think rasterio should allow the user to get hold of NoData values.

If it can be done, it would be great to also have some way of assisting the user in testing for NoData values in NumPy arrays. This could be great because the most obvious way of doing the test (using simple equality) will fail to work on datasets using NaN as NoData value. And unfortunately NaN NoData values are quite common out there.

Allow windowed reads/writes

In some cases it may be optimal to just read/write smaller parts of the band instead of the entire band.

See also #6

Use a context manager to manage drivers

Instead of registering drivers on a module import and de-registering them at exit, I'm going to implement a context manager that will handle this. The canonical way to use it is like this:

import rasterio

with rasterio.drivers():
    # do stuff

but since a lot of users are going to expect to call rasterio.open() and have it just work, the rasterio reader and writer can also manager drivers in their own context if there is no global one. A global GDAL driver manager is detected by head count of registered drivers.

import rasterio

with rasterio.open(...) as f:
    # read or write

The above can be slower than the canonical example because it registers/deregisters every time rasterio.open() is called.

Travis config

Pythons 2.7 and 3.3. Cython 0.20, Numpy 1.8. GDAL 1.9.2.

Enhance rasterio.insp

  • Add rasterio module into local namespace
  • Allow shortcuts to show bands (by dataset and bidx)
  • Add some stats or histogram functions

Implementing r+ mode

Hi @sgillies, I want to express interest in a read/write mode. I noticed that it's partially implemented. Is there something blocking this feature?

v0.6 not compatible with Python 2.6: dict comprehension

When using rasterio with Python 2.6, I get the following error:

$ /usr/local/bin/python -c "import rasterio"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.6/site-packages/rasterio/__init__.py", line 8, in <module>
    from rasterio._io import RasterReader, RasterUpdater
  File "_io.pyx", line 12, in init rasterio._io (rasterio/_io.c:20093)
  File "/usr/local/lib/python2.6/site-packages/rasterio/dtypes.py", line 26
    dtype_rev = {v: k for k, v in dtype_fwd.items()}
                        ^
SyntaxError: invalid syntax

A solution could be to change this:

dtype_rev = {v: k for k, v in dtype_fwd.items()}

to:

dtype_rev = dict([(v, k) for k, v in dtype_fwd.items()])

Feature Request: Create Numpy Mask from Geometry

We would like to be able to create a numpy mask from a shapely geometry.

Here is an example that is roughly similar in intent to approach referenced in Issue #1 but different in implementation in that it uses gdal.RasterizeLayer instead of PIL ImageDraw (benchmarks comparing the two have not yet been performed):

def mask_from_geometry(ndarray_shape, geometry, projection_wkt, transform, all_touched=False):
    """
    Create a boolean numpy mask from a Shapely geometry.  Data must be projected to match prior to calling this function.
    Areas are coded as 1 inside the geometry, and 0 outside.  Invert this to use as a numpy.ma mask.

    :param ndarray_shape: (rows, cols)
    :param geometry: Shapely geometry object
    :param projection_wkt: the projection of the geometry and target numpy array, as WKT
    :param transform: the GDAL transform object representing the spatial domain of the target numpy array
    :param all_touched: if true, any pixel that touches geometry will be included in mask.
    If false, only those with centroids within or selected by Brezenhams line algorithm will be included.
    See http://www.gdal.org/gdal__alg_8h.html#adfe5e5d287d6c184aab03acbfa567cb1 for more information.
    """

    assert len(ndarray_shape) == 2

    sr = osr.SpatialReference()
    sr.ImportFromWkt(projection_wkt)
    target_ds = gdal.GetDriverByName("MEM").Create("", ndarray_shape[1], ndarray_shape[0], gdal.GDT_Byte)
    target_ds.SetProjection(sr.ExportToWkt())
    target_ds.SetGeoTransform(transform)
    temp_features = ogr.GetDriverByName("Memory").CreateDataSource("")
    lyr = temp_features.CreateLayer("poly", srs=sr)
    feature = ogr.Feature(lyr.GetLayerDefn())
    feature.SetGeometryDirectly(ogr.Geometry(wkb = geometry.wkb))
    lyr.CreateFeature(feature)
    kwargs = {}
    if all_touched:
        kwargs['options'] = ["ALL_TOUCHED=TRUE"]
    gdal.RasterizeLayer(target_ds, [1], lyr, burn_values=[1], **kwargs)
    return target_ds.GetRasterBand(1).ReadAsArray(0, 0, ndarray_shape[1], ndarray_shape[0]).astype(numpy.bool)

We would like to port this to Cython and create a pull request, but first wanted to agree on the parameters for this function. Having to pass in projection_wkt and transform parameters here feels heavy, but I cannot get GDAL to work without them here.

I see that you are not always passing transform in for _features.pyx, and see no reference to setting projection. Are you able to get by without these if you are calling GDAL from Cython instead?

Change number of bands from 0.. to 1..

Rasterio counts its bands starting with 0. Even though this goes against image processing conventions (GDAL's in particular), Python indexes start with 0, and I thought it would be right to follow suit. But it looks like a future point of pain.

Recently, I read that Guido's preference for 0 is largely to make slicing neat and appealing to programmers: https://plus.google.com/115212051037621986145/posts/YTUxbXYZyfi. A rasterio file's bands aren't sliceable at all, and aren't exposed as a list of objects in any way. The only access to them is by read_band(idx) and write_band(idx, data) methods and those indexes (alone) could begin with 1.

The new pain point would appear here (we're just moving the pain around):

for i in range(src.count):
    data = src.read_band(i)

Bugs would arise from attempts to access a non-existent band 0. To help, rasterio will need raster files to expose their band indexes as an attribute. Something like this:

for i in src.band_indexes:
    data = src.read_band(i)

BTW, the reason why rasterio has no Band class is to avoid dangling bands with references to closed data sources as can happen in GDAL. GDAL gives programmers an interface that suggests they are operating directly on bands as they are stored on disk, but all rasterio operations are on image arrays in Python and then written to disk.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.