Comments (7)
I think pyogrio doesn't handle schemas, so I haven't tried.
Pyogrio doesn't support the schema keyword (as that is a fiona-specific parameter), but it certainly does support writing the different data types. Because the input for pyogrio is a geopandas DataFrame or numpy arrays, the data already has a schema, and pyogrio uses that (instead of letting the user specify it separately).
So if you ensure that your input data has an int16 column, pyogrio should pass that information through to GDAL:
import geopandas
import numpy as np
from shapely.geometry import Point
gdf = geopandas.GeoDataFrame({"col": np.array([1, 2, 3], dtype="int16"), "geometry": [Point(i, i) for i in range(3)]})
gdf
# col geometry
# 0 1 POINT (0 0)
# 1 2 POINT (1 1)
# 2 3 POINT (2 2)
gdf.to_file("test_gdb.gdb", driver="OpenFileGDB", engine="pyogrio")
I am not fully sure how to check independently whether it has actually written the correct data type to the OpenFileGDB, but ogrinfo indicates that it has:
$ ogrinfo test_gdb.gdb test_gdb
INFO: Open of `test_gdb.gdb'
using driver `OpenFileGDB' successful.
Layer name: test_gdb
Geometry: Point
Feature Count: 3
Extent: (0.000000, 0.000000) - (2.000000, 2.000000)
Layer SRS WKT:
(unknown)
FID Column = OBJECTID
Geometry Column = SHAPE
col: Integer(Int16) (0.0)
OGRFeature(test_gdb):1
col (Integer(Int16)) = 1
POINT (0 0)
OGRFeature(test_gdb):2
col (Integer(Int16)) = 2
POINT (1 1)
OGRFeature(test_gdb):3
col (Integer(Int16)) = 3
POINT (2 2)
from fiona.
And if you want to control the exact OpenFileGDB types being used by GDAL, it seems to have a creation option COLUMN_TYPES that can be passed (see https://gdal.org/drivers/vector/openfilegdb.html#layer-creation-options, but I didn't try this).
@sgillies OGR indeed only uses int32 or int64 data in its internal data model, but there is the concept of a "sub type" to annotate a type with additional information (I assume it doesn't change how the data is represented internally, still int32, but it is then used as a hint when writing): https://gdal.org/api/vector_c_api.html#_CPPv415OGRFieldSubType, and there is an OFSTInt16.
Pyogrio uses this when the input data has a bit width < 32, and based on the example above, it seems to take effect. Fiona could use this as well. It's already declared:
Lines 102 to 105 in 195579d
and it could be set the same way as is already done for the bool subtype:
Lines 1293 to 1295 in 195579d
@sgillies thanks for taking this seriously!
- Sadly, I honestly don't know the specifics on this. My only goal was to integrate shapefiles into ESRI's geodatabases (don't blame me, I am forced to 😬) without using arcpy... My knowledge on this only comes from the answer I shared in the initial issue 😞
- Whatever works for my use case would be fine for me 😉
@remi-braun Happy new year! OGR doesn't have a short integer type, only 32 and 64-bit integers. Neither does Fiona at this time, thus your layers are being constructed with 32 bit wide integer fields. I don't think there is any logic in the GDB driver to reduce the width at creation time.
Do you see different behavior if you use ogr2ogr or pyogrio?
Happy new year to you too 😉
I think pyogrio doesn't handle schemas, so I haven't tried.
What's weird is that for ESRI a short isn't an int16 but also an int32, and I don't know exactly what the :4 at the end of int32:4 means.
Note that text:255 works for GDB, so the : mechanism is in some way already handled in OpenFileGDB.
And we made it all work for Shapefiles, so other drivers handle this mechanism.
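To make the `:` notation concrete, here is a minimal sketch of how a type string such as int32:4 or text:255 splits into a base type and a width (plus an optional precision, as in float:10.2). `parse_field_type` is a hypothetical helper written for illustration; it is not Fiona's parser and says nothing about how a given driver interprets the width.

```python
def parse_field_type(spec):
    """Split 'name:width[.precision]' into (name, width, precision).

    Width and precision are None when the spec does not carry them.
    """
    name, _, rest = spec.partition(":")
    if not rest:
        return name, None, None
    width, _, precision = rest.partition(".")
    return name, int(width), int(precision) if precision else None


for spec in ("int32:4", "text:255", "float:10.2", "int16"):
    print(spec, "->", parse_field_type(spec))
```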
With pyogrio, I successfully wrote short dtypes!
However, geopandas doesn't read the input types correctly, so I had to change the type of every column (which could be time-consuming):
import geopandas as gpd
gdb_path = "my_gdb.gdb"
layer = "B1_observed_event_a"
# Read layer
observed_event = gpd.read_file(gdb_path, layer=layer)
# Set correct types
observed_event.event_type = observed_event.event_type.astype("int32")
observed_event.obj_desc = observed_event.obj_desc.astype("int16")
observed_event.notation = observed_event.notation.astype("str") # How can I set str:255 ?
observed_event.det_method = observed_event.det_method.astype("int16")
observed_event.dmg_src_id = observed_event.dmg_src_id.astype("int32")
# Write back in gdb
observed_event.to_file("my_gdb_copy.gdb", layer=layer, driver="OpenFileGDB", engine="pyogrio")
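The per-column casts above can be collapsed into a single `astype` call that takes a column-to-dtype mapping. The sketch below uses a plain pandas DataFrame with made-up values as a stand-in for the layer read from the GDB (an assumption made so the example runs without a geodatabase); the column names are taken from the snippet above.

```python
import pandas as pd

# Stand-in for the layer read from the GDB; the values are invented
observed_event = pd.DataFrame(
    {
        "event_type": [1, 2, 3],
        "obj_desc": [10, 20, 30],
        "det_method": [0, 1, 0],
        "dmg_src_id": [100, 200, 300],
    }
)

# One mapping instead of one astype call per column
target_dtypes = {
    "event_type": "int32",
    "obj_desc": "int16",
    "det_method": "int16",
    "dmg_src_id": "int32",
}
observed_event = observed_event.astype(target_dtypes)
print(observed_event.dtypes)
```

Keeping the mapping in one place also makes it easier to reuse for several layers that share the same fields.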
However, with fiona's schema, I succeeded in setting str:255 as a field type, but not with pyogrio. How can I do that?
PS: the goal of all this is to allow the GDB domains to be recognized automatically, but I don't know if it will work even with the correct types
@remi-braun I've begun working on this and have 2 questions.
- Isn't the field width specifier 4 in int32:4 specific to Shapefiles? I'd love to not have to think about this anymore if we don't have to. It's not clear to me that OGR will coerce a 4 char wide OFTInt to OFTInt16 when saving.
- What would you think about an "int16" type as in #1358? If your GPKG file has a "short" field, Fiona would report "int16", and if you write a new schema from Fiona with "int16" type, it should manifest in GPKG as a "short".