geofileops.read_file

geofileops.read_file(path: Union[str, PathLike[Any]], layer: Optional[str] = None, columns: Optional[Iterable[str]] = None, bbox=None, rows=None, where: Optional[str] = None, sql_stmt: Optional[str] = None, sql_dialect: Optional[Literal['SQLITE', 'OGRSQL']] = None, ignore_geometry: bool = False, fid_as_index: bool = False) → GeoDataFrame

Reads a file into a geopandas GeoDataFrame.

The file format is detected based on the filepath extension.

If sql_stmt is specified, the SQL query can contain the following placeholders, which are replaced automatically:

  • {geometrycolumn}: the column where the primary geometry is stored.

  • {columns_to_select_str}: if columns is not None, those columns, otherwise all columns of the layer.

  • {input_layer}: the layer name of the input layer.

Example SQL statement with placeholders:

SELECT {geometrycolumn}
      {columns_to_select_str}
  FROM "{input_layer}" layer
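To make the substitution more concrete, the sketch below mimics it locally with `str.format`. It is an illustration only, not how geofileops implements it: `preview_sql` and `read_with_sql` are hypothetical helpers, the values "geom", "name" and "parcels" are made up, and the leading comma in the rendered column list is an assumption inferred from the example statement, which has no comma after {geometrycolumn}.

```python
# Statement using the three documented placeholders.
SQL_STMT = """
    SELECT {geometrycolumn}
          {columns_to_select_str}
      FROM "{input_layer}" layer
"""


def preview_sql(sql_stmt, geometrycolumn, columns, input_layer):
    """Fill in the placeholders locally, for illustration only (hypothetical helper)."""
    # Assumption: each selected column is rendered with a leading comma, so the
    # statement stays valid without a comma after {geometrycolumn}.
    columns_str = "".join(f', layer."{name}"' for name in columns)
    return sql_stmt.format(
        geometrycolumn=geometrycolumn,
        columns_to_select_str=columns_str,
        input_layer=input_layer,
    )


def read_with_sql(path):
    """Pass the statement, placeholders included, to read_file (hypothetical helper)."""
    import geofileops as gfo  # imported lazily so the preview works without it

    return gfo.read_file(path, sql_stmt=SQL_STMT, sql_dialect="SQLITE")


if __name__ == "__main__":
    # Made-up geometry column, column list and layer name.
    print(preview_sql(SQL_STMT, "geom", ["name"], "parcels"))
```

When calling read_file itself, the statement is passed with the placeholders still in place, as in `read_with_sql`; the preview only shows roughly what the final query might look like.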

The underlying library used to read the file can be chosen with the “GFO_IO_ENGINE” environment variable. Possible values are “fiona” and “pyogrio”; the default is “pyogrio”. The “fiona” option exists as a temporary fallback for cases where “pyogrio” gives issues, so please report any issues you encounter. Support for the “fiona” engine will most likely be removed in the future.
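One way to switch engines for a limited part of a script is a small context manager around the environment variable. This is a minimal sketch: `io_engine` is a hypothetical helper, not part of geofileops, and it assumes the variable is consulted when read_file is called; if it is only read at import time, set it before importing geofileops instead.

```python
import os
from contextlib import contextmanager

ENGINE_VAR = "GFO_IO_ENGINE"
VALID_ENGINES = ("pyogrio", "fiona")


@contextmanager
def io_engine(name):
    """Temporarily set GFO_IO_ENGINE and restore the previous value afterwards."""
    if name not in VALID_ENGINES:
        raise ValueError(f"engine must be one of {VALID_ENGINES}, got {name!r}")
    previous = os.environ.get(ENGINE_VAR)
    os.environ[ENGINE_VAR] = name
    try:
        yield
    finally:
        # Restore the original state, including "not set at all".
        if previous is None:
            os.environ.pop(ENGINE_VAR, None)
        else:
            os.environ[ENGINE_VAR] = previous
```

A call such as `with io_engine("fiona"): gdf = geofileops.read_file(path)` would then fall back to “fiona” for that read only.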

Parameters:
  • path (file path) – path to the file to read from

  • layer (str, optional) – The layer to read. If None and the file contains only one layer, that layer is read; otherwise an error is raised. Defaults to None.

  • columns (Iterable[str], optional) – The (non-geometry) columns to read. They are returned in the order specified. If None, all standard columns are read. In addition to standard columns, it is also possible to specify “fid”, a unique index available in all input files. Note that the “fid” column will be aliased, e.g. to “fid_1”. Defaults to None.

  • bbox (Tuple, optional) – Return only geometries intersecting this bounding box. Defaults to None, in which case all rows are read.

  • rows (slice, optional) – Return only the rows specified. For many file formats (e.g. GeoPackage) this is slow, so using e.g. a where filter instead is recommended. Defaults to None, in which case all rows are returned.

  • where (str, optional) – where clause to filter features in the layer by attribute values. If the data source natively supports SQL, its specific SQL dialect should be used (e.g. SQLITE for SQLite and GeoPackage, or PostgreSQL). If it doesn’t, the OGRSQL WHERE syntax should be used. Note that the SQL dialect cannot be overruled for this parameter; that is only possible when you use the sql_stmt parameter. Examples: "ISO_A3 = 'CAN'", "POP_EST > 10000000 AND POP_EST < 100000000". Defaults to None.

  • sql_stmt (str, optional) – SQL statement to use. Only supported with the “pyogrio” engine. Defaults to None.

  • sql_dialect (str, optional) – SQL dialect used. Options are None, “SQLITE” or “OGRSQL”. If None, for data sources with explicit SQL support the statement is processed by the default SQL engine (e.g. for GeoPackage and SpatiaLite this is “SQLITE”). For data sources without native SQL support (e.g. .shp), the “OGRSQL” dialect is the default. If the “SQLITE” dialect is specified, spatialite reference functions can also be used. Defaults to None.

  • ignore_geometry (bool, optional) – If True, the geometry is not read or returned. Defaults to False.

  • fid_as_index (bool, optional) – If True, will use the FIDs of the features that were read as the index of the GeoDataFrame. May start at 0 or 1 depending on the driver. Defaults to False.

Raises:

ValueError – an invalid parameter value was passed.

Returns:

the data read.

Return type:

gpd.GeoDataFrame
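As a usage illustration, the sketch below combines the columns, where and bbox parameters in one call. The column names are taken from the where examples above; the function name, the coordinates and the bbox order (xmin, ymin, xmax, ymax) are assumptions, since this page only says that bbox is a Tuple.

```python
# Keyword arguments for read_file, kept separate so they are easy to inspect.
READ_KWARGS = {
    # Only these attribute columns, returned in this order.
    "columns": ["ISO_A3", "POP_EST"],
    # Attribute filter, in the dialect that matches the data source.
    "where": "POP_EST > 10000000 AND POP_EST < 100000000",
    # Assumption: the bounding box is given as (xmin, ymin, xmax, ymax).
    "bbox": (-10.0, 35.0, 30.0, 60.0),
}


def read_mid_sized_countries(path):
    """Read a filtered subset of a layer (hypothetical helper)."""
    import geofileops as gfo  # imported lazily; requires geofileops to be installed

    return gfo.read_file(path, **READ_KWARGS)
```

Filtering with where and bbox at read time is generally preferable to a rows slice, which the parameter description above notes is slow for many file formats.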