geofileops.identity

geofileops.identity(input1_path: str | os.PathLike[Any], input2_path: str | os.PathLike[Any] | None, output_path: str | os.PathLike[Any], input1_layer: str | None = None, input1_columns: list[str] | None = None, input1_columns_prefix: str = 'l1_', input2_layer: str | None = None, input2_columns: list[str] | None = None, input2_columns_prefix: str = 'l2_', output_layer: str | None = None, include_duplicates: bool = True, explodecollections: bool = False, gridsize: float = 0.0, where_post: str | None = None, nb_parallel: int | None = None, batchsize: int = -1, subdivide_coords: int = 2000, force: bool = False) → None

Calculates the pairwise identity of the two input layers.

The result is equivalent to the intersection of the two layers combined with layer 1 differenced with layer 2, i.e. the parts of layer 1 that do not overlap layer 2.
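The definition above can be illustrated with a toy model in plain Python. This is only a sketch of the concept, not how geofileops computes the result: a "geometry" is assumed to be a set of unit grid cells, and the row layout and the column names `l1_name` and `l2_name` are made up for the illustration.

```python
# Toy model (an assumption for illustration): a geometry is a frozenset of
# unit grid cells, and a layer is a list of (name, geometry) rows.
def identity(layer1, layer2):
    """Return the intersections of layer1 with layer2 plus the parts of
    layer1 that lie outside layer2."""
    rows = []
    for name1, geom1 in layer1:
        remainder = set(geom1)
        for name2, geom2 in layer2:
            overlap = geom1 & geom2
            if overlap:
                # Intersection part: carries the attributes of both layers.
                rows.append(
                    {"l1_name": name1, "l2_name": name2, "cells": frozenset(overlap)}
                )
            remainder -= geom2
        if remainder:
            # Difference part: the layer 2 columns stay empty.
            rows.append(
                {"l1_name": name1, "l2_name": None, "cells": frozenset(remainder)}
            )
    return rows


layer1 = [("A", frozenset({(0, 0), (1, 0), (2, 0)}))]
layer2 = [("X", frozenset({(2, 0), (3, 0)}))]
result = identity(layer1, layer2)
```

In this model, the part of "X" that lies outside "A" does not appear in the result: only layer 1 is fully preserved.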

If input2_path is None, a self-identity is performed. This means the 1st input layer is used for both inputs but interactions between the same rows in this layer are ignored. The output can be influenced via the include_duplicates parameter:

If True (the default), the logic explained above is applied as is. As a result, each (part of a) geometry that has an intersection is duplicated in the output with the attribute column values “switched”. Hence, each intersecting pair of geometries A and B leads to two rows in the output: one row with the attributes of A in the columns with input1_columns_prefix and the attributes of B in the columns with input2_columns_prefix, and a second row with the column values stored the other way around. Non-intersecting areas do not lead to duplicates in the output.

If False, only one of the duplicates is kept in the output with the column values only available “in one direction”.
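The effect of include_duplicates on a self-identity can be sketched with the same kind of toy model, where a geometry is assumed to be a set of unit grid cells. Which of the two duplicate rows is kept when include_duplicates is False is not stated here, so keeping the pair in row order is an assumption made for this illustration only.

```python
# Toy model (an assumption for illustration): a geometry is a frozenset of
# unit grid cells, and a layer is a list of (name, geometry) rows.
def self_identity(layer, include_duplicates=True):
    """Return rows of (name in prefix-1 columns, name in prefix-2 columns, cells)."""
    rows = []
    for i, (name_a, geom_a) in enumerate(layer):
        remainder = set(geom_a)
        for j, (name_b, geom_b) in enumerate(layer):
            if i == j:
                continue  # interactions of a row with itself are ignored
            overlap = geom_a & geom_b
            remainder -= geom_b
            if not overlap:
                continue
            # Assumption: without duplicates, keep only the pair in row order.
            if include_duplicates or i < j:
                rows.append((name_a, name_b, frozenset(overlap)))
        if remainder:
            # Non-intersecting areas never lead to duplicates.
            rows.append((name_a, None, frozenset(remainder)))
    return rows


layer = [
    ("A", frozenset({(0, 0), (1, 0)})),
    ("B", frozenset({(1, 0), (2, 0)})),
]
with_duplicates = self_identity(layer, include_duplicates=True)
without_duplicates = self_identity(layer, include_duplicates=False)
```

With duplicates, the shared cell appears twice, once as (A, B) and once as (B, A). Without duplicates it appears once.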

Remarks:

The result will contain the attribute columns from both input layers. The attribute values won’t be changed, so columns such as area will have to be recalculated manually if needed.

To speed up processing, complex input geometries are subdivided by default. For these geometries, the output geometries will contain extra collinear points where the subdividing occurred. This behaviour can be controlled via the subdivide_coords parameter.

Starting from geofileops 0.11.0, sliver polygons are removed from the output by default. Polygons are considered slivers if they are narrower than a certain tolerance. By default this tolerance is 0.001 CRS units if the CRS of the input layers is a projected CRS, and 1e-7 if it is a geographic CRS. More information, including how to change this default tolerance, can be found here: options.set_sliver_tolerance.

Parameters:

input1_path (PathLike) – the 1st input file.
input2_path (PathLike, optional) – the 2nd input file. If None, the 1st input layer is used for both inputs but interactions between the same rows in this layer will be ignored.
output_path (PathLike) – the file to write the result to
input1_layer (str, optional) – 1st input layer name. If None, input1_path should contain only one layer. Defaults to None.
input1_columns (List[str], optional) – list of columns to retain. If None, all standard columns are retained. In addition to standard columns, it is also possible to specify “fid”, a unique index available in all input files. Note that the “fid” will be aliased even if input1_columns_prefix is “”, e.g. to “fid_1”. Defaults to None.
input1_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l1_”.
input2_layer (str, optional) – 2nd input layer name. If None, input2_path should contain only one layer. Defaults to None.
input2_columns (List[str], optional) – columns to select. If None is specified, all columns are selected. As explained for input1_columns, it is also possible to specify “fid”. Defaults to None.
input2_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l2_”.
output_layer (str, optional) – output layer name. If None, the output_path stem is used. Defaults to None.
include_duplicates (bool, optional) –
Only applicable for a self-identity, i.e. an identity on a single layer (input2_path=None). True to include duplicate geometries resulting from the pairwise identity in the output, which leads to each intersection being duplicated with the attribute column values “switched”. Defaults to True.

Added in version 0.11.0.
explodecollections (bool, optional) – True to convert all multi-geometries to singular ones after the identity operation. Defaults to False.
gridsize (float, optional) – the size of the grid the coordinates of the output will be rounded to. E.g. 0.001 to keep 3 decimals. Value 0.0 doesn’t change the precision. Defaults to 0.0.
where_post (str, optional) – SQL filter to apply after all other processing, including e.g. explodecollections. It should be in sqlite syntax and spatialite reference functions can be used. Defaults to None.
nb_parallel (int | None, optional) – the number of parallel workers to use. If None, the preference set in the nb_parallel configuration option is used, which defaults to the number of CPU cores available. For more information, see options.set_nb_parallel(). Defaults to None.
batchsize (int, optional) – indicative number of rows to process per batch. A smaller batch size, possibly in combination with a smaller nb_parallel, will reduce the memory usage. Defaults to -1: (try to) determine optimal size automatically.
subdivide_coords (int, optional) – the input geometries will be subdivided into parts with about subdivide_coords coordinates during processing, which can offer a large speed-up for complex geometries. Subdividing can result in extra collinear points being added to the boundaries of the output. If 0, no subdividing is applied. Defaults to 2000.
force (bool, optional) – overwrite existing output file(s). Defaults to False.
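A minimal usage sketch that ties several of the parameters together. It assumes geofileops is installed. The file names (`parcels.gpkg`, `zones.gpkg`, `parcels_identity_zones.gpkg`), the column names and the prefixes are hypothetical, as are the helper functions `identity_kwargs` and `run_identity`. Only the keyword names come from the signature above.

```python
from pathlib import Path


def identity_kwargs(data_dir: Path) -> dict:
    """Collect keyword arguments for geofileops.identity.

    All file and column names below are hypothetical placeholders.
    """
    return {
        "input1_path": data_dir / "parcels.gpkg",
        "input2_path": data_dir / "zones.gpkg",
        "output_path": data_dir / "parcels_identity_zones.gpkg",
        "input1_columns": ["parcel_id"],  # hypothetical column
        "input1_columns_prefix": "parcel_",
        "input2_columns": ["zone_code"],  # hypothetical column
        "input2_columns_prefix": "zone_",
        "explodecollections": True,  # single-part geometries in the output
        "gridsize": 0.001,  # round output coordinates to 3 decimals
        "nb_parallel": 2,  # fewer workers reduces memory usage
        "force": True,  # overwrite an existing output file
    }


def run_identity(data_dir: Path) -> None:
    """Run the identity operation on the hypothetical files in data_dir."""
    # Imported lazily so this sketch can be loaded without geofileops installed.
    import geofileops as gfo

    gfo.identity(**identity_kwargs(data_dir))


kwargs = identity_kwargs(Path("data"))
```

Calling `run_identity(Path("data"))` would then write the result to the output file. Passing `input2_path=None` instead would perform a self-identity on the first layer.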