geofileops.join
- geofileops.join(input1_path: Path, input2_path: Path, output_path: Path, input1_on: list[str] | str, input2_on: list[str] | str, join_type: str = 'INNER', input1_layer: str | None = None, input1_columns: list[str] | None = None, input1_columns_prefix: str = 'l1_', input2_layer: str | None = None, input2_columns: list[str] | None = None, input2_columns_prefix: str = 'l2_', output_layer: str | None = None, explodecollections: bool = False, gridsize: float = 0.0, where_post: str | None = None, nb_parallel: int | None = 1, batchsize: int = -1, force: bool = False) → None
Joins two layers based on attribute values.
The output will contain the geometries of input1. The input1_on and input2_on parameters determine which geometries of input1 will be matched with input2.
- Alternative names:
Pandas: merge, join
Added in version 0.11.0.
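The matching described above can be illustrated with a plain SQL join on the key columns. The following is a minimal sketch using Python's built-in sqlite3 module, not geofileops itself; it assumes that join_type follows standard SQL INNER/LEFT semantics. The table names, column names, and values are hypothetical, and the l1_ and l2_ aliases only mimic the default column prefixes.

```python
import sqlite3

# Hypothetical in-memory tables standing in for the attribute tables of two layers.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE parcels (parcel_id INTEGER, crop TEXT);
    CREATE TABLE owners (parcel_id INTEGER, owner TEXT);
    INSERT INTO parcels VALUES (1, 'wheat'), (2, 'maize'), (3, 'barley');
    INSERT INTO owners VALUES (1, 'Ann'), (3, 'Bob'), (4, 'Cid');
    """
)


def join_rows(join_type: str) -> list[tuple]:
    """Join parcels to owners on parcel_id, aliasing columns with l1_/l2_ prefixes."""
    if join_type not in ("INNER", "LEFT"):
        raise ValueError(f"unsupported join_type: {join_type}")
    sql = f"""
        SELECT p.parcel_id AS l1_parcel_id, p.crop AS l1_crop, o.owner AS l2_owner
          FROM parcels p
          {join_type} JOIN owners o ON p.parcel_id = o.parcel_id
         ORDER BY p.parcel_id
    """
    return con.execute(sql).fetchall()


inner_rows = join_rows("INNER")  # only parcels with a matching owner
left_rows = join_rows("LEFT")    # all parcels; unmatched ones get None for l2_owner
print(inner_rows)
print(left_rows)
```

In this sketch the INNER join drops parcel 2 because no owner row has that key, while the LEFT join keeps it with an empty l2_owner value. The owner with parcel_id 4 appears in neither result, since only rows of the first table drive the output.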
- Parameters:
input1_path (PathLike) – the 1st input file
input2_path (PathLike) – the 2nd input file
output_path (PathLike) – the file to write the result to
input1_on (List[str] or str) – column(s) in the 1st input layer to join on.
input2_on (List[str] or str) – column(s) in the 2nd input layer to join on.
join_type (str, optional) – type of join: “INNER” or “LEFT”. Defaults to “INNER”.
input1_layer (str or LayerInfo, optional) – 1st input layer name. If None, input1_path should contain only one layer. Defaults to None.
input1_columns (List[str], optional) – list of columns to retain. If None, all standard columns are retained. In addition to standard columns, it is also possible to specify “fid”, a unique index available in all input files. Note that the “fid” will be aliased even if input1_columns_prefix is “”, e.g. to “fid_1”. Defaults to None.
input1_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l1_”.
input2_layer (str or LayerInfo, optional) – 2nd input layer name. If None, input2_path should contain only one layer. Defaults to None.
input2_columns (List[str], optional) – columns to select. If None is specified, all columns are selected. As explained for input1_columns, it is also possible to specify “fid”. Defaults to None.
input2_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l2_”.
output_layer (str, optional) – output layer name. If None, the output_path stem is used. Defaults to None.
explodecollections (bool, optional) – True to convert all multi-geometries to single geometries. Defaults to False.
gridsize (float, optional) – the size of the grid the coordinates of the output will be rounded to. E.g. 0.001 to keep 3 decimals. Value 0.0 doesn’t change the precision. Defaults to 0.0.
where_post (str, optional) – SQL filter to apply after all other processing, including e.g. explodecollections. It should be in sqlite syntax and spatialite reference functions can be used. Defaults to None.
nb_parallel (int | None, optional) – the number of parallel workers to use. If None, the preference set in the nb_parallel configuration option is used, which defaults to the number of CPU cores available. For more information, see options.set_nb_parallel(). Defaults to 1.
batchsize (int, optional) – indicative number of rows to process per batch. A smaller batch size, possibly in combination with a smaller nb_parallel, will reduce the memory usage. Defaults to -1: (try to) determine optimal size automatically.
force (bool, optional) – overwrite existing output file(s). Defaults to False.
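A call using the parameters listed above might look like the following sketch. The file names and the parcel_id and owner column names are hypothetical placeholders; the keyword arguments themselves are taken from the documented signature. The geofileops import is done lazily, and the join only runs if the placeholder input files actually exist, so the snippet can be loaded without any data at hand.

```python
from pathlib import Path

# Hypothetical file names; replace with your own data.
input1_path = Path("parcels.gpkg")
input2_path = Path("owners.gpkg")
output_path = Path("parcels_with_owners.gpkg")

# Keyword arguments taken from the documented signature.
# "parcel_id" and "owner" are hypothetical column names.
join_kwargs = dict(
    input1_path=input1_path,
    input2_path=input2_path,
    output_path=output_path,
    input1_on="parcel_id",
    input2_on="parcel_id",
    join_type="LEFT",
    input2_columns=["owner"],  # retain only this column from the 2nd layer
    nb_parallel=1,
    force=True,  # overwrite an existing output file
)


def run_join() -> None:
    """Run the attribute join with the arguments prepared above."""
    import geofileops as gfo  # imported lazily so the sketch loads without the package

    gfo.join(**join_kwargs)


if input1_path.exists() and input2_path.exists():
    run_join()
else:
    print("Input files not found; skipping the join.")
```

With the default input2_columns_prefix, the retained column from the second layer would be expected to appear in the output under an l2_ alias.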
See also
join_by_location(): join two layers based on their spatial relationship