geofileops.join

geofileops.join(input1_path: Path, input2_path: Path, output_path: Path, input1_on: list[str] | str, input2_on: list[str] | str, join_type: str = 'INNER', input1_layer: str | None = None, input1_columns: list[str] | None = None, input1_columns_prefix: str = 'l1_', input2_layer: str | None = None, input2_columns: list[str] | None = None, input2_columns_prefix: str = 'l2_', output_layer: str | None = None, explodecollections: bool = False, gridsize: float = 0.0, where_post: str | None = None, nb_parallel: int | None = 1, batchsize: int = -1, force: bool = False) → None

Joins two layers based on attribute values.

The output contains the geometries of input1. The input1_on and input2_on parameters determine which rows of input1 are matched with rows of input2.

Alternative names:
  • Pandas: merge, join

Added in version 0.11.0.
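
The difference between the two join types can be illustrated without any geodata. The sketch below is not geofileops' implementation: it uses Python's built-in sqlite3 module with two hypothetical tables (parcels, owners) and hypothetical key columns (parcel_id, parcel_ref), and it writes the “l1_” and “l2_” aliases by hand to mimic the default column prefixes.

```python
import sqlite3


def attribute_join(join_type: str = "INNER") -> list[tuple]:
    """Join two tiny in-memory tables on a key column (illustration only)."""
    if join_type not in ("INNER", "LEFT"):
        raise ValueError(f"unsupported join_type: {join_type}")

    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical data: parcel 3 has no matching owner row.
        conn.executescript(
            """
            CREATE TABLE parcels (parcel_id INTEGER, geom TEXT);
            CREATE TABLE owners (parcel_ref INTEGER, owner TEXT);
            INSERT INTO parcels VALUES
                (1, 'POLYGON A'), (2, 'POLYGON B'), (3, 'POLYGON C');
            INSERT INTO owners VALUES (1, 'Ann'), (2, 'Bob');
            """
        )
        # The geometry always comes from the first table; attribute columns
        # are aliased with a prefix per input.
        sql = f"""
            SELECT p.geom, p.parcel_id AS l1_parcel_id, o.owner AS l2_owner
              FROM parcels p
              {join_type} JOIN owners o ON p.parcel_id = o.parcel_ref
             ORDER BY p.parcel_id
        """
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


inner_rows = attribute_join("INNER")  # only parcels with a matching owner
left_rows = attribute_join("LEFT")    # all parcels, owner is None if unmatched
```

With “INNER”, the unmatched parcel is dropped. With “LEFT”, it is kept and its l2_owner value is NULL (None in Python).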

Parameters:
  • input1_path (PathLike) – the 1st input file

  • input2_path (PathLike) – the 2nd input file

  • output_path (PathLike) – the file to write the result to

  • input1_on (List[str] or str) – column(s) in the 1st input layer to join on.

  • input2_on (List[str] or str) – column(s) in the 2nd input layer to join on.

  • join_type (str, optional) – type of join: “INNER” or “LEFT”. Defaults to “INNER”.

  • input1_layer (str or LayerInfo, optional) – 1st input layer name. If None, input1_path should contain only one layer. Defaults to None.

  • input1_columns (List[str], optional) – list of columns to retain. If None, all standard columns are retained. In addition to standard columns, it is also possible to specify “fid”, a unique index available in all input files. Note that the “fid” will be aliased even if input1_columns_prefix is “”, e.g. to “fid_1”. Defaults to None.

  • input1_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l1_”.

  • input2_layer (str or LayerInfo, optional) – 2nd input layer name. If None, input2_path should contain only one layer. Defaults to None.

  • input2_columns (List[str], optional) – columns to select. If None is specified, all columns are selected. As explained for input1_columns, it is also possible to specify “fid”. Defaults to None.

  • input2_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l2_”.

  • output_layer (str, optional) – output layer name. If None, the output_path stem is used. Defaults to None.

  • explodecollections (bool, optional) – True to convert all multi-geometries to single geometries. Defaults to False.

  • gridsize (float, optional) – the size of the grid the coordinates of the output will be rounded to. E.g. 0.001 to keep 3 decimals. Value 0.0 doesn’t change the precision. Defaults to 0.0.

  • where_post (str, optional) – SQL filter to apply after all other processing, including e.g. explodecollections. It should be in sqlite syntax and spatialite reference functions can be used. Defaults to None.

  • nb_parallel (int | None, optional) – the number of parallel workers to use. If None, the preference set in the nb_parallel configuration option is used, which defaults to the number of CPU cores available. For more information, see options.set_nb_parallel(). Defaults to 1.

  • batchsize (int, optional) – indicative number of rows to process per batch. A smaller batch size, possibly in combination with a smaller nb_parallel, will reduce the memory usage. Defaults to -1: (try to) determine optimal size automatically.

  • force (bool, optional) – overwrite existing output file(s). Defaults to False.
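
A minimal usage sketch, based only on the signature documented above. The file names (parcels.gpkg, owners.gpkg, parcels_with_owners.gpkg), the column names (parcel_id, parcel_ref, owner) and the helper functions join_options and run_join are hypothetical and serve only as illustration.

```python
from pathlib import Path


def join_options(data_dir: Path) -> dict:
    """Collect keyword arguments for geofileops.join (hypothetical files)."""
    return {
        "input1_path": data_dir / "parcels.gpkg",   # provides the geometries
        "input2_path": data_dir / "owners.gpkg",    # provides extra attributes
        "output_path": data_dir / "parcels_with_owners.gpkg",
        "input1_on": "parcel_id",       # join column in the 1st layer
        "input2_on": "parcel_ref",      # join column in the 2nd layer
        "join_type": "LEFT",            # keep input1 rows without a match
        "input2_columns": ["owner"],    # only retain this column of input2
        "force": True,                  # overwrite an existing output file
    }


def run_join(data_dir: Path) -> Path:
    """Run the join and return the output path."""
    # Imported here so the options above can be inspected without
    # geofileops being installed.
    import geofileops as gfo

    options = join_options(data_dir)
    gfo.join(**options)
    return options["output_path"]
```

Calling run_join(Path("data")) would then write parcels_with_owners.gpkg into the data directory, assuming both input files exist there. With the default prefixes, the retained columns are aliased with “l1_” and “l2_” in the output.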

See also