geofileops.join_nearest

geofileops.join_nearest(input1_path: Union[str, os.PathLike[Any]], input2_path: Union[str, os.PathLike[Any]], output_path: Union[str, os.PathLike[Any]], nb_nearest: int, distance: Optional[float] = None, expand: Optional[bool] = None, input1_layer: Optional[str] = None, input1_columns: Optional[List[str]] = None, input1_columns_prefix: str = 'l1_', input2_layer: Optional[str] = None, input2_columns: Optional[List[str]] = None, input2_columns_prefix: str = 'l2_', output_layer: Optional[str] = None, nb_parallel: int = -1, batchsize: int = -1, force: bool = False)

Joins features of input1 with the nb_nearest closest features of input2.

In addition to the columns requested via the input*_columns parameters, the following columns will be in the output file as well:

  • pos (int): relative rank when sorted by distance. The closest item gets 1, the second closest gets 2, and so on.

  • distance (float): if the dataset is in a planar (= projected) CRS, distance will be in the unit defined by the projection (meters, feet, chains etc.). For a geographic dataset (longitude and latitude in degrees), distance will be in meters, with the most precise geodetic formulas being applied.

  • distance_crs (float): if the dataset is in a planar (= projected) CRS, distance_crs will be in the unit defined by the projection (meters, feet, chains etc.). For a geographic dataset (longitude and latitude in degrees), distance_crs will be in degrees. Only available with spatialite >= 5.1.
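
To make the pos and distance columns concrete, here is a minimal pure-Python sketch that ranks points by planar distance. It is an illustration of the column semantics only, not how geofileops computes the join: the function name rank_nearest and the l1_id / l2_id keys are hypothetical, and the inputs are plain (x, y) tuples in a projected CRS rather than files.

```python
import math


def rank_nearest(points1, points2, nb_nearest):
    """Brute-force illustration of the pos and distance output columns.

    points1 and points2 are lists of (x, y) tuples in a planar CRS.
    The l1_id / l2_id keys are hypothetical names chosen for this sketch.
    """
    rows = []
    for i, p1 in enumerate(points1):
        # Sort all candidates of input2 by their distance to the current point.
        by_distance = sorted(
            (math.dist(p1, p2), j) for j, p2 in enumerate(points2)
        )
        # Keep the nb_nearest closest ones; pos starts at 1 for the closest.
        for pos, (dist, j) in enumerate(by_distance[:nb_nearest], start=1):
            rows.append({"l1_id": i, "l2_id": j, "pos": pos, "distance": dist})
    return rows


rows = rank_nearest([(0.0, 0.0)], [(5.0, 0.0), (1.0, 0.0), (3.0, 0.0)], nb_nearest=2)
for row in rows:
    print(row)
# {'l1_id': 0, 'l2_id': 1, 'pos': 1, 'distance': 1.0}
# {'l1_id': 0, 'l2_id': 2, 'pos': 2, 'distance': 3.0}
```

Each feature of input1 yields up to nb_nearest rows, one per matched feature of input2, ordered by pos.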

Note: if spatialite version >= 5.1 is used, parameters distance and expand are mandatory.
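
The interaction between distance and expand (both described under Parameters) can be sketched in plain Python. This is an assumption-laden illustration of the documented behaviour (start at distance, double the search radius until nb_nearest items are found), not the actual spatialite or geofileops implementation; search_nearest is a hypothetical helper working on (x, y) tuples.

```python
import math


def search_nearest(origin, candidates, nb_nearest, distance, expand):
    """Return up to nb_nearest (distance, point) pairs, closest first.

    Hypothetical helper: mimics the documented semantics of distance/expand.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    radius = distance
    while True:
        # Collect every candidate within the current search radius.
        hits = sorted(
            (math.dist(origin, c), c)
            for c in candidates
            if math.dist(origin, c) <= radius
        )
        enough = len(hits) >= nb_nearest
        exhausted = len(hits) == len(candidates)
        # Without expand, a single pass at the initial radius is all we do.
        if enough or exhausted or not expand:
            return hits[:nb_nearest]
        radius *= 2  # expand=True: double the radius and search again


candidates = [(1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
print(search_nearest((0.0, 0.0), candidates, 2, 2.0, expand=False))
# [(1.0, (1.0, 0.0))]
print(search_nearest((0.0, 0.0), candidates, 2, 2.0, expand=True))
# [(1.0, (1.0, 0.0)), (3.0, (3.0, 0.0))]
```

With expand=False only one item lies within the initial distance of 2, so fewer than nb_nearest rows come back. With expand=True the radius doubles to 4 and the second item is found.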

Parameters:
  • input1_path (PathLike) – the input file to join to nb_nearest features.

  • input2_path (PathLike) – the file where nb_nearest features are looked for.

  • output_path (PathLike) – the file to write the result to.

  • nb_nearest (int) – the number of nearest features from input2 to join to each feature of input1.

  • distance (float) – maximum distance to search for the nearest items. If expand is True, this is the initial search distance, which will be gradually expanded (doubled) until nb_nearest items are found. For optimal performance, choose a value close to the distance typically needed to find nb_nearest items: if distance is too large, performance can degrade. This parameter is only relevant if spatialite version >= 5.1 is used.

  • expand (bool) – True to keep searching until nb_nearest items are found. If False, only items found within distance are returned (False is only supported if spatialite version >= 5.1 is used).

  • input1_layer (str, optional) – input layer name. Optional if the file only contains one layer. Defaults to None.

  • input1_columns (List[str], optional) – list of columns to retain. If None, all standard columns are retained. In addition to standard columns, it is also possible to specify “fid”, a unique index available in all input files. Note that “fid” will be aliased even if input1_columns_prefix is “”, e.g. to “fid_1”. Defaults to None.

  • input1_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l1_”.

  • input2_layer (str, optional) – input layer name. Optional if the file only contains one layer. Defaults to None.

  • input2_columns (List[str], optional) – columns to select. If None is specified, all columns are selected. As explained for input1_columns, it is also possible to specify “fid”. Defaults to None.

  • input2_columns_prefix (str, optional) – prefix to use in the column aliases. Defaults to “l2_”.

  • output_layer (str, optional) – output layer name. If None, the output_path stem is used. Defaults to None.

  • nb_parallel (int, optional) – the number of parallel processes to use. Defaults to -1: use all available CPUs.

  • batchsize (int, optional) – indicative number of rows to process per batch. A smaller batch size, possibly in combination with a smaller nb_parallel, will reduce the memory usage. Defaults to -1: (try to) determine optimal size automatically.

  • force (bool, optional) – overwrite existing output file(s). Defaults to False.
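
A minimal usage sketch based on the signature above. The file names and column names ("shops.gpkg", "name", "stop_id", and so on) are hypothetical placeholders, and the values for nb_nearest and distance are arbitrary. The geofileops import is kept inside the function so the snippet can be read and loaded without the package installed.

```python
from pathlib import Path

# Hypothetical input/output locations, for illustration only.
data_dir = Path("data")

join_kwargs = {
    "input1_path": data_dir / "shops.gpkg",
    "input2_path": data_dir / "bus_stops.gpkg",
    "output_path": data_dir / "shops_nearest_stops.gpkg",
    "nb_nearest": 3,
    # distance and expand are passed explicitly because they are
    # mandatory when spatialite >= 5.1 is used.
    "distance": 500.0,
    "expand": True,
    # Hypothetical column names; per the parameter descriptions they are
    # aliased with the default prefixes "l1_" and "l2_" in the output.
    "input1_columns": ["name"],
    "input2_columns": ["stop_id"],
    "force": False,
}


def run_join_nearest():
    """Run the join; requires geofileops and the input files to exist."""
    import geofileops as gfo

    gfo.join_nearest(**join_kwargs)
```

Calling run_join_nearest() would write the joined features to the output file, including the pos and distance columns described above.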