geojsplit package

Submodules

geojsplit.geojsplit module

Module for geojson streaming logic

Uses the excellent ijson library to stream a JSON document and parse it into Python objects, starting at the features item. This assumes that the geojson document has the form

{
    "type": "FeatureCollection",
    "features": [
        { ... },
        ...
    ],
    "properties
}
class geojsplit.geojsplit.GeoJSONBatchStreamer(geojson: Union[str, pathlib.Path])

Bases: object

Wrapper class around an ijson iterable that allows iteration in batches.

geojson

Filepath for a valid geojson document.

Type

Union[str, Path]

__init__(geojson: Union[str, pathlib.Path]) → None

Constructor for GeoJSONBatchStreamer

Parameters

geojson (Union[str, Path]) – Filepath for a valid geojson document. Will attempt to convert to a Path object regardless of input type.

Raises

FileNotFoundError – If geojson does not exist.
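
The constructor contract described above (coerce the input to a Path, then fail early if the file is missing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's source: the class name `PathCheckedSource` and the error message are hypothetical.

```python
from pathlib import Path
from typing import Union


class PathCheckedSource:
    """Hypothetical stand-in illustrating the documented constructor contract."""

    def __init__(self, geojson: Union[str, Path]) -> None:
        # Coerce to a Path object regardless of the input type.
        self.geojson = Path(geojson)
        # Fail early, before any streaming starts, if the file is missing.
        if not self.geojson.exists():
            raise FileNotFoundError(f"{self.geojson} does not exist")
```

Checking for the file in the constructor means a bad path surfaces immediately rather than on the first iteration of a generator.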

stream(batch: Optional[int] = None, prefix: Optional[str] = None) → Iterator[geojson.feature.FeatureCollection]

Generator method to yield batches of geojson Features in a Feature Collection.

Parameters
  • batch (Optional[int], optional) – The number of features in a single batch. Defaults to 100.

  • prefix (Optional[str], optional) – The prefix of the element of interest in the geojson document. Usually this should be ‘features.item’. Only change this if you know what you are doing. See https://github.com/ICRAR/ijson for more info. Defaults to ‘features.item’.

Yields

(Iterator[geojson.feature.FeatureCollection]) – The next batch of features wrapped in a new Feature Collection. This is a subclass of Dict containing the typical geojson attributes, including a JSON array of Features. When the underlying iterator is exhausted (StopIteration is raised), whatever has been gathered so far is yielded as a final, possibly smaller, batch so that all features are collected.
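
The batching behaviour described above, including the final partial batch, can be sketched in plain Python. This is an illustration only and not how the library is implemented: the function name `batch_features` is hypothetical, it takes an in-memory iterable rather than an ijson stream, and it yields plain dicts instead of geojson.feature.FeatureCollection objects.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator


def batch_features(
    features: Iterable[Dict[str, Any]], batch: int = 100
) -> Iterator[Dict[str, Any]]:
    """Yield FeatureCollection-shaped dicts holding at most `batch` features."""
    if batch < 1:
        raise ValueError("batch must be a positive integer")
    iterator = iter(features)
    while True:
        # Pull up to `batch` items; the last chunk may be shorter.
        chunk = list(islice(iterator, batch))
        if not chunk:
            return  # Source exhausted and nothing left to flush.
        yield {"type": "FeatureCollection", "features": chunk}


# Seven dummy point features, split into batches of three.
points = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [i, i]},
        "properties": {},
    }
    for i in range(7)
]
sizes = [len(fc["features"]) for fc in batch_features(points, batch=3)]
print(sizes)  # [3, 3, 1]
```

The trailing batch of one feature corresponds to the "whatever has been gathered so far" case in the Yields description.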

Module contents