tebetebe: routing analysis with OSM

tebetebe is a Python API to compile, serve, and query routable networks using the Open Source Routing Machine (OSRM) and OpenStreetMap data, and provides a framework for routing analysis using these networks.

Package Overview

tebetebe makes it easy to compile a custom routing Scenario by abstracting OSRM executables into a pythonic API and provides a framework for routing analysis. With the range of customization available in the .lua configuration scripts, specific, accurate and readable transportation models can be developed and analyzed.

tebetebe also simplifies the routing analysis pipeline by pulling data live from OSM via the Overpass API and by providing user-contributed classes that automate common routing analysis tasks, such as isochrones.

Installation

1. Install osrm-backend binaries

2. (option 1) Install from pip

pip3 install tebetebe

2. (option 2) Clone tebetebe source code and install

git clone https://github.com/1papaya/tebetebe.git
cd tebetebe
python3 setup.py install

note: tebetebe will not work on Windows machines
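
Because step 1 is a prerequisite for everything else, it can be worth confirming that the osrm-backend executables are on the PATH before running the examples. The following is a minimal sketch using only the standard library; the helper name find_osrm_binaries is hypothetical and not part of tebetebe, and the list of executables is taken from the tb.OSRM wrapper documented further down.

```python
import shutil

# Executables wrapped by tebetebe's OSRM class (see API Documentation)
OSRM_BINARIES = ("osrm-extract", "osrm-partition", "osrm-customize",
                 "osrm-contract", "osrm-routed")


def find_osrm_binaries(names=OSRM_BINARIES):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_osrm_binaries().items():
        print("{:<16} {}".format(name, path or "NOT FOUND"))
```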

Examples

Simple Scenario

This example uses the eSwatini GeoFabrik extract and the default walking profile to calculate a walking route between Simunye and Mbabane.

from tebetebe.profiles import foot
import tebetebe as tb

tb_env = tb.Environment(tmp_dir="./tmp/simple_scenario")

mbabane = (31.1367, -26.3054)
simunye = (31.9274, -26.2108)

## Initialize scenario using eSwatini GeoFabrik extract and default foot profile
scenario = tb_env.Scenario("./tmp/swaziland-latest.osm.pbf", foot)

## Compile and run scenario
with scenario() as api:
    ## Query OSRM HTTP `simple_route` service to calculate route
    route = api.simple_route(simunye, mbabane)

    duration = route['routes'][0]['duration'] / 60
    distance = route['routes'][0]['distance'] / 1000

    print("Walking from Simunye to Mbabane")
    print(" Duration: {:.2f} minutes".format(duration))
    print(" Distance: {:.2f} km".format(distance))
Simple Scenario output
 [   INFO] swaziland-latest_foot: Compiling scenario (MLD)
 [WARNING] swaziland-latest_foot: Default foot profile may not be accurate for your use case
 [   INFO] swaziland-latest_foot: Initializing scenario
 [   INFO] swaziland-latest_foot: Ready for requests
 Walking from Simunye to Mbabane
  Duration: 1420.70 minutes
  Distance: 118.39 km
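
The unit conversions in the example (seconds to minutes, metres to kilometres) also allow a quick sanity check on the implied travel speed. The sketch below is an illustration only: summarize_route is a hypothetical helper, and the sample response is hand-built with values chosen to reproduce the printed output rather than taken from a real query.

```python
def summarize_route(response):
    """Return (minutes, km, km/h) for the first route of an OSRM route response."""
    route = response["routes"][0]
    minutes = route["duration"] / 60   # OSRM reports duration in seconds
    km = route["distance"] / 1000      # OSRM reports distance in metres
    speed = km / (minutes / 60) if minutes else 0.0
    return minutes, km, speed


# Hypothetical response trimmed to the two fields used here; the numbers are
# chosen to match the output printed above, not fetched from a server.
sample = {"routes": [{"duration": 85242.0, "distance": 118390.0}]}
minutes, km, speed = summarize_route(sample)
print("{:.2f} min, {:.2f} km, {:.1f} km/h".format(minutes, km, speed))
```

With these sample values the implied average speed works out to roughly 5 km/h, which is a plausible walking pace.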

Scenario Comparison

By comparing origin:destination routes between different scenarios, we gain insight into how changing a transportation scenario affects route patterns.

Here, we compare routes calculated by two different Scenarios: a “normal” walking scenario, and a “flood” scenario, to understand the impact of a flooding event on access to local schools in eSwatini. The route network is taken from the eSwatini GeoFabrik extract, and the homesteads (origins) and schools (destinations) are downloaded from the Overpass API.

The “normal” and “flood” scenarios in this case both use the default walk profile, except that the “flood” scenario treats nodes with ford=yes (river crossings) and ways with flood_prone=yes as barriers. Check out their source in the GitHub repo.
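
The barrier rule described above can be expressed compactly. The sketch below restates it in Python purely for illustration; the actual rules live in the .lua profiles, and the function name is_flood_barrier is hypothetical.

```python
def is_flood_barrier(element_type, tags):
    """Return True if an OSM element would block routing in the flood scenario.

    Illustrative Python only: the real logic is implemented in the Lua profiles.
    """
    if element_type == "node":
        return tags.get("ford") == "yes"
    if element_type == "way":
        return tags.get("flood_prone") == "yes"
    return False


print(is_flood_barrier("node", {"ford": "yes"}))         # river crossing
print(is_flood_barrier("way", {"highway": "footway"}))   # ordinary path
```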

from tebetebe.analysis import ParallelScenarios, RouteComparison
import tebetebe as tb

tb_env = tb.Environment(tmp_dir="./tmp/scenario_comparison")

## Get route network from GeoFabrik extract and POIs from Overpass API
## Load the route origins (homesteads) and dests (schools) 8km around river crossing
crossing_node_id = 6750683291
highways = tb_env.OSMDataset("./tmp/swaziland-latest.osm.pbf", name="swazi")
homesteads = tb_env.POIDataset.from_overpass("""node({})->.crossing;
                                                 ( way(around.crossing:8000)["building"];);
                                                out center;""".format(crossing_node_id),
                                             name="homesteads")
schools = tb_env.POIDataset.from_overpass("""node({})->.crossing;
                                                 ( way(around.crossing:8000)["amenity"="school"];);
                                                out center;""".format(crossing_node_id),
                                          name="schools")

## Normal & Flood scenarios. Pass along an extra parameter to osrm-routed
## so that the max duration table size is enough for all origin:dest pairs
routed_args = {"max_table_size": len(homesteads) * len(schools) + 1}

normal = tb_env.Scenario(highways, "./profiles/walk_normal.lua",
                         routed_args=routed_args, name="normal")
flood = tb_env.Scenario(highways, "./profiles/walk_flood.lua",
                        routed_args=routed_args, name="flood")

## Run normal and flood scenarios in parallel
parallel_scenarios = ParallelScenarios(normal, flood)

with parallel_scenarios as scenarios:
    ## Compare origin:dest routes
    comparison = RouteComparison(origins=homesteads, dests=schools)

    ## Routes that are different between scenarios
    flood_affected = comparison.get_difference(normal, flood)

    ## Point dataframe of all homesteads who are flood affected
    homesteads[homesteads.index.isin(flood_affected['origin_id'].unique())] \
        .to_file(tb_env.tmp_dir / "flood_affected_homesteads.geojson", driver="GeoJSON")

    ## Routes of all flood affected homesteads under normal conditions
    comparison.get_routes(normal, od_pairs=flood_affected) \
        .to_file(tb_env.tmp_dir / "flood_affected_normal_routes.geojson", driver="GeoJSON")

    ## Routes of all flood affected homesteads under flood conditions
    comparison.get_routes(flood, od_pairs=flood_affected) \
        .to_file(tb_env.tmp_dir / "flood_affected_flood_routes.geojson", driver="GeoJSON")
Scenario Comparison output
 [   INFO] Downloading POIDataset homesteads
 [   INFO] Downloading POIDataset schools
 [   INFO] normal: Compiling scenario (MLD)
 [   INFO] normal: Initializing scenario
 [   INFO] normal: Ready for requests
 [   INFO] flood: Compiling scenario (MLD)
 [   INFO] flood: Initializing scenario
 [   INFO] flood: Ready for requests
_static/img/scenario-comparison.png

Scenario Comparison results visualized in QGIS. Flood Affected homesteads (Green), Normal Routes (Pink), Flood Routes (Yellow)

Access Isochrones

Here a route network is extracted from the eSwatini GeoFabrik extract and an AccessIsochrone is calculated using the default car profile. The AccessIsochrone class extends the class of the same name provided by the python-osrm package.

from tebetebe.analysis import AccessIsochrone
from tebetebe.profiles import car
import tebetebe as tb

tb_env = tb.Environment(tmp_dir="./tmp/access_isochrones")
mbabane = (31.1367, -26.3054)

## Initialize scenario with GeoFabrik extract and default car profile
scenario = tb_env.Scenario("./tmp/swaziland-latest.osm.pbf", car)

## Compile and run scenario
with scenario() as api:
    isochrone = AccessIsochrone(api, mbabane, points_grid=1000, size=0.2)
    contours = isochrone.render_contour(10)

    ## Save contours as GeoJSON for visualization with QGIS
    contours.to_file(tb_env.tmp_dir / "contour10.geojson", driver="GeoJSON")
Access Isochrones output
 [   INFO] swaziland-latest_car: Compiling scenario (MLD)
 [WARNING] swaziland-latest_car: Default car profile may not be accurate for your use case
 [   INFO] swaziland-latest_car: Initializing scenario
 [   INFO] swaziland-latest_car: Ready for requests
_static/img/mbabane_isochrone.png

Result visualized over OSM Carto basemap


API Documentation

tb.Scenario

class tebetebe.Scenario.Scenario(osm_dataset, routing_profile, name=None, algorithm='MLD', tmp_dir=PosixPath('/tmp'), overwrite=False, verbose=False, **kwargs)[source]

Bases: object

Scenario is an abstraction of OSRM executables in order to compile, serve, and query a routable network.

A Scenario is initialized with (1) an OSM Dataset and (2) a Routing Profile. When called, Scenario compiles the OSMDataset and RoutingProfile into an OSRM routable network, provides a context manager to serve the scenario’s HTTP API, and returns various methods to query that API.

The HTTP API methods (match, nearest, simple_route, table, trip) are provided by the python-osrm module (https://github.com/ustroetz/python-osrm)

Example

>>> from tebetebe.profiles import foot
>>> import tebetebe as tb
>>>
>>> ## initialize scenario with GeoFabrik extract & default foot profile
>>> scenario = tb.Scenario("./swaziland-latest.osm.pbf", foot)
>>>
>>> with scenario() as api: ## compile scenario and serve HTTP API
>>>     api.simple_route()... ## query HTTP API for a simple route, match, nearest, trip...
Parameters
  • osm_dataset (str / OSMDataset) – OSM dataset from which the route network will be extracted

  • routing_profile (str / RoutingProfile) – Routing profile to be used in scenario

  • name (str, optional) – Scenario name used in output file and log entries. If not supplied, it will be built based upon the OSMDataset and RoutingProfile names

  • algorithm (str, optional) – Algorithm to be used (either “CH” or “MLD”)

  • tmp_dir (str, optional) – Temporary directory to store files generated by osrm binaries

  • overwrite (bool, optional) – Overwrite scenario if .osrm* files already exist. Otherwise, the existing .osrm* files will be used.

  • verbose (bool, optional) – Print output of OSRM compilation

  • **kwargs – To pass any custom arguments to the OSRM executables (“osrm-routed”, “osrm-contract”, …) pass the kwarg “{executable}_args” (ex. “routed_args”, “contract_args”, …) with a dictionary of key:values to be passed
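
To make the “{executable}_args” convention concrete, here is a minimal sketch of how such kwargs could be grouped per executable and turned into command-line flags. How tebetebe performs this mapping internally is an assumption; both helper names are hypothetical. The --max-table-size flag used in the example is an osrm-routed option.

```python
def split_executable_args(kwargs):
    """Group '{executable}_args' kwargs by executable name."""
    suffix = "_args"
    return {key[:-len(suffix)]: dict(value)
            for key, value in kwargs.items() if key.endswith(suffix)}


def to_cli_flags(options):
    """Turn {'max_table_size': 500} into ['--max-table-size', '500']."""
    flags = []
    for key, value in options.items():
        flags += ["--" + key.replace("_", "-"), str(value)]
    return flags


grouped = split_executable_args({"routed_args": {"max_table_size": 500},
                                 "name": "normal"})
print(grouped)
print(to_cli_flags(grouped["routed"]))
```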

get_name()[source]

Return the Scenario name

get_path()[source]

Return the Scenario path

is_alive()[source]

True if Scenario process is running, False otherwise

match(points, steps=False, overview='simplified', geometry='polyline', timestamps=None, radius=None, annotations='false', gaps='split', tidy=False, waypoints=None, url_config=http://localhost:5000/*/v1/driving)

Function wrapping OSRM ‘match’ function, returning the response in JSON

Parameters
  • points (list of tuple/list of point) – A sequence of points as (x, y) where x is longitude and y is latitude.

  • steps (bool, optional) – Default is False.

  • overview (str, optional) – Query for the geometry overview, either “simplified”, “full” or “false” (Default: “simplified”)

  • geometry (str, optional) – Format in which to decode the geometry, either “polyline” (i.e. not decoded), “geojson”, “WKT” or “WKB” (default: “polyline”).

  • timestamps (list of timestamp, optional) –

  • radius (list of float, optional) –

  • annotations (bool, optional) –

  • gaps (str, optional) –

  • tidy (bool, optional) –

  • waypoints (list of tuple/list of point, optional) –

  • url_config (osrm.RequestConfig, optional) – Parameters regarding the host, version and profile to use

Returns

The response from the osrm instance, parsed as a dict

Return type

dict

nearest(coord, number=1, url_config=http://localhost:5000/*/v1/driving)

Function wrapping OSRM ‘nearest’ function, returning the response in JSON

Parameters
  • coord (list/tuple of two floats) – (x, y) where x is longitude and y is latitude

  • number (int, optional) –

  • url_config (osrm.RequestConfig, optional) – Parameters regarding the host, version and profile to use

Returns

result – The response from the osrm instance, parsed as a dict

Return type

dict

simple_route(coord_origin, coord_dest, coord_intermediate=None, alternatives=False, steps=False, output='full', geometry='polyline', overview='simplified', annotations='true', continue_straight='default', url_config=http://localhost:5000/*/v1/driving, send_as_polyline=True)

Function wrapping OSRM ‘viaroute’ function and returning the JSON response with the route_geometry decoded (in WKT or WKB) if needed.

Parameters
  • coord_origin (list/tuple of two floats) – (x, y) where x is longitude and y is latitude

  • coord_dest (list/tuple of two floats) – (x, y) where x is longitude and y is latitude

  • coord_intermediate (list of 2-floats list/tuple) – [(x, y), (x, y), …] where x is longitude and y is latitude

  • alternatives (bool, optional) – Query (and resolve geometry if asked) for alternative routes (default: False)

  • output (str, optional) – Define the type of output (full response or only route(s)), default : “full”.

  • geometry (str, optional) – Format in which to decode the geometry, either “polyline” (i.e. not decoded), “geojson”, “WKT” or “WKB” (default: “polyline”).

  • annotations (str, optional) –

  • continue_straight (str, optional) –

  • overview (str, optional) – Query for the geometry overview, either “simplified”, “full” or “false” (Default: “simplified”)

  • url_config (osrm.RequestConfig, optional) – Parameters regarding the host, version and profile to use

Returns

result – The result, parsed as a dict, with the geometry decoded in the format defined in geometry.

Return type

dict
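
For orientation, the url_config value shown in these signatures follows the OSRM HTTP layout of host/service/version/profile, followed by semicolon-separated lon,lat pairs. The sketch below builds such a route URL with the standard library only. route_url is a hypothetical helper, not part of tebetebe or python-osrm, and since send_as_polyline defaults to True the request actually sent by the wrapper may encode the coordinates differently.

```python
from urllib.parse import urlencode


def route_url(coord_origin, coord_dest, host="http://localhost:5000",
              profile="driving", **options):
    """Build an OSRM v1 'route' request URL from (lon, lat) tuples."""
    coords = ";".join("{},{}".format(lon, lat)
                      for lon, lat in (coord_origin, coord_dest))
    url = "{}/route/v1/{}/{}".format(host, profile, coords)
    return url + ("?" + urlencode(options) if options else "")


simunye = (31.9274, -26.2108)
mbabane = (31.1367, -26.3054)
print(route_url(simunye, mbabane, overview="simplified", steps="false"))
```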

table(coords_src, coords_dest=None, ids_origin=None, ids_dest=None, output='np', minutes=False, annotations='duration', url_config=http://localhost:5000/*/v1/driving, send_as_polyline=True)

Function wrapping the OSRM ‘table’ function to get a time or distance matrix as a numpy array or as a DataFrame

Parameters
  • coords_src (list) – A list of coord as (longitude, latitude), like: list_coords = [(21.3224, 45.2358), (21.3856, 42.0094), (20.9574, 41.5286)] (coords have to be float)

  • coords_dest (list, optional) – A list of coord as (longitude, latitude), like: list_coords = [(21.3224, 45.2358), (21.3856, 42.0094), (20.9574, 41.5286)] (coords have to be float)

  • ids_origin (list, optional) – A list of name/id to use to label the source axis of the result DataFrame (default: None).

  • ids_dest (list, optional) – A list of name/id to use to label the destination axis of the result DataFrame (default: None).

  • output (str, optional) – The type of annotated matrix to return:

    ’raw’ for the (parsed) json response from OSRM;
    ‘pandas’, ‘df’ or ‘DataFrame’ for a DataFrame;
    ‘numpy’, ‘array’ or ‘np’ for a numpy array (default is “np”).

  • annotations (str, optional) – Either ‘duration’ (default) or ‘distance’

  • url_config (osrm.RequestConfig, optional) – Parameters regarding the host, version and profile to use

Returns

  • if output==’raw’ – a dict, the parsed json response.

  • if output==’np’ – a numpy.ndarray containing the time in minutes, a list of snapped origin coordinates, and a list of snapped destination coordinates.

  • if output==’pandas’ – a labeled DataFrame containing the time matrix in minutes, a list of snapped origin coordinates, and a list of snapped destination coordinates.
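
As an illustration of what the labeled output adds over the raw response, the sketch below attaches origin and destination ids to the "durations" matrix of a raw OSRM table response (seconds, with null for unreachable pairs) and optionally converts to minutes. label_durations is a hypothetical helper and the sample values are made up; the real wrapper returns numpy or pandas objects rather than nested dicts.

```python
def label_durations(raw, ids_origin, ids_dest, minutes=False):
    """Label the 'durations' matrix of a raw OSRM table response."""
    scale = 60.0 if minutes else 1.0
    return {
        origin: {dest: (cell / scale if cell is not None else None)
                 for dest, cell in zip(ids_dest, row)}
        for origin, row in zip(ids_origin, raw["durations"])
    }


# Made-up response: two origins, two destinations, one unreachable pair
raw = {"durations": [[0.0, 600.0], [1200.0, None]]}
labeled = label_durations(raw, ["h1", "h2"], ["s1", "s2"], minutes=True)
print(labeled)
```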

trip(steps=False, output='full', geometry='polyline', overview='simplified', roundtrip=True, source='any', destination='any', annotations='false', url_config=http://localhost:5000/*/v1/driving, send_as_polyline=True)

Function wrapping OSRM ‘trip’ function and returning the JSON response with the route_geometry decoded (in WKT or WKB) if needed.

Parameters
  • coord_origin (list/tuple of two floats) – (x ,y) where x is longitude and y is latitude

  • steps (bool, default False) –

  • output (str, default 'full') – Define the type of output (full response or only route(s))

  • geometry (str, optional) – Format in which to decode the geometry, either “polyline” (i.e. not decoded), “geojson”, “WKT” or “WKB” (default: “polyline”).

  • overview (str, optional) – Query for the geometry overview, either “simplified”, “full” or “false” (Default: “simplified”)

  • roundtrip (bool, optional) –

  • source (str, optional) –

  • destination (str, optional) –

  • annotations (str, optional) –

  • url_config (osrm.RequestConfig, optional) – Parameters regarding the host, version and profile to use

Returns

  • if ‘only_index’ – a dict containing respective indexes of trips and waypoints

  • if ‘raw’ – the original json returned by OSRM

  • if ‘WKT’ – the json returned by OSRM with the ‘route_geometry’ converted to WKT format

  • if ‘WKB’ – the json returned by OSRM with the ‘route_geometry’ converted to WKB format

tb.OSMDataset

class tebetebe.OSMDataset.OSMDataset(osm_path, name=None, **kwargs)[source]

Bases: object

OSM data file from which a route network will be extracted

Parameters
  • osm_path – Path to *.osm{.pbf} dataset

  • name (str, optional) – Name of OSMDataset. If not provided, the .osm filename is used.

classmethod from_overpass(query, name=None, overwrite=False, tmp_dir=PosixPath('/tmp'), **kwargs)[source]

Initialize an OSMDataset by downloading result of an overpass query and saving as .osm

Parameters
  • query (str) – Query to be sent to the Overpass API. This query should not include an output format setting (e.g. [out:xml];)

  • name (str) – Name of the route network

  • overwrite (bool) – Overwrite route network if it already exists on disk

  • tmp_dir (str) – Temporary directory to save route network

Returns

Return type

OSMDataset

get_name()[source]

Return route network name

get_path()[source]

Return route network path

tb.RoutingProfile

class tebetebe.RoutingProfile.RoutingProfile(lua_path, name=None, default=False, **kwargs)[source]

Bases: object

A RoutingProfile is a configuration script which represents a routing behaviour, such as for bike or car routing. It describes whether or not to traverse a particular type of way or node in OSM data, and the speed at which those elements are traversed.

Check out the osrm-backend wiki for more information! https://github.com/Project-OSRM/osrm-backend/wiki/Profiles

Parameters
  • lua_path (str) – Path to .lua configuration script

  • name (str, optional) – Name of routing profile. If not provided, the .lua filename is used.

  • default (bool, optional) – If RoutingProfile is a default profile

get_name()[source]

Return name of routing profile

get_path()[source]

Return path of routing profile

is_default()[source]

Return whether the routing profile is a default profile

tb.POIDataset

class tebetebe.POIDataset.POIDataset(*args, name=None, **kwargs)[source]

Bases: geopandas.geodataframe.GeoDataFrame

Extension of a GeoDataFrame which stores Points only, to be used as origin, destination, and waypoints in routing.

Parameters

name (str) – Name of POIDataset (required)

classmethod from_features(features, name=None, **kwargs)[source]

Initialize POIDataset from GeoJSON features

classmethod from_file(path, name=None, **kwargs)[source]

Initialize POIDataset from file. If no name is given, the filename will be used

classmethod from_overpass(query, name=None, overwrite=False, tmp_dir=PosixPath('/tmp'), **kwargs)[source]

Initialize POIDataset from Overpass API query. Any returned nodes or closed ways with center attributes will be included in the dataset.

Parameters
  • query (str) – Query to be sent to the Overpass API. This query should not include an output format setting (e.g. [out:json];)

  • name (str) – Name of the POI dataset

  • overwrite (bool) – Overwrite POIDataset if it already exists on disk

  • tmp_dir (str) – Temporary directory to save POIDataset
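
The rule that nodes and ways with center attributes become points can be sketched against the Overpass JSON layout, where nodes carry lat/lon directly and ways queried with "out center" carry a nested center object. This is an illustration only: elements_to_points is a hypothetical helper, the sample ids and coordinates are made up, and how POIDataset parses the response internally is an assumption.

```python
def elements_to_points(overpass_json):
    """Return [(id, lon, lat)] for nodes and for ways that carry a center."""
    points = []
    for el in overpass_json.get("elements", []):
        if el.get("type") == "node" and "lat" in el and "lon" in el:
            points.append((el["id"], el["lon"], el["lat"]))
        elif el.get("type") == "way" and "center" in el:
            center = el["center"]
            points.append((el["id"], center["lon"], center["lat"]))
    return points


# Made-up response fragment in the Overpass JSON layout
sample = {"elements": [
    {"type": "node", "id": 1, "lat": -26.30, "lon": 31.13},
    {"type": "way", "id": 2, "center": {"lat": -26.21, "lon": 31.92}},
    {"type": "way", "id": 3, "nodes": [10, 11, 12]},   # no center: skipped
]}
print(elements_to_points(sample))
```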

get_name()[source]

Return POI dataset name

tb.Environment

class tebetebe.Environment.Environment(**kwargs)[source]

Bases: object

Environment is a convenience class to set default options for the tebetebe base classes.

Many tebetebe operations write files to disk so that they can be passed to the OSRM executables. Setting a uniform tmp_dir for a particular set of commands keeps all of the files they generate in the same place.

In addition to tmp_dir, any **kwarg passed to Environment will be passed to all of the classes available under the Environment. Useful examples of these kwargs include verbose and overwrite

Any **kwargs set on the Environment are overridden when the same argument is passed explicitly while initializing one of its classes.

Parameters

**kwargs – Arbitrary keyword arguments to be passed to each class available under the Environment
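
The override behaviour described above amounts to merging two sets of keyword arguments, with call-time values taking precedence. A minimal sketch of that idea follows; DefaultsSketch is a hypothetical stand-in and not the Environment implementation.

```python
class DefaultsSketch:
    """Hypothetical stand-in for Environment: stores default kwargs."""

    def __init__(self, **defaults):
        self.defaults = defaults

    def resolve(self, **overrides):
        """Merge defaults with call-time kwargs; call-time values win."""
        return {**self.defaults, **overrides}


env = DefaultsSketch(tmp_dir="./tmp/simple_scenario", verbose=False)
print(env.resolve())               # defaults only
print(env.resolve(verbose=True))   # explicit argument overrides the default
```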

tb.OSRM

class tebetebe.OSRM.OSRM(verbose=False)[source]

Bases: object

Base class wrapper around all the osrm-* binaries. All functions accept additional **kwargs to be passed upon execution. This class is used in the background and need not be initialized directly.

Parameters

verbose (bool) – Output stdout from osrm commands.

contract(osrm_file, **kwargs)[source]

Call osrm-contract on a .osrm file

customize(osrm_file, **kwargs)[source]

Call osrm-customize on a .osrm file

extract(osm_path, profile_path, **kwargs)[source]

Call osrm-extract with a path to the osm route network and lua profile.

get_version()[source]

Return OSRM binaries version

partition(osrm_file, **kwargs)[source]

Call osrm-partition for a .osrm file

routed(osrm_file, ready_callback, done_callback, verbose=False, **kwargs)[source]

Call osrm-routed on a .osrm file

Parameters
  • osrm_file (str) – Path to *.osrm

  • ready_callback (function) – Function to be called when osrm-routed is ready for HTTP requests

  • done_callback (function) – Function to be called when osrm-routed has exited

  • verbose (bool) – osrm-routed output is so verbose it is default off even if the parent class verbose=True. Set this to true if you want to see osrm-routed output anyway.

  • **kwargs – Any additional parameters to be passed to osrm-routed
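
The methods above map onto the two standard OSRM preprocessing pipelines: extract followed by contract for CH, and extract followed by partition and customize for MLD, with osrm-routed serving the result in either case. The sketch below only lists the executables in order; compile_steps is a hypothetical helper, and the assumption that Scenario runs them in exactly this way is not confirmed by this documentation.

```python
def compile_steps(algorithm="MLD"):
    """Return the osrm-* executables run, in order, to build a network."""
    pipelines = {
        "CH": ["osrm-extract", "osrm-contract"],
        "MLD": ["osrm-extract", "osrm-partition", "osrm-customize"],
    }
    try:
        return pipelines[algorithm.upper()]
    except KeyError:
        raise ValueError("algorithm must be 'CH' or 'MLD'") from None


for algo in ("CH", "MLD"):
    print(algo, "->", " -> ".join(compile_steps(algo) + ["osrm-routed"]))
```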


Analysis Plugins

Analysis plugins are classes written on top of the tebetebe base classes to automate routing analysis

tb.analysis.AccessIsochrone

class tebetebe.analysis.AccessIsochrone.AccessIsochrone(scenario, point_origin, points_grid=500, size=0.4)[source]

Bases: osrm.extra.AccessIsochrone

Compute an access isochrone from an origin point with a given ScenarioAPI

Parameters
  • scenario (Scenario) – Scenario to be queried for access isochrone

  • point_origin (2-floats tuple) – The coordinates of the center point to use as (x, y).

  • points_grid (int) – The number of points of the underlying grid to use.

  • size (float) – Search radius (in wgs84 degrees)

get_center()[source]

Return center point used in isochrone calculations

get_durations()[source]

Return durations table retrieved from OSRM

get_grid()[source]

Return GeoDataFrame of grid used in duration calculations, snapped to the road network

render_contour(n_levels)[source]

Return GeoDataFrame of MultiPolygon contours for a specified number of levels
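
To give a feel for how points_grid and size interact, the sketch below lays out a square grid of roughly points_grid sample points around the origin, treating size as a half-width in degrees. This is an assumption made for illustration; the grid actually built by python-osrm may be shaped or sized differently, and square_grid is a hypothetical helper.

```python
import math


def square_grid(center, size, n_points):
    """Return about n_points (lon, lat) tuples on a square grid of half-width size."""
    side = max(2, round(math.sqrt(n_points)))   # points per grid edge
    step = 2 * size / (side - 1)                # spacing in degrees
    cx, cy = center
    return [(cx - size + i * step, cy - size + j * step)
            for j in range(side) for i in range(side)]


mbabane = (31.1367, -26.3054)
grid = square_grid(mbabane, size=0.2, n_points=1000)
print(len(grid), grid[0])
```

Each grid point would then be used as a destination in a table query from the origin, and the resulting durations contoured into isochrone polygons.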

tb.analysis.ParallelScenarios

class tebetebe.analysis.ParallelScenarios.ParallelScenarios(*args)[source]

Bases: object

Context manager for serving multiple scenarios in parallel. This can be useful for comparing routing analysis from multiple scenarios side by side.

Parameters

*args – Arbitrary number of scenarios to be compiled, then executed in parallel

tb.analysis.RouteComparison

class tebetebe.analysis.RouteComparison.RouteComparison(origins, dests, origins_id_col=None, dests_id_col=None, cache=True)[source]

Bases: object

Compare route differences between scenarios

Parameters
  • origins (POIDataset) – Points to be used as the origins in route comparison

  • dests (POIDataset) – Points to be used as the destinations in route comparison

  • origins_id_col (str, optional) – Column in origins dataset to be used as ID. Must be unique. If not specified, the index will be used

  • dests_id_col (str, optional) – Column in dests dataset to be used as ID. Must be unique. If not specified, the index will be used

  • cache (bool, optional) – Whether to cache time matrices

get_difference(scenario0, scenario1)[source]

Get DF of origin:dest pairs whose routes differ between scenarios
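
One plausible way to decide whether a route differs between two scenarios is to compare the durations of each origin:dest pair. The sketch below does this with plain dicts; the criterion tebetebe actually uses is not stated here, so both the comparison rule and the helper name split_pairs are assumptions, and the sample ids and durations are made up.

```python
def split_pairs(durations0, durations1, tolerance=0.0):
    """Split origin:dest pairs into (different, same) by duration."""
    different, same = [], []
    for pair in sorted(durations0.keys() & durations1.keys()):
        d0, d1 = durations0[pair], durations1[pair]
        if d0 is None or d1 is None:
            # None marks an unreachable pair; it differs from any real duration
            changed = (d0 is None) != (d1 is None)
        else:
            changed = abs(d0 - d1) > tolerance
        (different if changed else same).append(pair)
    return different, same


# Made-up durations in seconds, keyed by (origin_id, dest_id)
normal = {("h1", "s1"): 600.0, ("h2", "s1"): 900.0, ("h3", "s1"): 300.0}
flood = {("h1", "s1"): 600.0, ("h2", "s1"): 1500.0, ("h3", "s1"): None}
different, same = split_pairs(normal, flood)
print("different:", different)
print("same:", same)
```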

get_duration_matrix(scenario, melted=False)[source]

Calculate a duration matrix between origins and dests

Parameters
  • scenario (Scenario) – Scenario on which to calculate the duration matrix

  • melted (bool, optional) – Whether the matrix should be returned as a matrix DF, or melted into origin_id, dest_id, and duration columns

Returns

Duration Matrix DataFrame

Return type

pd.DataFrame
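
The difference between the matrix form and the melted form can be shown without pandas. The sketch below flattens a nested origin-to-destination mapping into one row per pair, using the column names given in the melted parameter description; melt_matrix is a hypothetical helper and the durations are made up.

```python
def melt_matrix(matrix):
    """Flatten {origin: {dest: duration}} into a list of row dicts."""
    return [{"origin_id": origin, "dest_id": dest, "duration": duration}
            for origin, row in matrix.items()
            for dest, duration in row.items()]


# Made-up 2x2 duration matrix (origins as outer keys, dests as inner keys)
matrix = {"h1": {"s1": 10.0, "s2": 25.0},
          "h2": {"s1": 40.0, "s2": 5.0}}
for row in melt_matrix(matrix):
    print(row)
```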

get_duration_table(*args)[source]

Get route duration table for multiple scenarios

Parameters

*args – Arbitrary number of scenarios to run duration table

Returns

DF of origin:dest pairs and durations for each scenario

Return type

DataFrame

get_routes(scenario, od_pairs=None)[source]

Get routes between origins and dests

Parameters
  • scenario (Scenario) – Scenario on which the origin:dest routes will be calculated

  • od_pairs (pd.DataFrame, optional) – DataFrame of origin:dest pairs in two columns origin_id and dest_id for calculation. If not specified, all pairwise origin:dest routes will be calculated.

Returns

GDF with duration, distance, and route geometry for each origin:dest pair

Return type

GeoDataFrame

get_same(scenario0, scenario1)[source]

Get DF of origin:dest pairs whose routes are the same between scenarios