Antonio Cotroneo
Mar 30, 2021

Analyze OpenStreetMap Data with OSMnx and OmniSci Free

NOTE: OMNISCI IS NOW HEAVY.AI

Earlier this month, our team shared 4 Simple Ways to Map Vehicle Location Data with OmniSci Free.

In that post, we constructed an Enriched Streets map from the geospatial analysis results of OmniSci's recently enhanced spatial relationship functions.

We're following up to show you how we access and load OpenStreetMap (OSM) data into OmniSci Free with Python, enrich streets with vehicle collision data, and deliver meaningful spatial insights.

In this post, you'll learn how to:

  • Access OSM street network data via OSMnx, a Python package used to retrieve, model, analyze, and visualize street networks from OpenStreetMap.
  • Load OSM data into OmniSci with pymapd, a Python DB API-compliant interface for OmniSci.
  • Construct buffer geometries around OSM street network segments and geospatially enrich them with ten years' worth of collision data using SQL.

Access OSM Street Network Data via OSMnx

Acquiring street network data can be a costly affair. Either your organization pays a premium for proprietary datasets, or you pay with the time it takes to track down data curated by a governing body or a local municipality.

OpenStreetMap, on the other hand, is freely available and obtainable in a variety of ways. You can download extracts from sites like Geofabrik or use tools like osm2pgsql to load data directly into PostGIS-enabled databases.

In this example, we use OSMnx and OmniSci's Data Science Foundation to access an analysis-ready cut of Los Angeles street network data. OmniSci provides deep integration with JupyterLab. Users can access JupyterLab by clicking an icon in Immerse or sending SQL queries within the SQL Editor directly to a notebook.

To begin, build a new JupyterLab connection and import the required packages. 


Connect to JupyterLab button in Immerse



JupyterLab's connection to the database is instantaneous on launch.


Ibis connection established upon launch of JupyterLab integration


Alternatively, a user could connect to an OmniSci instance using pymapd or ibis in a local JupyterLab or Jupyter Notebook environment.
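For the local route, a pymapd connection could look roughly like the sketch below. It assumes pymapd is installed and falls back to OmniSci's out-of-the-box credentials and port; the OMNISCI_* environment variable names are hypothetical and only there for illustration.

```python
import os


def connection_settings(env=os.environ):
    """Collect connection parameters, falling back to OmniSci's defaults.

    The OMNISCI_* variable names are hypothetical; use whatever your
    deployment provides.
    """
    return {
        "user": env.get("OMNISCI_USER", "admin"),
        "password": env.get("OMNISCI_PASSWORD", "HyperInteractive"),
        "host": env.get("OMNISCI_HOST", "localhost"),
        "port": int(env.get("OMNISCI_PORT", "6274")),
        "dbname": env.get("OMNISCI_DB", "omnisci"),
    }


def connect_to_omnisci(settings=None):
    """Open a pymapd connection (imported lazily so this file loads without it)."""
    from pymapd import connect

    return connect(**(settings or connection_settings()))


# Usage, with a running OmniSci instance:
# con = connect_to_omnisci()
# print(con.get_tables())
```

Keeping the settings in one small function makes it easy to point the same notebook at a different instance without editing the cells that follow.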


Next, use a short snippet of code to load the street network data for the area of interest. OSMnx offers the ability to access street networks by providing one of the following:

  • bounding box
  • latitude/longitude plus a distance
  • address plus a distance
  • street network boundary polygon
  • place name or list of place names

Apply the place name and drive network type options to isolate the Los Angeles metro's drivable streets.


OSMnx's plot_graph function is a great way to preview the streets once they've loaded.
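The load and preview steps could be sketched as follows. The place-name string is an assumption (the exact query used for the Los Angeles metro isn't given here), and the plotting options are just one reasonable choice for a dense network.

```python
PLACE = "Los Angeles, California, USA"  # assumed place-name query
NETWORK_TYPE = "drive"                  # drivable streets only


def load_street_network(place=PLACE, network_type=NETWORK_TYPE):
    """Download a place's street network as a NetworkX MultiDiGraph."""
    import osmnx as ox  # imported lazily; the download needs network access

    return ox.graph_from_place(place, network_type=network_type)


def summarize(G):
    """Report the size of the graph before plotting or converting it."""
    return {"nodes": G.number_of_nodes(), "edges": G.number_of_edges()}


def preview(G):
    """Plot the edges only; hiding nodes keeps a large network readable."""
    import osmnx as ox

    return ox.plot_graph(G, node_size=0, edge_linewidth=0.3)


# Usage:
# G = load_street_network()
# print(summarize(G))
# fig, ax = preview(G)
```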


Los Angeles street network edges


pymapd loads data into a database table from a pandas DataFrame, but OSMnx's graph module retrieves spatial network data and models it as a NetworkX MultiDiGraph. To bridge the gap, convert the graph nodes and edges to GeoPandas GeoDataFrames using the package's graph utility functions and drop the unwanted columns.
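Here is a minimal sketch of the conversion for the edges (the nodes work the same way). The list of columns to keep is an assumption, not the exact set used for this post. One detail worth handling: OSMnx can store several values in a list for a single attribute when it merges street segments, so the sketch flattens those to text before loading.

```python
KEEP_COLUMNS = ["osmid", "name", "highway", "oneway", "length", "geometry"]  # assumed subset
TEXT_COLUMNS = ["osmid", "name", "highway"]  # attributes that may hold lists


def flatten(value, sep="; "):
    """Turn list-valued attributes into text and missing values into None."""
    if isinstance(value, (list, tuple)):
        return sep.join(str(item) for item in value)
    if value is None or value != value:  # None or NaN marks a missing tag
        return None
    return str(value)


def edges_to_frame(G, keep=KEEP_COLUMNS, text_columns=TEXT_COLUMNS):
    """Convert graph edges to a GeoDataFrame and drop unwanted columns."""
    import osmnx as ox

    edges = ox.graph_to_gdfs(G, nodes=False, edges=True).reset_index()
    edges = edges[[col for col in keep if col in edges.columns]].copy()
    for col in text_columns:
        if col in edges.columns:
            edges[col] = edges[col].map(flatten)
    return edges


# Usage:
# edges = edges_to_frame(G)
# edges.head()
```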


Check the coordinate reference system (CRS) to ensure it's in an acceptable format, such as EPSG 4326, before ingesting the streets into OmniSci. 
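A small guard like the one below can make the check explicit. It is a sketch that relies only on the GeoDataFrame's `crs` attribute and `to_crs` method, and it reprojects only when the frame isn't already in EPSG 4326.

```python
WGS84_EPSG = 4326


def ensure_wgs84(gdf):
    """Return the frame in EPSG:4326, reprojecting only when needed."""
    if gdf.crs is None:
        raise ValueError("GeoDataFrame has no CRS; set one before loading")
    if gdf.crs.to_epsg() != WGS84_EPSG:
        gdf = gdf.to_crs(epsg=WGS84_EPSG)
    return gdf


# Usage:
# edges = ensure_wgs84(edges)
# print(edges.crs)
```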

If everything looks good, it's go-time!



Street network edges GeoDataFrame CRS


Create a table to match the GeoDataFrame using Immerse's SQL Editor.
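As a sketch, the snippet below prints a CREATE TABLE statement you could paste into the SQL Editor. The column names and types are hypothetical and should mirror whatever columns you kept in the GeoDataFrame; `osmid` is declared as text here on the assumption that merged segments may carry several ids.

```python
# Hypothetical column list; adjust it to the columns kept in the GeoDataFrame.
LA_STREETS_COLUMNS = {
    "osmid": "TEXT ENCODING DICT(32)",
    "street_name": "TEXT ENCODING DICT(32)",
    "highway": "TEXT ENCODING DICT(32)",
    "oneway": "BOOLEAN",
    "length_m": "DOUBLE",
    "omnisci_geo": "GEOMETRY(LINESTRING, 4326)",
}


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from a name -> SQL type mapping."""
    body = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {table} (\n  {body}\n);"


print(create_table_sql("la_streets", LA_STREETS_COLUMNS))
```

Declaring the geometry column with SRID 4326 keeps the table consistent with the CRS checked in the previous step.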


Finally, use pymapd's load_table_columnar function to load the streets into the la_streets table.
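The load itself might look like the sketch below. It rests on two assumptions worth checking against your pymapd version: that the geometry can be sent as WKT text, and that the DataFrame columns are matched to the table by position. The table column names and the rename mapping are hypothetical.

```python
# Hypothetical table column order; the columnar load is treated as positional,
# so the DataFrame columns are arranged to line up with the table definition.
TABLE_COLUMNS = ["osmid", "street_name", "highway", "oneway", "length_m", "omnisci_geo"]
RENAMES = {"name": "street_name", "length": "length_m"}  # OSMnx name -> table name


def missing_columns(frame_columns, table_columns=TABLE_COLUMNS):
    """List table columns that the DataFrame does not provide."""
    present = set(frame_columns)
    return [col for col in table_columns if col not in present]


def load_streets(con, edges, table="la_streets"):
    """Load a street-edge GeoDataFrame through an open pymapd connection."""
    import pandas as pd

    frame = pd.DataFrame(edges.drop(columns="geometry")).rename(columns=RENAMES)
    # Assumption: geometry is sent as WKT text and parsed by the server.
    frame["omnisci_geo"] = [geom.wkt for geom in edges.geometry]
    missing = missing_columns(frame.columns)
    if missing:
        raise ValueError(f"DataFrame lacks columns: {missing}")
    con.load_table_columnar(table, frame[TABLE_COLUMNS], preserve_index=False)
    return len(frame)


# Usage, given an open connection `con` and an edges GeoDataFrame:
# print(load_streets(con, edges), "rows loaded")
```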


Los Angeles streets in Immerse’s LineMap Chart


Construct Street Buffers and Geospatially Enrich Them with Ten Years of Collision Data

Apply geometry and spatial relationship functions to buffer the Los Angeles street network data and spatially enrich those buffers to better understand the following statistics per road segment:

  • number of individuals who perished
  • number of injuries
  • average party count of collisions
  • number of severe injuries
  • number of collisions, deaths, and injuries involving:
    - pedestrians
    - bicycles
    - motorcycles
    - trucks
  • number of collisions that involved alcohol

A geometry buffer creates a buffer polygon around the input geometry at a specified distance in meters. In this exercise, we buffer each street segment by 10 meters, providing a catchment area for the collisions and supporting effective visualization post-analysis.
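A buffer query along these lines is one way to express that step. Treat it as a sketch: it assumes an ST_Buffer function that takes a geometry and a distance, with the distance read in meters as described above, and the table and column names are hypothetical.

```python
def buffer_sql(source="la_streets", distance_m=10, geo_col="omnisci_geo"):
    """Build a query that buffers every street segment by distance_m.

    Assumes ST_Buffer(geometry, distance) is available; table and column
    names are hypothetical.
    """
    return (
        f"SELECT osmid, ST_Buffer({geo_col}, {float(distance_m)}) AS street_buffer\n"
        f"FROM {source}"
    )


print(buffer_sql())
```

Keeping the distance as a parameter makes it easy to try a wider catchment, for example on multi-lane arterials where collisions may be recorded farther from the centerline.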


Spatial relationships depend on geometry locations and their topological or distance relationship with one another. For instance, one may want to generate summary statistics for points that intersect with a set of polygons. 
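To make the idea concrete, here is a tiny pure-Python illustration of counting points per polygon with a ray-casting test. It only demonstrates the concept on made-up coordinates; it is not how OmniSci evaluates spatial relationships.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices, not closed."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_polygon(points, polygons):
    """Summary statistic: number of points falling inside each polygon."""
    return {
        name: sum(point_in_polygon(p, ring) for p in points)
        for name, ring in polygons.items()
    }


polygons = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "B": [(5, 0), (9, 0), (9, 4), (5, 4)],
}
points = [(1, 1), (2, 3), (6, 2), (10, 10)]
print(count_points_per_polygon(points, polygons))  # {'A': 2, 'B': 1}
```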

Measure the relationship between the Los Angeles street buffers and ten years of California Statewide Integrated Traffic Records System (SWITRS) collisions data by employing the ST_CONTAINS function. 

ST_CONTAINS returns true if the first stated geometry object contains the second object. To complete the enrichment, aggregate a mix of collision metrics over the collisions that each street segment buffer contains.
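The enrichment query could be assembled roughly as below. The collision column names and both table names are hypothetical placeholders, so check them against your own SWITRS table; only a handful of the metrics listed earlier are shown.

```python
# Hypothetical SWITRS column names; check them against your collisions table.
METRICS = {
    "collisions": "COUNT(*)",
    "killed": "SUM(c.killed_victims)",
    "injured": "SUM(c.injured_victims)",
    "avg_party_count": "AVG(c.party_count)",
    "severe_injuries": "SUM(c.severe_injury_count)",
    "alcohol_collisions": "SUM(CASE WHEN c.alcohol_involved = 1 THEN 1 ELSE 0 END)",
}


def enrichment_sql(metrics=METRICS, buffers="la_street_buffers",
                   collisions="switrs_collisions"):
    """Aggregate collision metrics per street buffer using ST_Contains."""
    select_list = ",\n  ".join(f"{expr} AS {alias}" for alias, expr in metrics.items())
    return (
        f"SELECT\n  s.osmid,\n  {select_list}\n"
        f"FROM {buffers} AS s, {collisions} AS c\n"
        "WHERE ST_Contains(s.omnisci_geo, c.omnisci_geo)\n"
        "GROUP BY s.osmid"
    )


print(enrichment_sql())

# Usage, given an open pymapd connection `con`:
# import pandas as pd
# enriched = pd.read_sql(enrichment_sql(), con)
```

Further metrics, such as the pedestrian, bicycle, motorcycle, and truck breakdowns, can be added as extra entries in the mapping without touching the rest of the query.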



Bonus: Summarize 550 million+ Vehicle Locations by Census Block Groups

If you've stayed with us this far, we're throwing in a bonus analysis. Let's quickly run through the following:

  • Gather and ingest Los Angeles, California Census Block Groups with Immerse.
  • Summarize 550 million+ vehicle locations by census block groups with SQL.

Drag and drop a Los Angeles metro Census Block Groups shapefile into the Immerse Data Manager's Data Importer using the import data from a local file option.

Once the block groups are loaded, perform a similar analysis to the Los Angeles streets using spatial relationship functions.
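For the block groups, the summary reduces to a single count per polygon. In this sketch the table names, the `geoid` identifier column, and the point geometry column on the vehicle table are all hypothetical.

```python
def block_group_counts_sql(block_groups="la_block_groups",
                           vehicles="vehicle_locations"):
    """Count vehicle locations inside each census block group."""
    return (
        "SELECT bg.geoid, COUNT(*) AS vehicle_count\n"
        f"FROM {block_groups} AS bg, {vehicles} AS v\n"
        "WHERE ST_Contains(bg.omnisci_geo, v.omnisci_geo)\n"
        "GROUP BY bg.geoid"
    )


print(block_group_counts_sql())
```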


This bonus analysis produces the Summarized by Census Block Group map featured in the 4 Simple Ways to Map Vehicle Location Data with OmniSci Free post.


The bonus analysis highlights the performance of OmniSci's recently enhanced spatial relationship functions against the scale of high-volume vehicle location information. 

Now it's up to you!

Deploy OmniSci Free and try out this workflow for yourself. If you do, let us know your thoughts through LinkedIn, Twitter, or our Community Forums.


Antonio Cotroneo

Antonio Cotroneo is the Director of Product Marketing at HEAVY.AI. He has spent his career helping people around the world maximize their geospatial data, mapping technology, and spatial analyses to make critical decisions for their customers and community. He currently lives in Charlotte, NC with his wife and two children.