GTFS SQL examples
Introduction
gtfs-sql-importer is a tool that can import GTFS file into postgis-enabled database. Check out documentation before how to import it. Once you import both GTFS into your database and import your .PBF using osm2pgsql, you can query them both and extract interesting information.
Examples here assume that you imported GTFS into gtfs
schema and that osm2pgsql is in public
(this is default behaviour).
Examples
Merging all stops in GTFS and OSM and getting their distance
Tries to match stop_id from GTFS and ref
from OSM to get stop from OSM. Ideally, it should match gtfs_id, not ref.
WITH belgrade_osm_stops AS (
-- filter only Belgrade (from serbia.pbf)
SELECT *
FROM planet_osm_point
WHERE (highway='bus_stop' OR railway='tram_stop')
AND st_transform(way, 4326) && ST_MakeEnvelope(20.1, 44.63, 20.77, 45.01, 4326)
),
joined AS (
-- join stop_id from GTFS and ref from OSM
SELECT *,
st_transform(osm_stops.way, 4326) points4326,
st_distance(ST_Transform(stops.the_geom::geometry, 3857), osm_stops.way) AS distance
FROM gtfs.stops stops
LEFT JOIN belgrade_osm_stops osm_stops ON stops.stop_id=osm_stops.ref
)
SELECT stop_id, stop_name, the_geom AS gtfs_geo, osm_id, name AS osm_name, ref, points4326, distance
FROM joined
ORDER BY distance
All stops from GTFS missing in OSM in Belgrade
Building on top of query above, this one find only stops where there is no match with OSM and those stops can be exported to CSV and
imported further into QGIS/JOSM to figure out is there are really missing stops or just ref
tag is missing.
WITH belgrade_osm_stops AS (
-- filter only Belgrade (from serbia.pbf)
SELECT *
FROM planet_osm_point
WHERE (highway='bus_stop' OR railway='tram_stop')
AND st_transform(way, 4326) && ST_MakeEnvelope(20.1, 44.63, 20.77, 45.01, 4326)
),
joined AS (
-- join with GTFS data
SELECT *,
st_distance(ST_Transform(stops.the_geom::geometry, 3857), osm_stops.way) AS distance
FROM gtfs.stops stops
LEFT JOIN belgrade_osm_stops osm_stops ON stops.stop_id=osm_stops.ref
),
missing_stops AS (
-- filter only those that do not exist
SELECT stop_id, stop_name, the_geom AS gtfs_geo
FROM joined
WHERE distance IS null
),
stops_with_routes AS (
-- join with trip and routes, so we get all lines that goes through those missing stops
SELECT ms.stop_id, ms.stop_name, ms.gtfs_geo, string_agg(DISTINCT r.route_short_name, ',') AS routes
FROM gtfs.trips t
INNER JOIN gtfs.stop_times st ON st.trip_id = t.trip_id
INNER JOIN missing_stops ms ON ms.stop_id = st.stop_id
INNER JOIN gtfs.routes r ON t.route_id = r.route_id
GROUP BY ms.stop_id, ms.stop_name, ms.gtfs_geo
)
SELECT stop_id, stop_name, ST_X(gtfs_geo) AS long, ST_Y (gtfs_geo) AS lat, routes
FROM stops_with_routes