Openstreetmap h3
openstreetmap_h3 is a high-performance tool for importing OSM PBF files into PostGIS databases or into the Big Data ecosystem via the Apache Arrow and Apache Parquet data formats. The project splits planet dump geodata by H3 index into many partitions to simplify worldwide geodata analysis, aggregation, and routing tasks. The attributes of H3, a hexagonal hierarchical geospatial indexing system, allow fast data partitioning, joins, and aggregation at H3 resolutions 3 and 8. The resulting dataset contains the following objects: nodes, ways, and relations. There are tables only for the current data, not for historic data.
More information is available in the project repository, openstreetmap_h3.
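The two-level partitioning idea described above can be sketched in plain Scala, without any H3 library. This is only an illustration under stated assumptions: the `Keyed` row type, the object name, and the key values are hypothetical, and the real tool computes the `h33` and `h38` keys from coordinates itself.

```scala
object H3PartitionSketch {
  // Hypothetical row: a payload id plus the two H3-derived key columns.
  final case class Keyed(id: Long, h33: Short, h38: Int)

  // Coarse step: split rows into independent partitions by the resolution-3 key.
  def partition(rows: Seq[Keyed]): Map[Short, Seq[Keyed]] =
    rows.groupBy(_.h33)

  // Fine step: inside each partition, count rows per resolution-8 key.
  def countsPerCell(rows: Seq[Keyed]): Map[Short, Map[Int, Int]] =
    partition(rows).map { case (h33, part) =>
      h33 -> part.groupBy(_.h38).map { case (h38, cell) => h38 -> cell.size }
    }
}
```

Because each coarse partition is processed on its own, the per-cell aggregation never has to look at rows from another partition, which is the property that makes partition-wise processing of a planet dump practical.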
What is openstreetmap_h3
- How to put the whole world in a regular laptop: PostgreSQL and OpenStreetMap (overview)
- Roads and building density in North America: processing 100 GB of OSM geodata in PostgreSQL (with columnar storage provided by the open source Citus extension)
- «Divide and Conquer» for OpenStreetMap world inside PostgreSQL: details about the data partitioning approach
Apache Parquet schemas
Node:
scala> spark.read.parquet("/home/geo/arrow/nodes/*.parquet").printSchema
root
 |-- id: long (nullable = true)
 |-- h33: short (nullable = true)
 |-- h38: integer (nullable = true)
 |-- latitude: double (nullable = true)
 |-- longitude: double (nullable = true)
 |-- tags: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)
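As a rough illustration of how the Node schema maps onto Scala types, the sketch below mirrors the printed fields in a case class and filters nodes by tag. The names `OsmNode`, `NodeRows`, and `withTag` are hypothetical and not part of openstreetmap_h3; the example works on plain collections rather than on Spark.

```scala
object NodeRows {
  // Field names and types follow the printed Node schema
  // (long -> Long, short -> Short, integer -> Int, double -> Double).
  final case class OsmNode(
    id: Long,
    h33: Short,
    h38: Int,
    latitude: Double,
    longitude: Double,
    tags: Map[String, String]
  )

  // Keep only the nodes whose tag `key` is present and equal to `value`.
  def withTag(nodes: Seq[OsmNode], key: String, value: String): Seq[OsmNode] =
    nodes.filter(_.tags.get(key).contains(value))
}
```

For example, `NodeRows.withTag(nodes, "amenity", "cafe")` returns the nodes tagged `amenity=cafe` and drops untagged nodes.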
Way:
scala> spark.read.parquet("/home/geo/arrow/ways/*.parquet").printSchema
root
 |-- id: long (nullable = true)
 |-- h33: short (nullable = true)
 |-- h38: integer (nullable = true)
 |-- latitude: double (nullable = true)
 |-- longitude: double (nullable = true)
 |-- tags: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)
 |-- pointIdxs: array (nullable = true)
 |    |-- element: long (containsNull = true)
 |-- h33Center: short (nullable = true)
 |-- closed: boolean (nullable = true)
 |-- building: boolean (nullable = true)
 |-- highway: boolean (nullable = true)
 |-- scale: float (nullable = true)
 |-- lineStringWkb: binary (nullable = true)
 |-- bboxWkb: binary (nullable = true)
 |-- h38Indexes: array (nullable = true)
 |    |-- element: integer (containsNull = true)
 |-- bboxMinX: double (nullable = true)
 |-- bboxMaxX: double (nullable = true)
 |-- bboxMinY: double (nullable = true)
 |-- bboxMaxY: double (nullable = true)
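The four `bbox*` columns make it possible to prefilter ways with a cheap bounding-box test before touching `lineStringWkb`. The following is a minimal sketch of that test in plain Scala. It assumes that X is longitude and Y is latitude, which the schema itself does not state, and the `BBox`, `WayRow`, and `waysInWindow` names are hypothetical.

```scala
object WayBBox {
  // Axis-aligned box built from the bboxMinX/bboxMaxX/bboxMinY/bboxMaxY columns.
  // Assumption: X is longitude and Y is latitude.
  final case class BBox(minX: Double, maxX: Double, minY: Double, maxY: Double) {
    // Two boxes intersect when their ranges overlap on both axes
    // (boxes that only touch at an edge count as intersecting).
    def intersects(o: BBox): Boolean =
      minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY
  }

  // Hypothetical reduced view of a way row: id, two flags, and its bounding box.
  final case class WayRow(id: Long, building: Boolean, highway: Boolean, bbox: BBox)

  // Keep the ways whose bounding box overlaps the query window.
  def waysInWindow(ways: Seq[WayRow], window: BBox): Seq[WayRow] =
    ways.filter(_.bbox.intersects(window))
}
```

A bounding-box hit only means the way may cross the window; an exact check still needs the geometry stored in `lineStringWkb`.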
Relation:
scala> spark.read.parquet("/home/geo/arrow/relations/*.parquet").printSchema
root
 |-- id: long (nullable = true)
 |-- tags: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)
 |-- memberId: array (nullable = true)
 |    |-- element: long (containsNull = true)
 |-- memberType: array (nullable = true)
 |    |-- element: byte (containsNull = true)
 |-- memberRole: array (nullable = true)
 |    |-- element: string (containsNull = true)
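The relation schema stores members as three parallel arrays. A minimal sketch of turning them back into one record per member is shown below. It assumes that the arrays are aligned by position, so that the i-th entries of `memberId`, `memberType`, and `memberRole` describe the same member. The `RelationRow`, `Member`, and `members` names are hypothetical, and the meaning of the `memberType` byte values is not decoded here because the schema does not define it.

```scala
object RelationMembers {
  // Mirrors the relation schema: tags plus three parallel member arrays.
  final case class RelationRow(
    id: Long,
    tags: Map[String, String],
    memberId: Seq[Long],
    memberType: Seq[Byte],
    memberRole: Seq[String]
  )

  // One relation member; memberType is kept as the raw byte from the file.
  final case class Member(id: Long, memberType: Byte, role: String)

  // Zip the parallel arrays by position (assumed alignment).
  def members(r: RelationRow): Seq[Member] = {
    require(
      r.memberId.length == r.memberType.length &&
        r.memberId.length == r.memberRole.length,
      "member arrays must have equal length"
    )
    r.memberId.indices.map(i => Member(r.memberId(i), r.memberType(i), r.memberRole(i)))
  }
}
```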
Multipolygon:
scala> spark.read.parquet("/home/geo/arrow/multipolygon.parquet").printSchema
root
 |-- id: long (nullable = true)
 |-- wkb_hex: string (nullable = true)
 |-- tags_json: string (nullable = true)
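The `wkb_hex` column holds a geometry as hex-encoded text. As a hedged sketch, the code below decodes such a string and reads the standard WKB header: one byte-order byte followed by a 4-byte geometry type. Whether the tool writes plain WKB or PostGIS EWKB is an assumption left open here, so the sketch only reports the raw type code and the EWKB SRID flag bit. The `WkbHeader` object and its members are hypothetical names.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object WkbHeader {
  // Leading fields of a WKB/EWKB geometry.
  final case class Header(littleEndian: Boolean, rawType: Long, hasSrid: Boolean)

  // Convert a hex string such as "0106..." into bytes.
  def hexToBytes(hex: String): Array[Byte] = {
    require(hex.length % 2 == 0, "hex string must have an even length")
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
  }

  // Read the byte-order marker and the unsigned 32-bit geometry type code.
  def readHeader(hex: String): Header = {
    val bytes = hexToBytes(hex)
    require(bytes.length >= 5, "WKB needs at least 5 header bytes")
    val littleEndian = bytes(0) == 1 // 1 = little endian, 0 = big endian
    val buf = ByteBuffer
      .wrap(bytes, 1, 4)
      .order(if (littleEndian) ByteOrder.LITTLE_ENDIAN else ByteOrder.BIG_ENDIAN)
    val rawType = buf.getInt & 0xFFFFFFFFL
    // In EWKB, bit 0x20000000 of the type code signals an embedded SRID.
    Header(littleEndian, rawType, (rawType & 0x20000000L) != 0)
  }
}
```

In plain WKB, type code 6 denotes a MultiPolygon, so an empty little-endian multipolygon such as `010600000000000000` decodes to `Header(true, 6, false)`.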
Alternatives to openstreetmap_h3
- Osm2pgsql
- Imposm
- OGR - OGR OSM driver with ogr2ogr
- Osm-parquetizer - transforms PBF into the Big Data friendly Apache Parquet format
- Osm2pgrouting
- Osmium - a fast C++ data processor
- Osmosis - can also import OSM files into a PostgreSQL database with the PostGIS extension