User:JamesBadger/Osm2pgsql/benchmarks
Importing the OSM data set using osm2pgsql. See Osm2pgsql/benchmarks.
Cloud Instance Import
Configuration
OpenStack IceHouse instance.
General settings
Setting | Value |
---|---|
Created | Aug 27 11:27 MDT |
Flavor | m1.xxlarge |
RAM | 32GB |
VCPUs | 8 |
Disk | 20GB |
OS | Ubuntu Server 14.04.1 LTS 64-bit |
Postgres Version | 9.3.5 |
osm2pgsql version | 901a996 (May 28 2014) |
Volumes | 32 GB at /planet , 200 GB at /work
|
Planetfile | PBF - 2014/08/20 |
OpenStack Glance volumes are mounted with ZFS on Linux. Default ZFS compression is enabled for /work
. ZFS recordsize=8k and primarycache=metadata for the work/postgresql
filesystem. PostgreSQL has been configured to use that as its data directory.
ZFS Pools and Filesystems
32 GB pool 'planet'
Name | Used by Dataset | Available | Compression Ratio | Mountpoint |
---|---|---|---|---|
planet | 25.5G | 5.73G | 1.00x | /planet |
200 GB pool 'work'
Name | Used by Dataset | Available | Compression Ratio | Mountpoint | Properties |
---|---|---|---|---|---|
work | 32K | 196G | 1.97x | /work | compression=lz4, atime=off |
work/log | 31K | 196G | 1.00x | /work/log | inherit |
work/log/postgresql | 30K | 196G | 1.00x | /work/log/postgresql | inherit |
work/postgresql | 41.8M | 196G | 2.76x | /work/postgresql | recordsize=8k, primarycache=metadata |
PostgreSQL configuration
Setting | Value |
---|---|
shared_buffers | 8MB |
temp_buffers | 64MB |
work_mem | 1MB |
maintenance_work_mem | 4096MB |
effective_io_concurrency | 1 |
fsync | off |
synchronous_commit | off |
full_page_writes | off |
checkpoint_segments | 20 |
checkpoint_completion_target | 0.9 |
random_page_cost | 3.0 |
effective_cache_size | 18GB |
log_destination | csvlog |
logging_collector | on |
log_directory | '/work/log/postgresql' |
log_filename | 'postgresql-%Y-%m-%d_%H%M%S.log' |
log_rotation_age | 1d |
log_rotation_size | 1GB |
log_min_duration_statement | 250ms |
log_checkpoints | on |
log_connections | on |
log_disconnections | on |
log_duration | on |
log_line_prefix | '%t [%p]:[%l] user=%u, db=%d ' |
log_lock_waits | on |
log_temp_files | 0 |
autovacuum | off |
autovacuum_vacuum_scale_factor | 0.04 |
autovacuum_analyze_scale_factor | 0.02 |
Run
$ sudo -u postgres -i postgres$ time osm2pgsql --slim -d gis -C 20000 --flat-nodes /work/nodes.cache --number-processes 6 /planet/planet-140827.osm.pbf osm2pgsql SVN version 0.85.0 (64bit id space) Using projection SRS 900913 (Spherical Mercator) Setting up table: planet_osm_point NOTICE: table "planet_osm_point" does not exist, skipping NOTICE: table "planet_osm_point_tmp" does not exist, skipping Setting up table: planet_osm_line NOTICE: table "planet_osm_line" does not exist, skipping NOTICE: table "planet_osm_line_tmp" does not exist, skipping Setting up table: planet_osm_polygon NOTICE: table "planet_osm_polygon" does not exist, skipping NOTICE: table "planet_osm_polygon_tmp" does not exist, skipping Setting up table: planet_osm_roads NOTICE: table "planet_osm_roads" does not exist, skipping NOTICE: table "planet_osm_roads_tmp" does not exist, skipping Using built-in tag processing pipeline Allocating memory for dense node cache Allocating dense node cache in one big chunk Allocating memory for sparse node cache Sharing dense sparse Node-cache: cache=20000MB, maxblocks=2560000*8192, allocation method=11 Mid: loading persistent node cache from /work/nodes.cache Allocated space for persistent node cache file Maximum node in persistent node cache: 0 Mid: pgsql, scale=100 cache=20000 Setting up table: planet_osm_nodes NOTICE: table "planet_osm_nodes" does not exist, skipping Setting up table: planet_osm_ways NOTICE: table "planet_osm_ways" does not exist, skipping Setting up table: planet_osm_rels NOTICE: table "planet_osm_rels" does not exist, skipping Reading in file: /planet/planet-140827.osm.pbf Processing: Node(2496510k 498.2k/s) Way(249511k 10.06k/s) Relation(708550 31.34/s) Standard exception processing way_id 1382119: TopologyException: side location conflict at 1189651.77 8876718.8800000008 Processing: Node(2496510k 498.2k/s) Way(249511k 10.06k/s) Relation(2442080 33.79/s) Standard exception processing way_id 3631463: TopologyException: side location conflict at 1906862.01 6654069.9800000004 Processing: Node(2496510k 498.2k/s) Way(249511k 10.06k/s) Relation(2552370 34.24/s) Standard exception processing way_id 3757788: TopologyException: side location conflict at 1322274.0800000001 5383303.8399999999 Processing: Node(2496510k 498.2k/s) Way(249511k 10.06k/s) Relation(2672440 34.37/s) Standard exception processing way_id 3890452: TopologyException: side location conflict at 1038877.48 5779728.6200000001 Processing: Node(2496510k 498.2k/s) Way(249511k 10.06k/s) Relation(2765210 34.59/s) parse time: 109760s Node stats: total(2496510292), max(3044821862) in 5011s Way stats: total(249511922), max(300402050) in 24800s Relation stats: total(2765219), max(3994576) in 79949s Committing transaction for planet_osm_point Committing transaction for planet_osm_line Committing transaction for planet_osm_polygon Committing transaction for planet_osm_roads Going over pending ways... Maximum node in persistent node cache: 3045064703 160218658 ways are pending Using 6 helper-processes Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Helper process 4 out of 6 initialised Helper process 5 out of 6 initialised Helper process 1 out of 6 initialised Helper process 2 out of 6 initialised Helper process 3 out of 6 initialised Helper process 0 out of 6 initialised Process 5 finished processing 26703109 ways in 76486 sec Process 0 finished processing 26703110 ways in 76486 sec Process 4 finished processing 26703109 ways in 76486 sec Process 1 finished processing 26703110 ways in 76486 sec Process 2 finished processing 26703110 ways in 76486 sec Process 3 finished processing 26703110 ways in 76486 sec Maximum node in persistent node cache: 3045064703 Maximum node in persistent node cache: 3045064703 Maximum node in persistent node cache: 3045064703 Maximum node in persistent node cache: 3045064703 Maximum node in persistent node cache: 3045064703 All child processes exited 160218658 Pending ways took 76491s at a rate of 2094.61/s Committing transaction for planet_osm_point Committing transaction for planet_osm_line Committing transaction for planet_osm_polygon Committing transaction for planet_osm_roads Going over pending relations... Maximum node in persistent node cache: 3045064703 0 relations are pending Using 6 helper-processes Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Mid: loading persistent node cache from /work/nodes.cache Maximum node in persistent node cache: 3045064703 Process 4 finished processing 0 relations in 13 sec Maximum node in persistent node cache: 3045064703 Process 5 finished processing 0 relations in 14 sec Maximum node in persistent node cache: 3045064703 Process 3 finished processing 0 relations in 14 sec Maximum node in persistent node cache: 3045064703 Process 0 finished processing 0 relations in 14 sec Process 1 finished processing 0 relations in 14 sec Maximum node in persistent node cache: 3045064703 Process 2 finished processing 0 relations in 14 sec Maximum node in persistent node cache: 3045064703 All child processes exited 0 Pending relations took 15s at a rate of 0.00/s node cache: stored: 2343195456(93.86%), storage efficiency: 89.39% (dense blocks: 2387255, sparse nodes: 88445625), hit rate: 92.85% Sorting data and creating indexes for planet_osm_line Sorting data and creating indexes for planet_osm_point Sorting data and creating indexes for planet_osm_polygon Sorting data and creating indexes for planet_osm_roads Maximum node in persistent node cache: 3045064703 Stopping table: planet_osm_nodes Stopped table: planet_osm_nodes in 0s Stopping table: planet_osm_rels Stopping table: planet_osm_ways Building index on table: planet_osm_rels (fastupdate=off) Building index on table: planet_osm_ways (fastupdate=off) Analyzing planet_osm_point finished Analyzing planet_osm_roads finished Analyzing planet_osm_polygon finished Analyzing planet_osm_line finished Stopped table: planet_osm_rels in 2256s Copying planet_osm_point to cluster by geometry finished Creating geometry index on planet_osm_point Copying planet_osm_roads to cluster by geometry finished Creating geometry index on planet_osm_roads Creating osm_id index on planet_osm_roads Creating osm_id index on planet_osm_point Creating indexes on planet_osm_roads finished All indexes on planet_osm_roads created in 31410s Completed planet_osm_roads Creating indexes on planet_osm_point finished All indexes on planet_osm_point created in 35766s Completed planet_osm_point Copying planet_osm_line to cluster by geometry finished Creating geometry index on planet_osm_line Creating osm_id index on planet_osm_line Creating indexes on planet_osm_line finished All indexes on planet_osm_line created in 157911s Completed planet_osm_line Copying planet_osm_polygon to cluster by geometry finished Creating geometry index on planet_osm_polygon Creating osm_id index on planet_osm_polygon Creating indexes on planet_osm_polygon finished All indexes on planet_osm_polygon created in 232064s Completed planet_osm_polygon Stopped table: planet_osm_ways in 265753s Osm2pgsql took 452331s overall real 7538m51.173s user 2426m23.276s sys 262m43.906s
7538 minutes is 5 days, 5 hours, and 38 minutes.
I estimate the bottleneck was disk I/O, which is network based on this OpenStack cloud.
Post-Run Data Usage
These charts were made using PNP4Nagios and NCSA scripts running on the host to push data to a remote Nagios instance. NCSA scripts were run every 10-15 minutes. Note that the "Free Space" charts are inverted and actually display space usage on the filesystem.
Post-Run Info
This server was terminated after being unable to properly vacuum the database. A "VACUUM ANALYZE VERBOSE" command would not finish, even after taking a day to run. This is likely due to the extremely limited disk IO.
Intel Server Import
Configuration
Circa-2010 Intel Server.
General settings
Setting | Value |
---|---|
RAM | 24 GB DDR3 SDRAM |
Motherboard | Intel S5520SCR |
Chassis | Intel SC5600BRP |
CPUs | 1 x Xeon E5530 2.4GHz |
Cores | 8 |
RAID Card | Intel SRCSASBB8I |
Disk | 2x300 GB 15000 RPM Disks in Hardware RAID 1 (SAS)
4x2000 GB 5400-7200 RPM WD RED Disk as individual RAID 0 drives (SATA), then used to build a RAID 10 ZFS Pool |
OS | Ubuntu Server 14.04.1 LTS 64-bit |
Postgres Version | 9.3.5 |
osm2pgsql version | 901a996 (May 28 2014) |
Planetfile | PBF - 2014/09/03 |
PostgreSQL data directory at /var/lib/postgresql
on RAID 1. PostgreSQL logs at /work/log/postgresql
on ZFS Pool. Node cache for osm2pgsql at /opt/osm/nodes.cache
on RAID 1. Planet file at /work/planet/planet-140903.osm.pbf
on ZFS Pool. Compression enabled on ZFS Pool.
PostgreSQL configuration
Setting | Value |
---|---|
shared_buffers | 8MB |
temp_buffers | 64MB |
work_mem | 48MB |
maintenance_work_mem | 4096MB |
effective_io_concurrency | 2 |
fsync | off |
synchronous_commit | off |
full_page_writes | off |
checkpoint_segments | 20 |
checkpoint_completion_target | 0.9 |
effective_cache_size | 16GB |
autovacuum | on |
autovacuum_vacuum_scale_factor | 0.04 |
autovacuum_analyze_scale_factor | 0.02 |
Autovacuum is enabled because I do not want to have to run a manual vacuum afterwards that will take who know how long.
Run
Note that only data above 45˚ N is imported on this run; that is the area of interest for our implementation. Hstore is also enabled.
$ sudo -u postgres -i $ service postgresql start $ createuser osm $ createdb -E UTF8 -O osm gis $ psql -f /usr/share/postgresql/9.3/contrib/postgis-2.1/postgis.sql -d gis $ psql -d gis -c "ALTER TABLE geometry_columns OWNER TO osm; ALTER TABLE spatial_ref_sys OWNER TO osm;" $ psql -d gis -c "CREATE EXTENSION hstore;" $ psql -f /usr/share/osm2pgsql/osm2pgsql/900913.sql -d gis $ time osm2pgsql --slim -d gis -C 16000 --number-processes 8 --bbox -180,45,180,90 --hstore --flat-nodes /opt/osm/nodes.cache /work/planet/planet-140903.osm.pbf
Post-Run Data Usage
TDB
Data Update Statistics
Info on updating the DB to newer versions using changesets. TBD.