OsmDB.pm
osmDB.pm is a module (in extension to osm.pm) to transfer data extracted from an osm file into a mysql database. It also provides functions to access this data. Of course some things are reinvented here...
Motivation
With the growing size of germany.osm (or osm in general) it is becoming harder by the day to handle all this data. Especially if not enough RAM is present. So I decided to move the data from the osm file (and RAM) to disk. On the one hand this will be convenient because all data is present. On the other hand this will significantly slow down some tasks. Otherwise I always wanted to do something with an sql server (again).
Setup
- Install mysql server
- change root password (maybe create user/password for osm use)
- create osmdb.ini in program folder and put in two lines (user=name AND password=abc)
- create database(s) (tables, indexes etc. will be handled by osmDB.pm)
- Install perl DBI module
- put osmDB.pm in module folder (like osm.pm)
Programs used
- bulkDB.pl (takes osm file and database name and transfers data to database)
- diffDB.pl (takes osc file and database name and updates data in database)
Performance
- germany.osm (11.4GB) could be transferred to a database in 2.5hrs on a (2GHz, 7200rpm) machine. That is a bulk load on indexed tables (ids, keys, values). Import of germany.osc (342MB, one week of changes) takes 1hr.
- Hamburg (208MB) on Atom 1.6GHz: bulk load 6.5mins. Apply diff of 3.3MB size (changes of one week) takes 50secs.
Diff files
To keep your data up to date you can either load a new extract or use two osm files and osmosis to create an osc file. This can be used to update your database. If you are using a planet file these osc files are provided on a regular basis.
Data recall
Single objects
dbConnect ($databaseName) ; my ($refProperties, $refTags) = getDBnode ($nodeId) ; my %properties = %$refProperties ; my @tags = @$refTags dbDisconnect ($databaseName) ;
Loops
dbConnect ($databaseName) ; loopInitNodes(null, null) ; # for all nodes # loopInitNodes("place", "city") ; # only place=city my $id = loopGetNextNode() ; my ($refProperties, $refTags) = getDBNode ($id) ; while (defined $properties) { ... # next $id = loopGetNextNode() ; ($refProperties, $refTags) = getDBNode ($id) ; } dbDisconnect ($databaseName) ;
Tags (key/values)
my ($name) = getTag ($type, $objectId, $key) ; my ($name2) = getTag ("node", 12345, "name") ; my ($ref) = getTag ("way", 12345, "ref") ;
Coordinates for waynodes
my ($lonRef, $latRef) = getDBWayNodesCoords ($wayId) ; # returns refs to hashes (keys = nodeId)
Data manipulation
deleteDBNode ($id) deleteDBWay ($id) deleteDBRelation ($id) storeDBNode ($propRef, $tagRef) storeDBWay ($propRef, $nodeRef, $tagRef) storeDBRelation ($propRef, $memberRef, $tagRef)
Tables
The following tables will be created:
- nodes
- nodetags
- ways
- waytags
- waynodes
- relations
- relationtags
- relationmembers
The ids, keys and values are indexed (k/v only first 12 characters to save time and space)