HowTo minutely hstore

From OpenStreetMap Wiki

This article or section may contain out-of-date information: this article dates from January 2011 and is very outdated. For more recent information please visit https://switch2osm.org/serving-tiles/updating-as-people-edit/
If you know about the current state of affairs, please help keep everyone informed by updating this information. (Discussion)

Minutely Mapnik is the concept of keeping your PostgreSQL database in sync with changes made on the OpenStreetMap server. Changes made by OpenStreetMap user accounts are recorded in "replication files" every minute, every hour and every day. The Osmosis Java application automates downloading the "replication files" needed to bring your database up to date. Osm2pgsql or Osmosis is then used to import those changes into your PostgreSQL database.


The procedure for keeping your OpenStreetMap database in sync with the OpenStreetMap servers was changed on January 7, 2010. This page reflects the procedures necessary after that date. To view the old procedure please see Minutely_Mapnik_Pre2010.

Requirements

To use these "replication files" you will need the tools Osmosis and Osm2pgsql. You should already have Osm2pgsql, because it was used for the initial import of OpenStreetMap data into your PostgreSQL database. It is a good idea to make sure you have the latest version of both tools. You will also need to know the date and time of the snapshot of OpenStreetMap data that you originally imported. You can find this timestamp on the second line of the planet.osm file: look for the attribute "timestamp". The last requirement is that you imported your original data with the --slim argument of Osm2pgsql.
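As a minimal sketch of the timestamp lookup described above, the helper below prints the timestamp attribute from the second line of an uncompressed OSM XML file. The function name planet_timestamp is made up for illustration, and the sketch assumes the attribute is written as timestamp="..." on that line.

```shell
# Hypothetical helper: print the timestamp attribute found on the
# second line of an uncompressed OSM XML file (for example planet.osm).
planet_timestamp() {
  # First sed keeps only line 2; second sed extracts the attribute value.
  sed -n '2p' "$1" | sed -n 's/.*timestamp="\([^"]*\)".*/\1/p'
}
```

Usage: planet_timestamp planet.osm. For a compressed planet file, decompress it to a pipe first and adapt the helper to read standard input.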

Initial Setup

Initializing

We are first going to specify an environment variable to contain the path to the working directory Osmosis will use.

export WORKDIR_OSM=$HOME/.osmosis

If you are using a different environment or do not wish to use an environment variable simply replace $WORKDIR_OSM in the statements below with a full path.

We first need to set up your system to work with "replication files". Osmosis has built-in functionality to do this for you:

mkdir -p "$WORKDIR_OSM"
osmosis --read-replication-interval-init workingDirectory="$WORKDIR_OSM"

The workingDirectory argument above specifies the directory in which Osmosis creates the support files it needs to keep your data up to date. In this example it is a hidden directory in the user's home folder. Once everything is set up and working you normally do not need to access these support files again, so this location keeps things tidy.

After the above line is executed the workingDirectory you specified will be populated with two files: configuration.txt and download.lock. The configuration.txt will be explained below. The download.lock file is used to make sure there is only one task at a time trying to get "replication files".
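If the setup is scripted, it can help to run the initialization only when the working directory has not been populated yet. The following is a sketch under that assumption; init_replication is a hypothetical name, and it treats the presence of configuration.txt as the sign that initialization already happened.

```shell
# Hypothetical wrapper: initialise the Osmosis working directory only once.
init_replication() {
  workdir=${WORKDIR_OSM:-$HOME/.osmosis}
  mkdir -p "$workdir" || return 1
  if [ -f "$workdir/configuration.txt" ]; then
    # configuration.txt is created by the init task, so skip it this time.
    echo "already initialised: $workdir"
  else
    osmosis --read-replication-interval-init workingDirectory="$workdir"
  fi
}
```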

Specifying Replication Interval

We now need to specify how far back in the past to start collecting changes to the OpenStreetMap server data. This is where you need the date and time of the snapshot you used for your original import. There are a few ways to set this starting point from your snapshot timestamp, all of which result in a file named state.txt being created in the $WORKDIR_OSM location. NOTE: You must specify the date and time in UTC, and you should subtract an hour or two from your timestamp to make sure you do not miss any changes. It is no problem to specify a date and time earlier than the actual timestamp of your initial data import: Osmosis will download more files than strictly necessary, but nothing breaks.
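The safety margin can be computed rather than worked out by hand. This sketch assumes GNU date (for the -d option and @epoch input); rewind_timestamp is a hypothetical helper name. It converts the timestamp to epoch seconds first, which avoids ambiguous relative-time parsing.

```shell
# Hypothetical helper: print an ISO 8601 UTC timestamp shifted back by
# a number of hours (default 2). Assumes GNU date.
rewind_timestamp() {
  ts=$1
  hours=${2:-2}
  # Convert to seconds since the epoch, then subtract the margin.
  epoch=$(date -u -d "$ts" +%s) || return 1
  date -u -d "@$((epoch - hours * 3600))" +%Y-%m-%dT%H:%M:%SZ
}
```

For example, rewind_timestamp 2011-01-05T01:10:02Z 2 gives the time two hours before the snapshot, which is the value to enter when creating state.txt.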

state.txt with Browser

You can use Peter Körner's website tool (source here). After entering the date and time on that page click on "fetch state-file" and create a file named state.txt in the $WORKDIR_OSM location with the contents displayed on that page.
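After saving the file it is worth checking that it holds the point in time you expect. The sketch below assumes that state.txt is in Java properties form, with a timestamp= line whose colons are escaped as "\:"; state_timestamp is a hypothetical helper name.

```shell
# Hypothetical helper: print the timestamp stored in a state.txt file,
# with the "\:" escapes of the properties format turned back into ":".
state_timestamp() {
  sed -n 's/^timestamp=//p' "$1" | sed 's/\\:/:/g'
}
```

Usage: state_timestamp "$WORKDIR_OSM/state.txt". The printed value should be at or before the timestamp of your original import.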

Fix replication download location

At the time of the license change (summer 2012) the planet website was restructured, so if you are using an old version of Osmosis you will need to update the download location. Make sure that baseUrl in $WORKDIR_OSM/configuration.txt is set to https://planet.openstreetmap.org/replication/minute/, https://planet.openstreetmap.org/replication/hour/ or https://planet.openstreetmap.org/replication/day/. The change from http to https was made in January 2018 and, when this paragraph was written, was not yet supported by Osmosis. The "openstreetmap-tiles-update-expire" script that the switch2osm instructions use also makes this change.

Caution: since 2018-05-07 replication works only over https. Osmosis version 0.44.1 supports https; older versions have not been tested.[1]

[1] https://lists.openstreetmap.org/pipermail/dev/2018-April/030193.html
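A configuration file written before the switch can be updated in place. This is a minimal sketch, assuming the file contains a line of the form baseUrl=http://...; use_https_base_url is a hypothetical helper name, and it writes through a temporary file so that it does not depend on a particular sed -i syntax.

```shell
# Hypothetical helper: switch an http:// baseUrl in configuration.txt
# to https://, leaving every other line untouched.
use_https_base_url() {
  conf=$1
  sed 's|^\(baseUrl[[:space:]]*=[[:space:]]*\)http://|\1https://|' "$conf" > "$conf.tmp" &&
    mv "$conf.tmp" "$conf"
}
```

Usage: use_https_base_url "$WORKDIR_OSM/configuration.txt".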

Choose replication file interval

By default, Osmosis fetches minutely replication diffs, and at most one hour's worth of them per run. If you want something else, edit $WORKDIR_OSM/configuration.txt and change "replication/minute/" in the baseUrl to "replication/hour/" or "replication/day/". If you have a lot to catch up on, you can also raise maxInterval above its default of 3600 seconds. This value controls how much data Osmosis will download in one run.
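Both settings can be changed with one small script instead of a text editor. The sketch below assumes that configuration.txt contains a baseUrl= line ending in replication/minute, replication/hour or replication/day and a maxInterval line; set_replication_interval is a hypothetical helper name.

```shell
# Hypothetical helper: choose the replication interval and maxInterval.
# $1 = configuration file, $2 = minute|hour|day, $3 = maxInterval in seconds
set_replication_interval() {
  conf=$1
  interval=$2
  max=$3
  # First expression swaps the last path segment of baseUrl,
  # second expression replaces the maxInterval value.
  sed -e "s|^\(baseUrl[[:space:]]*=.*/replication/\)[a-z]*/*[[:space:]]*\$|\1${interval}/|" \
      -e "s|^maxInterval[[:space:]]*=.*|maxInterval = ${max}|" \
      "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}
```

For example, set_replication_interval "$WORKDIR_OSM/configuration.txt" hour 86400 switches to hourly diffs and allows up to one day of data per run.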

Acquire Replication Data

You are now ready to acquire and import the changes to the data that have happened on the OpenStreetMap server since the timestamp of your original data.

osmosis --read-replication-interval workingDirectory=$WORKDIR_OSM --simplify-change --write-xml-change changes.osc.gz

The --simplify-change argument tells Osmosis to include only the last change made to each "feature" in changes.osc.gz. Without this argument, a "feature" that was modified more than once in the downloaded "replication files" may be duplicated in your database.

Import Replication Data using Osm2Pgsql

The changes.osc.gz file can now be imported into your database. You will need to supply the arguments that match your own setup. You may be able to reuse the arguments from your initial import, but make sure to add the --append argument and to specify the changes.osc.gz file name. Without the --append argument your OpenStreetMap database will be cleared of data before the changes are imported.

osm2pgsql --append [my customized arguments] changes.osc.gz

Alternatively you can combine the previous two steps into one by piping the output of osmosis to osm2pgsql:

osmosis --read-replication-interval workingDirectory=$WORKDIR_OSM --simplify-change \
 --write-xml-change - | osm2pgsql --append [my customized arguments] -
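When these steps are run unattended, a failed import can leave the replication state ahead of the database. The following sketch of one update cycle keeps a copy of state.txt and restores it if either step fails, on the assumption that Osmosis advances state.txt when it downloads an interval. The names update_once and OSM2PGSQL_ARGS are hypothetical; OSM2PGSQL_ARGS stands for your own customized arguments.

```shell
# Sketch of a single update cycle with rollback of the replication state.
# OSM2PGSQL_ARGS is a hypothetical variable holding your own osm2pgsql
# options (for example the ones used for the initial import).
update_once() {
  workdir=${WORKDIR_OSM:-$HOME/.osmosis}
  changes=$workdir/changes.osc.gz

  # Keep a copy of the state so a failed run can be repeated.
  cp "$workdir/state.txt" "$workdir/state.txt.bak" || return 1

  # OSM2PGSQL_ARGS is left unquoted on purpose so it splits into options.
  if osmosis --read-replication-interval workingDirectory="$workdir" \
       --simplify-change --write-xml-change "$changes" &&
     osm2pgsql --append $OSM2PGSQL_ARGS "$changes"
  then
    rm -f "$workdir/state.txt.bak"
  else
    # Roll back so the next run downloads the same interval again.
    mv "$workdir/state.txt.bak" "$workdir/state.txt"
    return 1
  fi
}
```

The sketch writes the changes to a file rather than a pipe so that the exit status of each tool can be checked separately.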

Import Replication Data using Osmosis

If you are using only Osmosis, then you can import the data with

osmosis --read-xml-change file="changes.osc.gz" --write-pgsql-change user="<USER>" database="<DATABASE>" password="<PASSWORD>"

Potential Problems

Increasing Database Size

Importing into a database that contains an extract of OSM data rather than the whole planet (created with the osm2pgsql --bbox parameter) can cause the size of the database to increase significantly. The solution is to remove any ways and relations that refer only to nodes that are not in the database, as described here.

See also