Daily update an OSM XML file
This page describes how to keep a local OSM file up-to-date.
- Please note that there are alternative update strategies. For further details refer to the all-purpose program Osmosis and to the specialized update program osmupdate.
Purpose
Why should you update an OSM file every day? Some applications need an OSM file as input. Of course, you can download a new copy of the file every time you need one, but this could cause a lot of data traffic.
- You should not do this daily update if...
  - you keep every available OSM item in a PostgreSQL database which is kept up-to-date directly with Osmosis,
  - you need minutely updates,
  - you need an up-to-date OSM file only every other month or
  - you need only a very small region, e.g. a city, for which you could easily download its .pbf or .osm.bz2 file every day.
- You should do this daily update in every other case, especially if...
  - you do filtering with osmfilter on a regular basis,
  - you want to minimize data traffic,
  - you want to minimize hard disk space,
  - your computer does not have much main memory or
  - you want to reduce CPU load (e.g., having an Intel Atom or a virtual server).
Prerequisites
Hardware
A weak CPU, e.g. Intel Atom or a virtual Internet server, will suffice. 1 GB RAM is recommended, but 512 MB will be enough if you reduce the required main memory for the program osmconvert a bit (add parameter -h=300).
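The memory advice above can be turned into a small helper. This is only a sketch: the function name osmconvert_mem_option is hypothetical, and the cut-off of 1,000,000 kB is an assumption chosen because machines sold with 1 GB usually report slightly less than that. It reads MemTotal from /proc/meminfo unless a value in kB is passed explicitly.

```shell
# Print the osmconvert memory option suggested above for small machines.
# $1: total RAM in kB (optional; defaults to MemTotal from /proc/meminfo).
# The 1000000 kB cut-off is an assumption of this sketch, not a documented limit.
osmconvert_mem_option() {
  mem_kb=${1:-$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)}
  # Give up quietly if the amount of memory could not be determined.
  [ -n "$mem_kb" ] || return 1
  if [ "$mem_kb" -lt 1000000 ]; then
    echo "-h=300"
  fi
}
```

The result could then be spliced into a call, for example ./osmconvert a.o5m a.osc -B=a.poly $(osmconvert_mem_option) --out-o5m >b.o5m.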
Operating System
We assume that you use Linux as your operating system. However, the software you need is available for Windows too; the commands will differ slightly.
Prepare your System
First, create a new directory for all the files used in the following steps. If you do not have a specific preference, name the new directory "osmupdate" and create it in your home directory:
mkdir ~/osmupdate
cd ~/osmupdate
Get osmconvert Program
This program will be needed: osmconvert. You can download it as a binary; however, it is recommended to download and compile the source code because the binary may be out of date. To download and compile it from the command line, enter this command:
wget -O - http://m.m.i24.cc/osmconvert.c |cc -x c - -lz -o osmconvert
Get a Border Polygon
As soon as you have chosen a geographical region, try to get a border polygon for it. A couple of polygons are available at openstreetmap.org. If there is no suitable polygon, you can create a new one. Alternatively, you can use a bounding box instead. In this case, you will have to replace the "-B=a.poly" in the following commands with e.g. "-b=11,49,12,49.5".
If you want to create a simple border polygon by hand, open a new file with the name a.poly and enter the corner points of the polygon, one point per line, using the following format. For example:
 -1 57
 2 57
 3.5 56.3
 3 55
 -1.2 55
(Be sure to start every line with a space character.)
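If you prefer a bounding box, it can be derived from an existing polygon file. The following is a minimal sketch, assuming that each corner point is a line holding a longitude followed by a latitude (as in the example above) and that the -b= option expects minimum longitude, minimum latitude, maximum longitude, maximum latitude, as the sample value "-b=11,49,12,49.5" suggests. The function name poly_to_bbox is hypothetical.

```shell
# Print a bounding-box option (-b=minlon,minlat,maxlon,maxlat) that encloses
# all corner points of a polygon file. Lines that are not a pair of numbers
# (for example names, section markers or END) are skipped.
# $1: name of the polygon file
poly_to_bbox() {
  awk '
    NF == 2 && $1 ~ /^[-+]?[0-9.]+([eE][-+]?[0-9]+)?$/ &&
               $2 ~ /^[-+]?[0-9.]+([eE][-+]?[0-9]+)?$/ {
      # Compare numerically, but keep the original text of each coordinate.
      if (!seen || $1 + 0 < minlon + 0) minlon = $1
      if (!seen || $1 + 0 > maxlon + 0) maxlon = $1
      if (!seen || $2 + 0 < minlat + 0) minlat = $2
      if (!seen || $2 + 0 > maxlat + 0) maxlat = $2
      seen = 1
    }
    END { if (seen) printf "-b=%s,%s,%s,%s\n", minlon, minlat, maxlon, maxlat }
  ' "$1"
}
```

For the hand-made polygon shown above, poly_to_bbox a.poly prints -b=-1.2,55,3.5,57. Note that a bounding box usually covers more area than the polygon it was derived from.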
In the following example we will download a border polygon for Germany and name it "a.poly":
wget -O a.poly https://trac.openstreetmap.org/export/24667/applications/utils/osm-extract/polygons/germany.poly
Get the OSM XML File
You should try to download an OSM file of the region you chose. Regional OSM files are available through several servers; for a list see Planet. For Germany, there is a file available at geofabrik.de. Each of the following commands will download the file, clip it to the selected region and store it in an .o5m file with the name a.o5m in one run. The first command will be considerably faster because the PBF format is more efficient than .osm for this purpose.
wget -O - https://download.geofabrik.de/osm/europe/germany.osm.pbf |./osmconvert - -B=a.poly --out-o5m >a.o5m
wget -O - https://download.geofabrik.de/osm/europe/germany.osm.bz2 |bunzip2 |./osmconvert - -B=a.poly --out-o5m >a.o5m
If there is no file for your region, choose a larger one which covers your region or choose the whole planet.osm file.
wget -O - http://planet.osm.org/pbf-experimental/planet-latest.osm.pbf | ./osmconvert - -B=a.poly --out-o5m >a.o5m
Although .pbf files are packed originally, we choose to reformat them because using the .o5m format will speed up the processing a bit.
After having downloaded and converted the OSM file, you might have to apply the latest changes by hand. Planet files are usually up to one week old; regional files may be up-to-date, but do not expect them to have been exported at midnight. The germany.osm.pbf, for example, becomes available at about 03:00 or 04:00 today, but it reflects the state of the day before yesterday at about 20:00. Therefore you will have to apply at least the latest two .osc change files to get the regional OSM file up-to-date.
Download the necessary change files from planet.osm.org/daily and put them into your osmupdate directory. Now, unpack all these .osc files (command gunzip) and apply them with osmconvert to your .o5m file. For example:
rm -f b.o5m
./osmconvert a.o5m -B=a.poly a.osc --out-o5m >b.o5m
mv -f b.o5m a.o5m
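To find out which change files you need, it can help to generate their names. The sketch below assumes the naming pattern used by the update script further down this page (start date, a hyphen, end date, then ".osc.gz", all dates in UTC) and requires GNU date. The function names daily_osc_name and last_osc_names are hypothetical.

```shell
# Print the name of the daily change file that ends on the given UTC date.
# $1: end date as YYYY-MM-DD. Requires GNU date (for the -d option).
daily_osc_name() {
  start=$(date -u -d "$1 -1 day" +%Y%m%d) || return 1
  end=$(date -u -d "$1" +%Y%m%d) || return 1
  echo "${start}-${end}.osc.gz"
}

# Print the names of the last N daily change files, oldest first.
# $1: number of files; $2: latest end date (optional; defaults to today, UTC).
last_osc_names() {
  latest=${2:-$(date -u +%Y-%m-%d)}
  i=$(( $1 - 1 ))
  while [ "$i" -ge 0 ]; do
    daily_osc_name "$(date -u -d "$latest -$i day" +%Y-%m-%d)"
    i=$(( i - 1 ))
  done
}
```

For example, last_osc_names 2 2012-03-05 prints 20120303-20120304.osc.gz followed by 20120304-20120305.osc.gz. Whether a file with a given name has actually been published still has to be checked on the server.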
If you do not want to update the .o5m file by hand, you can use the following script instead. It will automatically download and apply the changes of the last 8 change files.
rm -f b.o5m *.osc.gz
OSCFILES=$(wget -O - http://planet.osm.org/daily/ |grep ".osc.gz" |sed s"/<a href=\"/\n/" |sed s"/\">/\n/" |grep -v "<" |grep ".osc" |tail -8 |sed s"/^/ http:\/\/planet.osm.org\/daily\//" |tr -d "\n")
wget $OSCFILES
gunzip *.osc.gz
./osmconvert a.o5m -B=a.poly *.osc --out-o5m >b.o5m
mv -f b.o5m a.o5m
Daily Update
Two steps are necessary to perform the daily update: get the latest .osc file and apply the changes to the existing local .o5m file "a.o5m". The following commands will do this. A log file is written to document every step, and the responsible user will be mailed if anything goes wrong.
Emails can only be sent if the appropriate packages have been installed on your system, which is typically the case on an Internet server. Otherwise, remove the lines which contain the command mail.
Create a new file named osmupdate.sh; use gedit as editor. In case you do not have a graphical environment, use nano or vi, for example. Then enter these contents (replace my_user_name with the name of your account on the machine and my_email@address.com with your actual email address):
#!/bin/bash
cd /home/my_user_name/osmupdate # (insert your user name here)

# rotate log and write new headline
mv upd.log upd.log_temp
tail -10000 upd.log_temp>upd.log
rm upd.log_temp
echo >>upd.log
echo Starting update script >>upd.log
date >>upd.log

# ensure that the last downloaded .osc is not too old
if [ "0"$(stat -L -c"%Y" a.osc 2>/dev/null) -lt $(date -d"yesterday 01:00" +"%s") ]; then
  echo "Error: Did not download yesterday's OSM change file." >>upd.log
  (echo "Did not download yesterday's OSM change file"; ls -lL) |mail -a "From: osmupdate" -s "No .osc file download yesterday" my_email@address.com
  exit
fi

# get the latest .osc file
OSCFILE=$(date -u -d yesterday +%Y%m%d)"-"$(date -u +%Y%m%d)".osc.gz"
echo Latest changefile is $OSCFILE>> upd.log
date >>upd.log
t=0 # time we have waited, in minutes
while [ $t -le 720 ]; do # try for at most 12 hours (720 minutes) to get the .osc file
  rm -f a.osc
  wget -O - http://planet.osm.org/daily/$OSCFILE 2>/dev/null | gunzip >a.osc
  ls -l a.osc >>upd.log
  date >>upd.log
  if [ "0"$(stat -Lc%s a.osc 2>/dev/null) -lt 1000000 ]; then
    echo "OSM change file "$OSCFILE" not yet available; will wait some minutes" >>upd.log
    sleep 1260; t=$(expr $t + 21)
    continue;
  fi
  if (! tail -4 a.osc |grep -c "</osm" >/dev/null); then
    echo "OSM change file "$OSCFILE" without end tag" >>upd.log
    echo "Will wait some hours and hope OSM team will fix it" >>upd.log
    sleep 10860; t=$(expr $t + 181)
    continue;
  fi
  t=0
  break
done
if [ $t -gt 0 ]; then # timeout
  echo "Error: No valid OSM change file "$OSCFILE"" >>upd.log
  (echo "No valid changefile available: "$OSCFILE; ls -lL) |mail -a "From: osmupdate" -s "No .osc file today" my_email@address.com
  exit
fi

# apply the .osc file to the .o5m file
rm -f b.o5m
./osmconvert a.o5m a.osc -B=a.poly --out-o5m 2>>upd.log >b.o5m
date >>upd.log
if [ "0"$(stat -c%s b.o5m 2>/dev/null) -lt 900000 ]; then
  ls -l b.o5m >>upd.log
  echo "Updated .o5m file too small (implausible)" >>upd.log
  exit
fi
./osmconvert -t --out-o5m b.o5m
if [ $? -ne 0 ]; then
  (echo "osmconvert reported an error."; ./osmconvert -t --out-o5m b.o5m 2>&1) |mail -a "From: osmupdate" -s "osmconvert error" my_email@address.com
  exit
fi

# keep the previous file as backup and activate the updated one
mv -f a.o5m aa.o5m
mv -f b.o5m a.o5m
ls -l a.o5m >>upd.log
date >>upd.log
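The two plausibility checks the script applies to a downloaded change file (a minimum size and a closing tag near the end of the file) can also be tried out in isolation. The following is a sketch only; the function name osc_looks_complete is hypothetical, and the default threshold of 1000000 bytes is simply the value used in the script.

```shell
# Return 0 if an unpacked change file looks complete: it must have at least
# min_bytes bytes and a closing "</osm..." tag within its last four lines.
# $1: file name; $2: minimum size in bytes (optional; default 1000000).
osc_looks_complete() {
  min_bytes=${2:-1000000}
  [ -f "$1" ] || return 1
  # The arithmetic expansion strips any padding that wc may print.
  size=$(( $(wc -c <"$1") ))
  [ "$size" -ge "$min_bytes" ] || return 1
  tail -4 "$1" | grep -q "</osm"
}
```

A call such as osc_looks_complete a.osc || echo "a.osc is incomplete" can be useful when applying change files by hand.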
The new file osmupdate.sh must be made executable:
chmod ug+x osmupdate.sh
To get the update job started every day, you should add an entry for it to the cron directory. The file must be written with root privileges, so the output is piped to sudo tee. For example:
echo "45 16 * * * my_user_name /home/my_user_name/osmupdate/osmupdate.sh >/dev/null 2>&1" | sudo tee /etc/cron.d/osmupdate >/dev/null
(You will have to replace my_user_name with the name of your account on the machine, of course.)
Please ensure that the time of day (16:45 in the example) is not close to the time at which the change files are usually generated. Cron interprets the time you enter as local time, whereas the new planet change file is usually made available every night between 01:00 and 06:00 UTC.
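To see which local times the UTC window corresponds to on your machine, GNU date can do the conversion. This is a small sketch; the function name utc_to_local is hypothetical, and the result depends on the time zone configured on your system (or on the TZ environment variable).

```shell
# Print the local time (HH:MM) that corresponds to a time of day given in UTC.
# $1: UTC time as HH:MM. Requires GNU date; the result follows the TZ setting.
utc_to_local() {
  date -d "$1 UTC" +%H:%M
}

echo "Change files usually appear between $(utc_to_local 01:00) and $(utc_to_local 06:00) local time."
```

Pick a cron time that lies well outside the printed window. If your time zone observes daylight saving time, the window shifts by one hour during part of the year.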
Benchmarks
(Please add comments.)