AND Data/Roadmap-integration-NL
On Saturday 28 and Sunday 29 July the first hacking party was held in Amsterdam. During the party the first integration roadmap was drafted.
This roadmap is open for discussion.
Assumptions
Psychological
The current OSM map is the fruit of a lot of time and effort by dedicated volunteers all over the country (and beyond). We have to keep that in mind when integrating the AND dataset into OSM.
Therefore the people who have contributed to the NL map have to be contacted at an early stage to inform them of our plans, to ask about their “emotional bond” with their contributions, and to ask for their support with the integration of the data. A message on the mailing list will not be sufficient, since not all contributors are members of the NL and/or dev mailing lists.
The integration of the AND data should be completed asap. At the moment the active mappers in NL don't know what to do, since it is not clear what still needs to be mapped. The longer the integration takes, the more mappers could lose interest in the project.
Technical
Roughly speaking, the AND data can be described as complete but out of date, whereas the OSM data can be described as up to date but incomplete. The challenge will be how to integrate these two datasets.
There are two aspects when it comes to the technical part of the integration: what and how.
What
- Basically, we would like to get as much of the AND data as possible into OSM.
- How to map the AND data model onto the OSM data model is discussed on the wiki page AND_Data/AND-tag-mapping-to-OSM.
- This covers not only the roads, but also the coastline, land use and routing info (e.g. turning restrictions).
- The following ways and areas will be subject to migration: motorway(_link), primary(_link), secondary, tertiary, residential, undefined, pedestrian, land use (available for NL are: Airports (National and International), City built-up area, Cemetery, Forest/woodland, Golf course, Island, Industrial area, Ocean/sea, Park/garden, Water boundaries)
- Footways and cycleways will not be touched by the integration, since the AND dataset is incomplete on this matter. There is no proper way to distinguish footways, cycleways and pedestrian ways in the AND data. Therefore these AND ways will be handled as pedestrian ways. A quick scan of the data supports this decision: ways with the tags “RD_TYPE = 6” (other road) and “RD_TONNAGE = -1” are most likely pedestrian.
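The pedestrian heuristic from the last item can be written as a small predicate. This is only a sketch: the field names RD_TYPE and RD_TONNAGE come from the text above, but the dict-based record layout and the function name are assumptions made for illustration, not part of the actual conversion scripts.

```python
# Hypothetical record layout: each AND road is a dict of attribute values.

def is_probably_pedestrian(record: dict) -> bool:
    """True when a record matches 'other road' (RD_TYPE 6) with RD_TONNAGE -1."""
    return record.get("RD_TYPE") == 6 and record.get("RD_TONNAGE") == -1


roads = [
    {"id": 1, "RD_TYPE": 6, "RD_TONNAGE": -1},
    {"id": 2, "RD_TYPE": 6, "RD_TONNAGE": 10},
    {"id": 3, "RD_TYPE": 2, "RD_TONNAGE": -1},
]
pedestrian_ids = [r["id"] for r in roads if is_probably_pedestrian(r)]
print(pedestrian_ids)
```

Both conditions have to hold, so a record with only one of the two values is not treated as pedestrian.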
It would be very helpful if the discussion about how to add the concept of turning restrictions to the OSM data model soon resulted in an agreed solution.
How
Several migration scenarios have been discussed. Even though NL is not a very big country, integrating the AND data by hand was dropped as a feasible procedure at a very early stage. However, it is not clear whether this decision was made because of the huge amount of data, or because the script-kiddies also wanted a challenge ;-) Anyhow, some kind of scripting will be needed. Also, we want to keep the procedure relatively simple and straightforward. All kinds of exceptions will only make errors more likely and (not least) slow down the integration process.
The base strategy for integrating the AND data will be to replace the current OSM data with the AND data and restore (by hand) the OSM data in the places where the AND data is out of date. For most parts of the country this will be a reasonable way to work. However, the area around the city of Assen should be an exception: there the OSM data is far more detailed and more accurate than the AND data. At the moment this is the only area (of substantial size) we know of in which the OSM data should not be replaced with the AND data. This should be verified with the NL mappers.
Migration plan
Prerequisites
- Use rectangular boxes as much as possible to define an area. This will make querying the database easier (= faster).
- Define “the Netherlands” as a set of rectangular boxes. This can be coarse, depending on the actual data at the country borders.
- Define the exception areas in which the OSM data prevails over the AND data as a set of rectangular boxes. OSM ways that cross the borders of these areas will not be deleted, and AND ways that cross the same borders will be imported. This will cause an overlap, which is needed to connect both datasets properly in the aftermath.
- Make a separate database with the converted AND-data
- Make a separate database with the pre-migration version of the OSM-NL database
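The box rules above can be sketched as follows. Everything here is an illustrative assumption: the Box class, the function names and the example coordinates are made up, and a way is classified by its nodes only, so a way whose segment cuts through a box without having a node inside it would be reported as "outside".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in degrees."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def classify_way(nodes, boxes):
    """Return 'inside', 'outside' or 'crossing' for a way given as (lon, lat) nodes."""
    hits = [any(b.contains(lon, lat) for b in boxes) for lon, lat in nodes]
    if all(hits):
        return "inside"
    if not any(hits):
        return "outside"
    return "crossing"


def delete_osm_way(nodes, exception_boxes):
    # OSM ways inside or crossing an exception area are kept.
    return classify_way(nodes, exception_boxes) == "outside"


def import_and_way(nodes, exception_boxes):
    # AND ways are skipped only when they lie entirely inside an exception area.
    return classify_way(nodes, exception_boxes) != "inside"


# Made-up exception box, not the real protected area.
exception_boxes = [Box(6.50, 52.96, 6.62, 53.03)]
way_in = [(6.55, 53.00), (6.56, 53.01)]
way_cross = [(6.55, 53.00), (6.70, 53.00)]
way_out = [(6.70, 53.00), (6.80, 53.00)]
```

A crossing way is kept on the OSM side and imported on the AND side, which gives the overlap that is needed to reconnect the datasets afterwards.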
Migration-day!
- Disable the editing API interface
- Export the NL data to the separate database (as backup). (Back up the informationfreeway tiles too?)
- Delete the ways, segments and nodes that have the above-mentioned tags and lie within the defined NL area but outside the defined exception areas
- Import the AND data that lies outside the defined exception areas
- Enable the editing API interface
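The order of these steps could be captured in a small driver like the one below. It is a sketch under assumptions: the step callables are hypothetical placeholders for whatever the real scripts do, and restoring the backup on failure is one possible rollback, not an agreed procedure.

```python
from typing import Callable

Step = Callable[[], None]


def run_migration(disable_api: Step, backup: Step, delete_old: Step,
                  import_and: Step, restore_backup: Step,
                  enable_api: Step) -> None:
    """Run the migration-day steps in order and always re-enable the API."""
    disable_api()
    try:
        backup()
        try:
            delete_old()
            import_and()
        except Exception:
            # Assumed rollback: put the exported NL data back, then re-raise.
            restore_backup()
            raise
    finally:
        enable_api()


# Demonstration with dummy steps that only record their names.
log = []


def step(name: str, fail: bool = False) -> Step:
    def run() -> None:
        log.append(name)
        if fail:
            raise RuntimeError(name)
    return run


try:
    run_migration(step("disable"), step("backup"), step("delete"),
                  step("import", fail=True), step("restore"), step("enable"))
except RuntimeError:
    pass
print(log)
```

In the demonstration the import step fails on purpose, so the log shows the restore step running before the API is switched back on.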
Aftermath
- Reconnect the ways along the borders of the NL-area (with Belgium and Germany) and along the borders of the exception areas.
- Check the new NL map and restore OSM data if needed. The separate AND database and the OSM backup can be used. (It would be handy to have a rendered OSM backup too, like informationfreeway now, for easy comparison, especially if it is easy to switch back and forth.)
- Uncork the champagne bottle.
Of course, we mustn't forget to issue a press release once the new map is also visible in the Mapnik and Osmarender tile sets.
Programming tools
The scripts for managing this process can be found in: http://svn.openstreetmap.org/applications/utils/import/and_import/
The conversion programs can be found in: http://svn.openstreetmap.org/applications/utils/import/and2osm/
Wish-list features in JOSM
- Merging tags between layers (e.g.: select a way in layer1 and layer2, merge tags of the ways)
- Copying selected elements from one layer to another
Existing data to remove in favour of AND
Data that is to be removed from OSM and replaced with AND data:
- roads (highway=motorway|motorway_link|trunk|trunk_link|primary|primary_link|secondary|tertiary|unclassified|residential|pedestrian|service)
- railway (railway=rail|light_rail)
- places (place=*)
- natural=coastline|water
- waterway=riverbank|canal|river
In addition the following will be removed as part of the process:
- Unwayed segments
- Nodes that have no useful tags
All segments not needed to support ways will be deleted irrespective of their tags.
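The tag list above can be expressed as a lookup table. The sketch below only covers the tag-based selection; the table and function names are illustrative assumptions, and the segment and node clean-up rules are not modelled.

```python
# Keys and values copied from the list above; None stands for "any value" (place=*).
REPLACED_BY_AND = {
    "highway": {"motorway", "motorway_link", "trunk", "trunk_link", "primary",
                "primary_link", "secondary", "tertiary", "unclassified",
                "residential", "pedestrian", "service"},
    "railway": {"rail", "light_rail"},
    "place": None,
    "natural": {"coastline", "water"},
    "waterway": {"riverbank", "canal", "river"},
}


def is_replaced_by_and(tags: dict) -> bool:
    """True when an OSM object carries at least one tag from the removal list."""
    for key, values in REPLACED_BY_AND.items():
        if key in tags and (values is None or tags[key] in values):
            return True
    return False


print(is_replaced_by_and({"highway": "residential", "name": "Dorpsstraat"}))
print(is_replaced_by_and({"highway": "cycleway"}))
```

With this table a cycleway or an amenity is left alone, because neither appears in the removal list.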
Protected areas
There are a number of people who feel that their areas are more complete than the data provided by AND. For these people we have the concept of a "protected area". This is basically an area where no changes will take place as part of the import. They are indicated in the OSM server as an area (closed way) with the tag "nl:protected_and_import=yes". This means any nodes/segments/ways completely within the area will be unchanged, however:
- Any segments that cross from the protected area to an unprotected area will be deleted.
- Any ways that cross from the protected area to an unprotected area will be truncated at the last node within the area.
Since there will be roads crossing these boundaries, it is *strongly* recommended that you insert a new node into each of those roads just before it crosses the boundary, to preserve as much of your data as possible.
The boundaries themselves will be preserved so that later during merging the boundary between old and new data will be clear. Please use them sparingly (not for trivial features) and make sure you include the ID of the way next to your name below so we can be sure we found it.
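To see why the extra node matters, the truncation rule can be sketched as below. This is an illustration, not the import script: it assumes the way is ordered so that it starts inside the protected area, treats coordinates as plain (lon, lat) pairs, and uses a standard ray-casting point-in-polygon test.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses the horizontal ray.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def truncate_way(nodes, polygon):
    """Keep the leading nodes of a way up to the last one inside the polygon."""
    kept = []
    for lon, lat in nodes:
        if not point_in_polygon(lon, lat, polygon):
            break
        kept.append((lon, lat))
    return kept


protected = [(0, 0), (4, 0), (4, 4), (0, 4)]      # made-up square area
sparse_way = [(1, 1), (6, 1)]                     # no node near the boundary
dense_way = [(1, 1), (3.9, 1), (6, 1)]            # extra node just inside
print(truncate_way(sparse_way, protected))
print(truncate_way(dense_way, protected))
```

The sparse way is cut back to a single node, while the way with an extra node just inside the boundary keeps almost its full length within the protected area.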
What to do with current data
A lot of mapping has already been done in NL. We would like to know how the mappers would like their contributions to be treated during the integration process.
This is the place to state how your contributions should be handled. If nothing is mentioned here, we presume that the data can be replaced with AND's. If you would like your data not to be replaced by AND data, you need to mark in the system exactly what you want protected.
To do this create an area around what you want to protect and mark the way with the tag "nl:protect_and_import=yes" and record the ID of the way below next to your name so we know how to find it. If possible also include a note with your userid. To understand how to protect the area, look above at the section relating to protected areas.
!Note! The uploaded gpx tracks will not be deleted by the integration process.
Mapper | Area | What to do with it |
---|---|---|
User:Annemieke | Uithoorn | replace with AND data |
User:Ante | Amsterdam, Rotterdam, Den Haag | replace with AND data |
User:ZoranKovacevic | Amsterdam | replace with AND data |
User:Osm@floris.nu | Heiloo and a lot of N-ways in NH | replace with AND data |
User:Toffehoff | Assen | Keep OSM data - integration by hand (5134201) |
User:Toffehoff | Everything but Assen | replace with AND data |
User:Myckel | Leiden | replace with AND data, I'll keep also a local copy of OSM data |
User:Myckel | Valkenburg aan de Geul (e.o.) | replace with AND data, I'll keep also a local copy of OSM data |
User:Rubke | Maastricht / Zuid-Limburg | replace with AND data, I'll keep also a local copy of OSM data |
User:ChristW | Eindhoven, small parts of other towns, usually cycling or hiking | replace with AND data |
User:Jaapandre | Losser | replace with AND data |
User:IvarClemens | Boxtel | replace with AND data and move existing data to a .osm for manual integration |
User:Freek | Nuenen | keep OSM data, will merge manually where necessary. Polygon way id: 5111828 |
User:Freek | TU/e | small area, but OSM data is very accurate, will merge manually |
User:Fopper | Enschede (south-east) | replace with AND data (data of north-west is not mine) |
User:Marc | Susteren | replace with AND data |
User:Marc | Echt | replace with AND data |
User:Remco | Bussum, Hilversum, Huizen, etc ('t Gooi) | replace with AND data |
User:Paulb | Groningen | replace with AND data, please provide an easy way to merge (parts of) osm-data |
User:Mdeen | Helden | replace with AND data |
User:Kleptog | Delft | replace with AND data |
User:Joove | Oegstgeest | replace with AND data |
User:Rduivenvoorde | Haarlem | replace with AND data |
User:Kvangend | Eindhoven Zuid | replace with AND data |
User:Ervano | Bladel Hapert Eersel Valkenswaard Geldrop | replace with AND data |
... | ... | ... |
Borders, coastlines, rivers
See: AND_Data/Borders
TODO's
- User:Toffehoff - mail contributors of the NL-map (asap)
- User:ZoranKovacevic - Coordinate hardware plan: getting sufficient hardware resources for testing purposes
- User:JeroenDekkers - Check with XS4ALL for hardware sponsoring
- User:Toffehoff - Check with @Home for hardware sponsoring
- User:Osm@floris.nu - Optimizing conversion script (together with User:Marc)
- User:JeroenDekkers - Set up a testing environment
Hackingparty August 18th 2007
Location: AND Automotive Navigation Data (AND) in Rotterdam. Goal: successful migration test run. More info here
The following should be tested during this hackingparty:
- Conversion: check whether the conversion follows AND_Data/AND-tag-mapping-to-OSM.
- Remove current OSM NL data: pay special attention to the country borders and exception areas. Also check that only the intended data is removed (e.g. amenities should not be removed).
- Import new data: check that no data is imported into the exception areas.
- Connecting the borders: is there enough overlap between the current OSM data and the new AND data?
Available servers:
- AND: Test-server with the latest planet.osm installed for actual testing purposes
- osm.nl: server with complete converted AND-data set
- Henk: has a computer with him that can also be used as osm-server (if needed)
At the end of the day we should have a detailed migration script with a rollback scenario in case something unforeseen goes wrong during the migration.