Import/LINZ Topo50 Continuation

LINZ Main Page	LINZ Building Import	LINZ Address Import	Missing Streets	LINZ Place Name Import

LINZ Data Import

Author:

Kylenz

License:

MIT License

Platform:

Web

Status:

Active

Version:

1.0.0 (2021-03-09)

Language:

multiple languages

Website:

https://osm-nz.github.io/RapiD

Source code:

osm-nz/linz-address-import (GitHub)

Programming language:

TypeScript

Modification of RapiD to compare and update OSM data based on LINZ data

Features

Feature	Value
Map Display
Display map	yes
Map data	?
Source	?
Rotate map	?
3D view	?
Shows website	?
Shows phone number	?
Shows operation hours	?
Routing ?
Navigating ?
Tracking ?
Monitoring
Monitoring	?
Show current track	?
Open existing track	yes
Altitude diagram	?
Show POD value	?
Satellite view	?
Show live NMEA data	?
Show speed	?
Send current position	?
Editing
Add POIs	yes
Edit / Delete POIs	yes
Add way	yes
Edit geometries	yes
Edit arbitrary tags of existing OSM objects	yes
Edit relations	yes
View notes	?
Create notes	?
Edit notes	?
Work offline	?
Support imagery offset DB	?
Upload to OSM	yes
Rendering ?
Accessibility ?

Background

Data from LINZ's Topo50 maps was imported into OSM between 2009 and 2016. Not all the datasets were imported during this time. This page documents the process used to import data since 2021. The main wiki page contains details about the tagging and the current status.

Source data & source code

Source code for the backend (data processing part): osm-nz/linz-address-import (GitHub) and osm-nz/place-name-conflation (GitHub).
Source code for the frontend (fork of RapiD): osm-nz/RapiD (GitHub).

This project is just a small modification of the LINZ Address Import system. Most of the code is the same.

How it works

Every imported feature has the tag ref:linz:topo50_id=* or ref:linz:place_id=*, whose value is the unique ID (UFID) used by LINZ. This allows the data to be conflated easily, just like NZ Street Addresses. The process works as follows:

1. A script runs daily during the import. It downloads the OSM Planet file (Geofabrik lets you download just Oceania + Antarctica).
2. Every occurrence of the tags ref:linz:topo50_id=* and ref:linz:place_id=* is extracted from the OSM Planet file.
3. The list of topo50_ids to ignore is downloaded from the Google Sheet.
4. Each incomplete LINZ layer is processed: the features that are neither in OSM nor in the Google Sheet are converted to geojson.
5. The geojson files are split up into geographic regions depending on their size:
   - Datasets with very few features are crudely split into 8 large areas (roughly equivalent to NZ Regions).
   - Datasets with a moderate number of features are split into 33 areas according to this map (roughly the size of NZ Districts).
   - Datasets with a large number of features are split into 'sectors', such as K15. The mainland is divided into 26 columns (A-Z) and 26 rows (1-26). Sectors span roughly 0.5 degrees of latitude and 0.5 degrees of longitude.
6. The segmented geojson files are uploaded to the CDN, along with the geojson files from the LINZ Address Import.
7. The list of datasets that were uploaded can be seen in the fork of RapiD and on the JOSM download page.
8. When you select a dataset, it becomes 'locked' for an hour.
9. If you upload some or all of that dataset, it becomes 'locked' until the next daily conflation (step 1).
10. If you use RapiD and click 'Ignore this feature', the feature gets added to the Google Sheet from step 3. Otherwise you would be prompted to add that feature forever.
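
The extraction and filtering steps above amount to a set difference: a LINZ feature is offered for import only if its ID is neither tagged on an OSM object nor listed in the Google Sheet. The following is a minimal sketch of that idea, not the project's actual code. The `LinzFeature` shape, the function name `findMissingFeatures`, and the use of the GeoJSON `id` field to hold the LINZ ID are assumptions made for illustration.

```typescript
// Hypothetical, minimal feature shape. Storing the LINZ ID in `id` is an
// assumption of this sketch, not something the import documentation states.
interface LinzFeature {
  type: 'Feature';
  id: string;
  geometry: { type: string; coordinates: unknown };
  properties: Record<string, string>;
}

interface LinzFeatureCollection {
  type: 'FeatureCollection';
  features: LinzFeature[];
}

// Keep only the features that are neither in OSM already nor on the ignore list.
function findMissingFeatures(
  linzFeatures: LinzFeature[],
  idsInOsm: ReadonlySet<string>, // IDs found in ref:linz:* tags in the planet file
  idsToIgnore: ReadonlySet<string>, // IDs listed in the ignore sheet
): LinzFeatureCollection {
  const features = linzFeatures.filter(
    (feature) => !idsInOsm.has(feature.id) && !idsToIgnore.has(feature.id),
  );
  return { type: 'FeatureCollection', features };
}
```

Using sets for both ID lists keeps each lookup constant-time, which matters when a layer contains many thousands of features.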

This process is a small part of the pipeline for LINZ Addresses. This page has a flowchart which shows the entire system.
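
The sector grid used for large datasets (26 columns A-Z, 26 rows 1-26, cells of roughly 0.5 degrees) can be sketched as a simple lookup. The grid origin below (`GRID_WEST`, `GRID_NORTH`) and the direction of numbering (columns west to east, rows north to south) are assumptions chosen only to make the example concrete; the real tool may use different values, so the sector names this sketch produces will not necessarily match the real ones.

```typescript
// All four constants are hypothetical values for illustration only.
const GRID_WEST = 166.0; // assumed longitude of the western edge of column A
const GRID_NORTH = -34.0; // assumed latitude of the northern edge of row 1
const SECTOR_SIZE_DEG = 0.5; // sectors span roughly 0.5 degrees each way
const GRID_CELLS = 26; // 26 columns (A-Z) and 26 rows (1-26)

// Returns a sector name such as 'K15', or undefined outside the grid.
function sectorFor(lat: number, lon: number): string | undefined {
  const column = Math.floor((lon - GRID_WEST) / SECTOR_SIZE_DEG);
  const row = Math.floor((GRID_NORTH - lat) / SECTOR_SIZE_DEG);
  if (column < 0 || column >= GRID_CELLS || row < 0 || row >= GRID_CELLS) {
    return undefined;
  }
  const letter = String.fromCharCode('A'.charCodeAt(0) + column);
  return `${letter}${row + 1}`;
}
```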

Continuing partially completed layers

Some layers were partially imported between 2009 and 2016, but without the ref:linz:topo50_id=* tag. This makes conflation more difficult.

However, it is possible to continue importing these layers. This is still a work in progress.

Method A:

1. Use overpass-turbo to identify which parts of the country have already been imported, and define these areas as bounding boxes (bboxes).
2. Update the code for that layer to skip features within those bboxes.
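
As a rough illustration of Method A, the skip rule could look like the sketch below. It handles point features only, and the `BBox` and `PointFeature` types and both function names are hypothetical. Lines and areas would need an additional rule (for example, skipping a feature if any of its nodes falls inside a bbox), which this page does not specify.

```typescript
// Hypothetical types for this sketch.
interface BBox {
  minLon: number;
  minLat: number;
  maxLon: number;
  maxLat: number;
}

interface PointFeature {
  id: string;
  lon: number;
  lat: number;
}

// True if the point lies inside the bbox (edges included).
function isInsideBBox(feature: PointFeature, bbox: BBox): boolean {
  return (
    feature.lon >= bbox.minLon &&
    feature.lon <= bbox.maxLon &&
    feature.lat >= bbox.minLat &&
    feature.lat <= bbox.maxLat
  );
}

// Drop every feature that falls inside an area that was already imported.
function skipAlreadyImported(
  features: PointFeature[],
  importedAreas: BBox[],
): PointFeature[] {
  return features.filter(
    (feature) => !importedAreas.some((bbox) => isInsideBBox(feature, bbox)),
  );
}
```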

Method B:

1. The data that is already in OSM is extracted using overpass-turbo and downloaded as geojson.
2. We loop through every feature in OSM and find the closest feature in the LINZ data.
   - If the nearest LINZ feature is within 3 metres of the OSM feature, we add the tag ref:linz:topo50_id=* to the OSM feature. This is done in bulk using Level0 (exact steps TBC).
   - The rule is more complicated for ways, areas, and multipolygons: we check whether at least 80% of the nodes in the OSM feature are within 3 metres of a node in that LINZ feature.
3. The next day, the conflation process will pick up the existing features in OSM, since they now have the ref:linz:topo50_id=* tag.
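
The two matching rules in Method B can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: the haversine formula is used here as one reasonable way to measure distance in metres, and the `LinzFeature` shape and the function names `matchNode` and `wayMatches` are hypothetical. The sketch only finds matches; adding the tag is still done separately (for example with Level0).

```typescript
type LonLat = [number, number]; // [longitude, latitude] in degrees

// Hypothetical shape: a LINZ feature reduced to its ID and its nodes.
interface LinzFeature {
  id: string;
  nodes: LonLat[];
}

const EARTH_RADIUS_M = 6371008.8; // mean Earth radius

// Great-circle distance in metres (haversine formula).
function distanceMetres(a: LonLat, b: LonLat): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const sinLat = Math.sin(toRad(b[1] - a[1]) / 2);
  const sinLon = Math.sin(toRad(b[0] - a[0]) / 2);
  const h =
    sinLat * sinLat +
    Math.cos(toRad(a[1])) * Math.cos(toRad(b[1])) * sinLon * sinLon;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Rule for nodes: ID of the nearest LINZ feature, if it is within tolerance.
function matchNode(
  osmNode: LonLat,
  linzFeatures: LinzFeature[],
  toleranceM = 3,
): string | undefined {
  let nearestId: string | undefined;
  let nearestDistance = Infinity;
  for (const feature of linzFeatures) {
    for (const node of feature.nodes) {
      const d = distanceMetres(osmNode, node);
      if (d < nearestDistance) {
        nearestDistance = d;
        nearestId = feature.id;
      }
    }
  }
  return nearestDistance <= toleranceM ? nearestId : undefined;
}

// Rule for ways and areas: at least `minFraction` of the OSM nodes must be
// within tolerance of some node of the LINZ feature.
function wayMatches(
  osmNodes: LonLat[],
  linzFeature: LinzFeature,
  toleranceM = 3,
  minFraction = 0.8,
): boolean {
  if (osmNodes.length === 0) return false;
  const matched = osmNodes.filter((osmNode) =>
    linzFeature.nodes.some((n) => distanceMetres(osmNode, n) <= toleranceM),
  ).length;
  return matched / osmNodes.length >= minFraction;
}
```

The nested loops compare every OSM node with every LINZ node, which is fine for a sketch but would likely need a spatial index for a nationwide layer.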

How do I contribute?

The tool is available here; anyone can import data. If you prefer using JOSM, you can download osmChange files from here, although this is not the recommended option.

Potential issues

This table will be updated as the project progresses.


Issue	Mitigation
Duplicate data being imported	The fork of RapiD has an added feature to prevent duplicate features being imported, based on the `ref:linz:topo50_id=*` tag. This conflation happens in real time, in the browser.
Multiple people editing the same dataset at the same time	Users will be presented with a warning if someone else is/was editing that dataset in the last hour
Duplicate nodes when importing data that abuts existing features	RapiD will intelligently re-use nodes that are already in OSM. If this is not good enough, iD#8671 will make it easy to join abutting ways.
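Imported rivers/roads are disconnected from existing features	Same as the previous row: RapiD will re-use nodes that are already in OSM, and iD#8671 will make it easy to join abutting ways.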
Imported rivers cross roads	RapiD's validator will warn you about this
LINZ's data uses way too many nodes at corners	We use the Douglas-Peucker algorithm to simplify the geometry during processing. The original import did not do this, so if a way imported after 2020 abuts a way imported before 2020, there may be gaps where the two ways do not line up.
Ways with over 2000 nodes break OSM	If an Area has >2000 nodes, it gets split into a MultiPolygon with multiple outer ways, each with at most 495 nodes. If a MultiPolygon has >2000 nodes in one of its rings, that ring gets split into segments with up to 495 nodes each.
LINZ's data is out of date	This hasn't been an issue yet, but mappers can press `Ctrl`+`B` to cycle through LINZ Aerial Imagery (2017), Maxar (2021), and the LINZ Topo50 map.
No aerial imagery available for parts of the Ross Dependency	You need to use the standard OSM-Carto tileserver as your background imagery, and reference a separate map like LINZ's Ant50 series.
Merging in new features destroys the OSM object's history	Fixed in iD#8708
Working with hex colours in our custom iD presets is confusing	Fixed in iD#8782
Duplicate hydrographic data due to overlapping charts	We only consider data from the most detailed chart available for that area. Features that cross multiple charts will be flagged and manually merged in RapiD.
Hydrographic data crosses the antimeridian	We download the OSM planet extract in two chunks, west and east of the antimeridian, and we split all datasets the same way.
Some lines and areas cross the antimeridian	For lines, we will use `type=multilinestring`. For areas, we will use a `type=multipolygon` with `closure_segment=yes` on the virtual boundaries
A small number of hydrographic features reference the legend of the nautical chart. These legends are not available from LDS.	We will still import these features, with the tag `description=see XXXXXX.txt`. If these descriptions are made available, we can easily add them to the features.
The seamark tagging schema is very complicated for mappers	We have created our own iD presets and rendering styles for the most common seamark tags
Some obvious data is missing (e.g. fairways, ski access lanes, coast guard stations, surf-life-saving bases, patrolled beaches)	This data is managed by the local harbourmaster, and isn't included on nautical charts. We will create iD presets to make it easy to map these features.
LINZ's Topo50 data generally does not associate topographic features with names.	Names are downloaded as a separate layer from the NZGB dataset. This means there will be two layers in the tool (e.g. 'Peaks' and 'Named Peaks')
`type=multilinestring` is not a first-class data type in OSM and is not supported by any known software.	No solution, there is no other way to represent a discontiguous linear feature.
MultiPoints (`site` relations) are not a first-class data type in OSM and aren't supported by the planet-extraction software we use.	No current solution; these features are skipped by the conflation tool (e.g. Redwood Station).
Hydrographic data in some areas is completely missing	This issue is unresolved and still under investigation.
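
The mitigation for the 2000-node limit in the table above can be illustrated with a short sketch. It splits one ring into consecutive segments of at most 495 nodes. Making neighbouring segments share their end node, so that they can be joined back into a ring as members of a multipolygon, is an assumption of this sketch, and the function name `splitRing` is hypothetical.

```typescript
// Split a ring (list of nodes, first node repeated at the end) into segments
// with at most `maxNodes` nodes each. Consecutive segments share one node.
function splitRing<T>(ring: T[], maxNodes = 495): T[][] {
  if (maxNodes < 2) {
    throw new Error('A segment needs at least 2 nodes');
  }
  const segments: T[][] = [];
  let start = 0;
  while (start < ring.length - 1) {
    // Index of the last node in this segment.
    const end = Math.min(start + maxNodes - 1, ring.length - 1);
    segments.push(ring.slice(start, end + 1));
    // The next segment begins at the node where this one ended.
    start = end;
  }
  return segments;
}
```

Because each segment re-uses the last node of the previous one, a closed ring of 2001 node references is covered by five segments under these assumptions.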
