Import/Catalogue/Grotte-RAFVG
About
This page describes the import of the cave entrance dataset published by Regione Autonoma Friuli Venezia Giulia (RAFVG), Italy.
The dataset will be adapted to generate OSM files suitable for import into planet.osm. This will not be a blind import: mappers will check the source data using audit support maps.
The import is being discussed on the regional OSM mailing list and will proceed only by consensus reached there.
Goals
This import aims to provide an updated set of cave entrances for the regional territory. Cave entrances will be filtered by the presence of a physical marker in the immediate surroundings.
Schedule
This import will be performed on a regional (admin_level=4) basis. Audit progress will be trackable on the project page. Given the import size (~2500 POIs), it should take about 8 weeks to complete.
Import Data
Background
The source dataset contains 8744 records, each defining a cave entrance. Each record has a reference number assigned by Catasto Speleologico Regionale (CSR); a reference number can be shared by several cave entrances if they give access to the same cave. In a sample of records, the coordinates appear accurate. During the audit, a few minor spatial errors may be detected; since OSM objects are considered authoritative in the conflation process, any mismatch shall be manually recorded in the object's fixme tag.
Metadata
- Language: ita
- Date: 2021-05-04
- Subject: Ingressi Grotte (cave entrances)
- URL: https://catastogrotte.regione.fvg.it/pagina/105/download
- Format: Shapefile (GIS interchange format)
Legal
- Licence definition page: Uso dati
- Data license: IODLv2, as defined on the data source page above (addendum Allegato A).
Record format and tagging plan (draft)
According to the OSM wiki, the following tagging will be applied.
Name | Description:it | Description:en | Example | Notes | Tagged as |
---|---|---|---|---|---|
CATASTO_RE | codice CSR | CSR id | 3478 | | cave:ref |
NOME_PRINC | nome della grotta | cave name | Mala Jama | | name |
NOME_INGR | nome dell'ingresso (se più di uno) | entrance name (if more than one) | Ingresso 2 | appended to cave:ref | cave:ref |
QUOTA | quota dell'ingresso | entrance elevation | 108.00 | Trailing zeroes will be removed | ele |
SVIL_PLAN | sviluppo planimetrico | plan length (horizontal development) | 403.00 | Trailing zeroes will be removed | cave:length_plan (TBD: cave:size) |
DISLIVELLO | dislivello | elevation gain/drop | 24.00 | Trailing zeroes will be removed | cave:depth (TBD: cave:size) |
TIPO_INGRE | tipo di ingresso | entrance type | Orizzontale | | description |
MORF_INGRE | morfologia ingresso | entrance morphology | Galleria | | description |
TARGHETTA | presenza targhetta | marker presence | yes | | cave:plate |
url | url scheda del catasto | url to item in datasource archive | https://catastogrotte.regione.fvg.it/scheda/10-abc | derived from json dataset version | url |
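As an illustration of the tagging plan, the following Python sketch maps one source record to OSM tags. It is not the import's actual tooling: the function names are hypothetical, only the one-to-one fields of the table are covered (NOME_INGR and the description fields are left out), and natural=cave_entrance is added on the assumption that new objects carry the same tag the conflation step queries for.

```python
from decimal import Decimal, InvalidOperation

# One-to-one field-to-tag pairs taken from the tagging table.
FIELD_TO_TAG = {
    "CATASTO_RE": "cave:ref",
    "NOME_PRINC": "name",
    "QUOTA": "ele",
    "SVIL_PLAN": "cave:length_plan",
    "DISLIVELLO": "cave:depth",
    "TARGHETTA": "cave:plate",
    "url": "url",
}
# Fields whose trailing zeroes are removed.
NUMERIC_FIELDS = {"QUOTA", "SVIL_PLAN", "DISLIVELLO"}


def strip_trailing_zeroes(value):
    """Turn '108.00' into '108' and '24.50' into '24.5'; leave other text alone."""
    try:
        text = format(Decimal(value), "f")
    except InvalidOperation:
        return value
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


def record_to_tags(record):
    """Build an OSM tag dict from one source record (hypothetical helper)."""
    tags = {"natural": "cave_entrance"}  # assumption: same tag as the OSM query
    for field, tag in FIELD_TO_TAG.items():
        value = str(record.get(field, "")).strip()
        if not value:
            continue  # skip empty or missing fields
        if field in NUMERIC_FIELDS:
            value = strip_trailing_zeroes(value)
        tags[tag] = value
    return tags


# Example values copied from the table's "Example" column.
example = {
    "CATASTO_RE": "3478",
    "NOME_PRINC": "Mala Jama",
    "QUOTA": "108.00",
    "SVIL_PLAN": "403.00",
    "DISLIVELLO": "24.00",
    "TARGHETTA": "yes",
}
tags = record_to_tags(example)
print(tags)
```

Decimal is used instead of float so that values are trimmed textually and never pick up floating-point rounding noise.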
Import Type
The dataset will be imported on a regional basis (OSM admin_level=4). Prior to upload, an .osm preview file will be published and linked on this page so that local teams can check it manually.
Data Preparation
The data is provided as an ESRI shapefile (shp) containing a collection of point features, one per cave entrance. The input dataset will be converted from shp to csv with QGIS and fed to OpenRefine.
Refining
Before conversion to json, some issues require refining operations in OpenRefine, documented herein. Summary of the actions performed in OpenRefine:
- description built by concatenating TIPO_INGRE and MORF_INGRE
- letter case fixes and removal of trailing decimal zeroes
- local id added (required by the conflation process)
- filtering by the TARGHETTA field
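The refining itself is done in OpenRefine; purely as an illustration, the Python sketch below reproduces three of the listed actions (description building, local id, TARGHETTA filtering) on csv rows. Several details are assumptions not stated on this page: the ", " separator, the "yes" marker value, a sequential integer local id stored under the key "id", and the second sample row, which is made up.

```python
import csv
import io


def refine_rows(rows, marker_value="yes"):
    """Keep rows with a marker, add a description and a local id.

    Assumptions: TARGHETTA equals `marker_value` when a marker is present,
    and the local id is a 1-based counter stored under "id".
    """
    refined = []
    for row in rows:
        if row.get("TARGHETTA", "").strip().lower() != marker_value:
            continue  # filter out entrances without a physical marker
        parts = [row.get(key, "").strip() for key in ("TIPO_INGRE", "MORF_INGRE")]
        out = dict(row)
        # Concatenate the non-empty parts; the separator is an assumption.
        out["description"] = ", ".join(part for part in parts if part)
        out["id"] = len(refined) + 1
        refined.append(out)
    return refined


# First row uses the example values from the tagging table; the second is invented.
sample_csv = """CATASTO_RE,NOME_PRINC,TIPO_INGRE,MORF_INGRE,TARGHETTA
3478,Mala Jama,Orizzontale,Galleria,yes
9999,Hypothetical Cave,Verticale,Pozzo,no
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
refined = refine_rows(rows)
print(refined[0]["id"], refined[0]["description"])
```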
Exporting
The conflator requires json input. The dataset is converted to json with an OpenRefine template, documented herein. The output json files can be further validated with jsonlint (npm -g install jsonlint).
Up to version 2.8, OpenRefine does not handle null values; as a workaround, lines containing nulls can be removed:
pi@raspberrypi:~/OSM $ sed -i -e '/ : null/d' <Openrefine-output-file>.json
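Deleting whole lines with sed can leave a dangling comma when the null member was the last one in its object, which jsonlint would then reject. As a hedged alternative, the sketch below parses the file content and drops null-valued members instead. It assumes the export is a json array of objects; `drop_nulls` is a hypothetical helper and the sample record is made up for illustration.

```python
import json


def drop_nulls(value):
    """Recursively remove object members whose value is null (None)."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(item) for item in value]
    return value


# Made-up sample standing in for the OpenRefine export.
raw = '[{"id": 1, "name": "Mala Jama", "ele": null}]'
cleaned = drop_nulls(json.loads(raw))  # json.loads also validates the syntax
print(json.dumps(cleaned, ensure_ascii=False))
# → [{"id": 1, "name": "Mala Jama"}]
```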
Conflation
Conflation is performed with OSM Conflator. Objects tagged "natural"="cave_entrance" are extracted from OSM by a dedicated overpass-turbo query. OSM objects found within range of a dataset point are merged with it, and tags are added or proposed for change according to the conflator parameter file. Non-matching OSM objects are marked with the note tag "this cave entrance has not matches with RAFVG dataset within a 20m radius", for future surveys.
The json file resulting from conflation will be reviewed by the community on an audit map. Once the audit is complete, an .osm file will be generated by a further conflator run.
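To make the matching rule concrete, here is a minimal Python sketch of distance-based matching within a 20 m radius. It is not OSM Conflator's actual algorithm, only a greedy nearest-neighbour illustration; the function names, ids and coordinates are all made up.

```python
import math

EARTH_RADIUS_M = 6371000.0
UNMATCHED_NOTE = ("this cave entrance has not matches with RAFVG dataset "
                  "within a 20m radius")


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_points(dataset, osm, max_distance_m=20.0):
    """Pair each dataset point with its nearest free OSM object within range."""
    free_osm = dict(osm)
    matches = {}
    for ds_id, (lat, lon) in dataset.items():
        best = None
        for osm_id, (olat, olon) in free_osm.items():
            dist = haversine_m(lat, lon, olat, olon)
            if dist <= max_distance_m and (best is None or dist < best[1]):
                best = (osm_id, dist)
        if best is not None:
            matches[ds_id] = best[0]
            del free_osm[best[0]]  # each OSM object is matched at most once
    # Remaining OSM objects get the note tag for future surveys.
    notes = {osm_id: {"note": UNMATCHED_NOTE} for osm_id in free_osm}
    return matches, notes


# Made-up coordinates: the first pair lies a few metres apart.
dataset = {1: (45.70000, 13.75000), 2: (46.00000, 13.00000)}
osm = {"n100": (45.70005, 13.75005), "n200": (45.80000, 13.90000)}
matches, notes = match_points(dataset, osm)
print(matches, list(notes))
```

In this sample, dataset point 1 matches n100, dataset point 2 would be added as a new object, and n200 receives the note tag.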
Conflator output example
pi@rpi3: conflate -i 2021-05-04.json -v -c preview-2021-05-04.json --osm cave_entrance.osm profile.py
10:30:46 Loading profile <_io.TextIOWrapper name='profile.py' mode='r' encoding='UTF-8'>
10:30:47 Dataset points duplicate each other: 178 and 179
10:30:48 Dataset points are too similar: 365 and 366
10:30:48 Dataset points are too similar: 408 and 409
10:30:48 Dataset points are too similar: 478 and 479
10:30:48 Dataset points are too similar: 787 and 788
10:30:51 Found 14 duplicates in the dataset
10:30:51 Read 2573 items from the dataset
10:30:51 Downloaded 1106 objects from OSM
10:31:02 Matched 638 points
10:31:02 Removed 13 unmatched duplicates
10:31:02 Adding 1922 unmatched dataset points
10:31:03 Deleted 0 and retagged 468 unmatched objects from OSM
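The counts in the log above can be cross-checked with simple arithmetic. The sketch below assumes one reading of the log that this page does not state explicitly: every dataset item is either matched, removed as a duplicate, or added, and every downloaded OSM object is either matched, retagged, or deleted.

```python
# Figures copied from the conflator log.
read_items = 2573          # "Read 2573 items from the dataset"
osm_objects = 1106         # "Downloaded 1106 objects from OSM"
matched = 638              # "Matched 638 points"
removed_duplicates = 13    # "Removed 13 unmatched duplicates"
added = 1922               # "Adding 1922 unmatched dataset points"
deleted, retagged = 0, 468 # "Deleted 0 and retagged 468 unmatched objects"

# Dataset side: 2573 - 638 - 13 = 1922
dataset_balance = read_items - matched - removed_duplicates
# OSM side: 1106 - 638 = 468
osm_balance = osm_objects - matched

print(dataset_balance == added, osm_balance == deleted + retagged)
```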
Conflator re-run after audit
pi@rpi3: conflate -i 2021-05-04.json -a audit_FVG-CAVES.json -o caves-ready.osm --osm cave_entrance.osm profile.py
TBD
Upload
Dedicated upload account
The account cascafico will be used to upload the community-reviewed .osm files.
Changeset Tags
Changesets will be tagged with:
- source=Regione Friuli Venezia Giulia
- source:date=yyyy-mm-dd (as defined in source dataset)
- source:license=IODLv2
- type=import
- url=https://wiki.openstreetmap.org/wiki/Import/Catalogue/Grotte-RAFVG
Team Approach
The import will be managed by the following OSM users:
- Cascafico
Workflow
Step by step operations:
- dataset download
- shp to csv conversion (QGIS)
- OpenRefine operations
- OpenRefine json export
- run conflator
- audit map announcement & publication
- wait for community validation
- conflation re-run
- OSM candidate publication
- Upload changeset in OSM
If the import causes problems, the changesets involved will be reverted with an appropriate revert tool.
OSM Candidate file
QA
If problems are detected after upload:
Widespread:
- TBD
Limited:
- TBD