Automated edits/JurgenG/European municipality websites
Edit: I will not run the script; there is too much complexity and uncertainty. Instead, I will share CSV files with the results with the community, so local teams can validate and implement them.
Main idea
For a different project, I'm trying to get an overview of all official municipal websites in Europe. I thought it would make sense to add the newly found websites to OSM (and fix broken ones).
To prevent breaking things (again), this will happen after discussing my code with experienced OSM-devs.
Here is the current code: https://gitlab.com/jurgeng/osm-city-updater/
Look at my profile to contact me, or use the Git workflow on the GitLab repo.
Why is this relevant/important?
Municipalities often have an official website containing essential information about the municipality. While using OSM as a source, I found:
- often, the website was missing from the metadata of the municipality
- occasionally, the website stated in OSM was dead
- regularly, the website redirected to a new URL, so the one on record was not the canonical site
At first, I added websites by hand, one by one, but this was a slow and cumbersome process.
How to determine the website?
The script tries the following sources in order of priority:
- Check if there is already a "website" tag on OSM for this municipality
  - if found: do an HTTP(S) call to this URL
    - if it returns HTTP 200, it is considered the correct site and nothing happens
    - if it returns HTTP 30x, pick up the redirect URL and check whether it resolves (if the response is not an HTTP 200, continue with the next redirect URL) ⇒ assumed to be correct → no review needed
    - if it returns something else (40x or 50x) → start the search for another possible site
  - if "website" doesn't exist → start the search for another possible site
- Check if a website is listed on the Wikipedia entry for this location (perform the same routine as for the first step)
  - If a valid Wikipedia URL is found (HTTP 200), assume it is the canonical one → add this URL to the OSM metadata for this municipality ⇒ assumed to be correct → no review needed
  - If no valid Wikipedia URL is found, continue the search for another possible domain
- Based on common domains already found in the OSM dataset, potential URL patterns are deduced (on a per-country basis)
  - (TODO) Certain domains (e.g. TLD subdomains such as .gov.uk) will be assumed reliable when found
  - Certain domains (e.g. cityname.tld or cityname.net) will be considered uncertain
    - Some are manually validated prior to submission (visit the site, confirm that it is the official website) → add this URL to the OSM metadata for this municipality ⇒ assumed to be correct → no review needed
    - Some are not manually validated prior to submission (insufficient knowledge of the language) → add this URL to the OSM metadata for this municipality ⇒ uncertain to be correct → review required
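The HTTP check in the first step can be sketched as a small decision function. This is only an illustration, not the code from the repository: the name `resolve_website`, the decision labels `keep`, `update` and `search`, the redirect limit, and the injectable `fetch` callable are all assumptions made for this example. Treating a successful redirect target as the value to record is also one reading of the plan above, not a confirmed behaviour of the script.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical fetcher type: given a URL, return (status_code, Location header or None).
# A real fetcher would issue the HTTP(S) request without following redirects itself.
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


def resolve_website(url: str, fetch: Fetcher, max_redirects: int = 5) -> Tuple[str, Optional[str]]:
    """Classify an existing website value.

    Returns one of (labels are assumptions for this sketch):
      ("keep", url)       - the recorded URL answers with HTTP 200
      ("update", target)  - a redirect chain ends in HTTP 200 at `target`
      ("search", None)    - 40x/50x, a broken redirect, or too many redirects
    """
    current = url
    for _ in range(max_redirects + 1):
        status, location = fetch(current)
        if status == 200:
            return ("keep" if current == url else "update", current)
        if 300 <= status < 400 and location:
            # Location may be relative, so resolve it against the current URL.
            current = urljoin(current, location)
            continue
        break  # anything else: give up on this URL
    return ("search", None)
```

Passing the fetcher in as a parameter keeps the decision logic testable without network access; a dictionary of canned responses is enough to exercise all three outcomes.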
What could possibly go wrong (or did go wrong already)
- Did go wrong: A first iteration of the script committed website information to municipalities in Belgium, but did not include the existing metadata in the upload. As a consequence, all data except the new website info was removed from these municipalities. M!dgard fixed the data, and the code has been fixed since.
- Could go wrong: A brand, rather than the city, could own the domain name (e.g. chimay.be or spa.be)
  - This is why manual validation of domains/sites is important
- Could go wrong: Certain domains could be squatted
  - Same mitigation: manual validation
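The data loss described in the first item (existing metadata dropped when the website was added) can be guarded against by always merging the new value into a copy of the existing tags, and by checking that nothing disappeared before uploading. The sketch below is illustrative only: `with_website` and `lost_tags` are hypothetical helper names, the example tags are made up, and it assumes that tags are handled as a plain key/value dictionary.

```python
from typing import Dict, List


def with_website(existing_tags: Dict[str, str], website: str) -> Dict[str, str]:
    """Return a copy of the existing tags with only the website added or replaced."""
    merged = dict(existing_tags)  # copy, so the original tags stay untouched
    merged["website"] = website
    return merged


def lost_tags(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return the keys present before the edit but missing afterwards."""
    return sorted(key for key in before if key not in after)


# Illustrative tags for a made-up municipality.
old_tags = {"name": "Exampleville", "boundary": "administrative", "admin_level": "8"}
new_tags = with_website(old_tags, "https://www.exampleville.example/")

# Safety check before any upload: refuse to continue if a tag went missing.
missing = lost_tags(old_tags, new_tags)
if missing:
    raise RuntimeError(f"refusing to upload, tags would be lost: {missing}")
```

Running `lost_tags` on every element before submission turns the original mistake into a hard failure instead of a silent deletion.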
Consultation prior to execution
I created this script without knowing about the policies on Automated Edits. This was brought to my attention here:
https://www.openstreetmap.org/changeset/162205669#map=13/50.68754/3.29006
Timing and planning
The idea is to run this script once for every country, as time allows and after approval by the community.
Feel free to fork the code and pick this up for your own country.
Countries done
- Belgium: 2025-02-06 (thanks to M!dgard for fixing my screw-up and retaining the added data)
- ...
Licensing
As this only indexes websites that factually exist (and does not import data from a pre-existing source), no licensing question arises about the legality of the content.
Sources and formats
Source for cities to be indexed
- Albania
- Andorra
- Austria: https://www.statistik.at/en/statistics/population-and-society/population/population-stock/population-for-the-fiscal-equalisation
- Belarus
- Belgium: https://www.bpost.be/nl/postcodevalidatie-tool
- Bosnia and Herzegovina
- Bulgaria
- Croatia
- Czech Republic: https://apl2.czso.cz/iSMS/cisdata.jsp?kodcis=43
- Denmark: https://data-science.dk/datasat/gis/valg/danmarks-kommune-inddeling-vektor/
- Estonia
- Finland: https://stat.fi/en/luokitukset/kunta/kunta_1_20250101
- France: https://www.regions-et-departements.fr/communes-francaises
- Germany: https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/Administrativ/05-staedte.html
- Greece: https://geodata.gov.gr/en/dataset/katalogos-demon
- Hungary
- Iceland
- Ireland
- Italy: https://www.istat.it/classificazione/codici-dei-comuni-delle-province-e-delle-regioni/
- Latvia
- Liechtenstein
- Lithuania
- Luxembourg: https://lustat.statec.lu/vis?lc=en&pg=0&fs%5B0%5D=Topics%2C1%7CTerritory%20environment%20and%20energy%23A%23%7CTerritory%23A1%23&fs%5B1%5D=Surface%20Area%2C0%7CThe%2012%20cantons%20and%20102%20municipalities%23L04%23&fc=Surface%20Area&df%5Bds%5D=ds-release&df%5Bid%5D=DF_X010&df%5Bag%5D=LU1&df%5Bvs%5D=1.0&dq=..A&ly%5Brw%5D=SURFACE_AREA&ly%5Bcl%5D=SPECIFICATION&vw=tb
- Malta
- Moldova
- Monaco
- Montenegro
- Netherlands: https://www.cbs.nl/nl-nl/onze-diensten/methoden/classificaties/overig/gemeentelijke-indelingen-per-jaar/indeling-per-jaar/gemeentelijke-indeling-op-1-januari-2024
- North Macedonia
- Norway: https://www.kartverket.no/til-lands/fakta-om-norge/norske-fylke-og-kommunar
- Poland
- Portugal
- Romania
- San Marino
- Serbia
- Slovakia: https://slovak.statistics.sk/wps/portal/ext/Databases/REGPJ/!ut/p/z1/jZDNDoIwEISfxSfoQPk9FiOl2mALUrAXw8EQEkUPxueXoFeBuW32m9nMEksaYof23Xftq38M7W2czza4GKmiJHEYjloFYOKQe2Us4JaU1BOgQ_EForzYQZyY4sXec-D5xI5rprUupTHgxk0hqMORVxWQhj__DGDX3N9ylnmhBCLJfQiWVUWsKQWj6_z4I4Z1_hnAzsfXxE7IXIOlDLv05PI6kOe9GtWgV93mA6Qi5tw!/dz/d5/L2dJQSEvUUt3QS80TmxFL1o2X1ZMUDhCQjFBME9SQTQwSUZTVUNGTVMyRzgx/
- Slovenia
- Spain: https://datos.gob.es/en/catalogo/a09002970-municipios-de-espana
- Sweden: https://skr.se/skr/tjanster/kommunerochregioner/kommunerlista.1246.html
- Switzerland: https://www.agvchapp.bfs.admin.ch/fr
- Turkey
- Ukraine
- United Kingdom: I'm giving up on this one... now I understand why a UK citizen is so phlegmatic... if you can wrap your brain around this, you can handle anything.
URL templates to be tried
If no website is found on OSM or Wikipedia, a few additional candidate URLs can be tried. These are generated from the city slug (= the native city name transliterated into characters that are valid in a website address).
Belgium
- https://{city_slug}.be (requires validation - manually validated)
Netherlands
- https://{city_slug}.nl (requires validation)
France
- https://{city_slug}.fr (requires validation)
- https://mairie-{city_slug}.fr (requires validation)
- https://ville{city_slug}.fr (requires validation)
- https://ville-{city_slug}.fr (requires validation)
- https://agglo{city_slug}.fr (requires validation)
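The templates above can be expanded mechanically once a city slug is available. The following is a minimal sketch, not the repository's implementation: the function names `city_slug` and `candidate_urls`, the two-letter country keys, and the exact slug rules (strip accents, lowercase, replace other characters with hyphens) are assumptions for illustration. Every generated URL is only a candidate and still requires validation.

```python
import re
import unicodedata
from typing import Dict, List

# Templates copied from the lists above; the country-code keys are an assumption.
TEMPLATES: Dict[str, List[str]] = {
    "BE": ["https://{city_slug}.be"],
    "NL": ["https://{city_slug}.nl"],
    "FR": [
        "https://{city_slug}.fr",
        "https://mairie-{city_slug}.fr",
        "https://ville{city_slug}.fr",
        "https://ville-{city_slug}.fr",
        "https://agglo{city_slug}.fr",
    ],
}


def city_slug(name: str) -> str:
    """Turn a native city name into a hostname-friendly slug (assumed rules)."""
    # Decompose accented letters, then drop everything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase and collapse any run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


def candidate_urls(country: str, name: str) -> List[str]:
    """Expand all templates known for a country; unknown countries yield no candidates."""
    slug = city_slug(name)
    return [template.format(city_slug=slug) for template in TEMPLATES.get(country, [])]


print(candidate_urls("BE", "Liège"))  # ['https://liege.be']
```

One limitation of this simple approach: letters without a Unicode decomposition, such as "ø" or "ß", are dropped rather than transliterated, so a per-country replacement table would probably be needed for some languages.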