Wikipedia Link Improvement Project
This page documents ongoing tasks to fix Wikipedia- and Wikidata-related tags. Most queries here use the Wikidata+OSM SPARQL query service. See also Quick fixes.
Please only edit Wikipedia and Wikidata links in areas where you have knowledge to back up the edit. Don't blindly follow instructions if you have no knowledge about the region you are editing in, or the kind of object you are editing.
Wikipedia links in the "website"/"url" key
#defaultView:Map
SELECT ?osmId (IRI(?url) as ?wp) ?loc WHERE {
{ SELECT ?osmId ?url ?loc WHERE {
?osmId osmt:url ?url ;
osmm:loc ?loc .
FILTER( contains(str(?url), 'wikipedia.org') )
} }
UNION
{ SELECT ?osmId ?url ?loc WHERE {
?osmId osmt:website ?url ;
osmm:loc ?loc .
FILTER( contains(str(?url), 'wikipedia.org') )
} }
}
Users often put links to Wikipedia in the website and url tags. Such links should be moved to the wikipedia and wikidata tags instead. To fix:
- Use the above query to view and fix each object
- Use TagInfo, search for value "wikipedia" in website values and url values
- Use Overpass turbo to search for “website ~ /wikipedia/i” and “url ~ /wikipedia/i”
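Once such an object is found, the URL has to be turned into a wikipedia tag value. The following is a minimal Python sketch of that conversion, assuming the usual "lang:Title" value format and article URLs of the form https://<lang>.wikipedia.org/wiki/<Title>. The function name wikipedia_tag_from_url is hypothetical and not part of any OSM tool; the result should still be checked by hand before saving.

```python
from urllib.parse import urlparse, unquote


def wikipedia_tag_from_url(url):
    """Return 'lang:Title' for a Wikipedia article URL, or None if it is not one."""
    # Values in website/url tags sometimes lack a scheme; add one so urlparse works
    parsed = urlparse(url if "://" in url else "https://" + url)
    parts = (parsed.hostname or "").lower().split(".")
    # Expect <lang>.wikipedia.org or the mobile form <lang>.m.wikipedia.org
    if len(parts) < 3 or parts[-2:] != ["wikipedia", "org"]:
        return None
    lang = parts[0]
    if lang in ("www", "m") or not parsed.path.startswith("/wiki/"):
        return None
    # Decode percent-escapes and use spaces instead of underscores
    title = unquote(parsed.path[len("/wiki/"):]).replace("_", " ").strip()
    return f"{lang}:{title}" if title else None


if __name__ == "__main__":
    print(wikipedia_tag_from_url("https://en.wikipedia.org/wiki/Eiffel_Tower"))
```

Any "#section" part of the URL is dropped by this sketch, so links to page sections need separate review.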
Missing Wikidata tags
#defaultView:Map
SELECT ?osmId ?wp ?loc WHERE {
# Limit to nodes that have a tag called "wikipedia", and get its location
?osmId osmt:wikipedia ?wp ;
osmm:loc ?loc ;
osmm:type 'n' .
# At the moment, the "#" symbol is incorrectly encoded as %23. It will not be encoded in the future
FILTER( !contains(str(?wp), '%23') )
# Must not have Wikidata tag
MINUS { ?osmId osmt:wikidata ?wd . }
}
The iD editor automatically adds the wikidata tag when a user fills in the wikipedia field. In JOSM, the wikidata tag can be added with the Data/Fetch Wikidata IDs command of the Wikipedia plugin. Objects lacking the tag can be easily found with Overpass turbo using the [wikipedia][!wikidata] query. There are several reasons why the wikidata tag may be missing:
- In iD, the user added the wikipedia tag in the "tags" section instead of the "fields" section. In JOSM, the user forgot to use Fetch Wikidata IDs.
  - In JOSM, use the Fetch Wikidata IDs command in the Data menu.
- The wikipedia tag is incorrect, either because the title was entered incorrectly, or because the article was deleted.
  - Find a Wikipedia title about the object, possibly in a different wiki language, or delete the wikipedia tag.
- The Wikipedia page exists, but there is no corresponding Wikidata entry.
  - Check if there is an article about this exact object in another Wikipedia language. If there is, link both Wikipedia articles using "edit links" in the list of languages on the left, and re-fetch.
  - If not, create a new Wikidata entry. You should always add at least one label, description, "instance of" statement, and a link to the Wikipedia article. Save and re-fetch.
- You can also use the OSM ↔ Wikidata (https://osm.wikidata.link/) tool for matching Wikidata and OSM objects.
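The cases above can be pre-sorted before any manual work. The Python sketch below is only an illustration: it assumes tags are available as a plain dictionary and that a well-formed wikipedia value looks like "lang:Title". The function name triage and its three result labels are hypothetical.

```python
import re

# Assumed shape of a well-formed value: "<language code>:<article title>"
WIKIPEDIA_VALUE = re.compile(r"^[a-z][a-z0-9-]*:.+$")


def triage(tags):
    """Classify one object's tags for the missing-wikidata task."""
    wp = tags.get("wikipedia")
    if wp is None or "wikidata" in tags:
        return "skip"            # nothing to do for this task
    if not WIKIPEDIA_VALUE.match(wp):
        return "fix-wikipedia"   # the title was probably entered incorrectly
    return "fetch-wikidata"      # try Fetch Wikidata IDs or a manual lookup


if __name__ == "__main__":
    print(triage({"wikipedia": "en:Eiffel Tower"}))
```

A well-formed value can still point to a deleted or missing article, so "fetch-wikidata" only means the value is worth looking up.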
Mismatching wikidata and OSM name tags
An OSM feature linked to a Wikidata item should ideally have the same or a very similar name, since both refer to the same geographical feature. A mismatch in the name might indicate an incorrect wikidata tag. Recent wikidata tags on OSM with a mismatched name can be reviewed using OSMCha:
- Review changes with a mismatched wikidata name: by new users, by anyone
Mismatching wikidata and wikipedia tags
Both wikipedia and wikidata tags should consistently point to the same thing. The wikidata tag must always point to the Wikidata entry that links to the same Wikipedia title as stored in the wikipedia tag, unless Wikidata has a more precise entry that corresponds better to the OSM object than the Wikipedia article does. In some cases, the wikipedia tag points to a "redirect page" whose target is in turn part of the correct Wikidata entry. While this is OK, the OSM SPARQL service does not store such information and therefore reports errors. It is better to fix wikipedia tags to point to the actual articles to help with quick verification.
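The rule can be expressed as a small check. This Python sketch assumes the caller has already looked up the sitelinks of the tagged Wikidata item and, optionally, a table of known redirects; both inputs and the name link_status are hypothetical and only illustrate the decision, not how the SPARQL service works.

```python
def link_status(tags, sitelinks, redirects=None):
    """Compare an object's wikipedia tag with the sitelinks of its wikidata item.

    sitelinks: set of 'lang:Title' strings belonging to the tagged Wikidata item
    redirects: optional dict mapping a redirect 'lang:Title' to its target title
    """
    wp = tags.get("wikipedia")
    if wp is None or "wikidata" not in tags:
        return "incomplete"
    if wp in sitelinks:
        return "match"
    target = (redirects or {}).get(wp)
    if target is not None and target in sitelinks:
        return "redirect"   # acceptable, but better to retag with the target title
    return "mismatch"       # review: wrong tag, or a more precise Wikidata entry


if __name__ == "__main__":
    tags = {"wikipedia": "en:Old name", "wikidata": "Q1"}
    print(link_status(tags, {"en:New name"}, {"en:Old name": "en:New name"}))
```

A "mismatch" result is not automatically an error, because a more precise Wikidata entry is allowed.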
Links to Wikipedia pages about multiple objects
Frequently, there is no Wikipedia article about the specific OSM object, e.g. a church, yet there exists a Wikipedia page that mentions the object. This page could either be a table or list of all churches in the area, or it could be a page about a town, with a section of the article dedicated to the church. In some cases, it could be a list of different concepts with the same name (a disambiguation page, see the section on disambiguation pages below). In any of these cases, do not use the wikipedia tag. Instead, use related:wikipedia (TBD!), and no wikidata tag at all.
Links to disambiguation pages
#defaultView:Map
SELECT ?osmId ?wdLabel ?wd ?wp ?loc WHERE {
# Limit to subjects that have a tag called "wikidata", and show its location
?osmId osmt:wikidata ?wd ;
osmm:loc ?loc .
# ?wd must be an "instance of" a disambiguation page, or an instance
# of some type, which itself is a (sub-)*subclass of a disambig page.
?wd wdt:P31/wdt:P279* wd:Q4167410 .
OPTIONAL { ?osmId osmt:wikipedia ?wp . }
# Pick the first available language for the wikidata entry (creates ?wdLabel value)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de,fr,it,pl,ru,es,sv,nl" . }
}
A disambiguation page is a page that lists multiple meanings of the same term. The wikipedia and wikidata tags should never link to such pages. These items can be easily found with the ?wdId wdt:P31/wdt:P279* wd:Q4167410 query. In rare cases, Wikidata entries might have been incorrectly marked as disambiguations and should be fixed (set a proper "instance of", and remove a few main disambig descriptions). In all other cases, either find the right Wikipedia/Wikidata values, or remove them if there is no such entry. A link to a disambig page has no value, with the possible exception of the related:wikipedia tag as described above.
Links to list pages
#defaultView:Map
SELECT ?osmId ?wdLabel ?wd ?wp ?loc WHERE {
# Limit to subjects that have a tag called "wikidata", and show its location
?osmId osmt:wikidata ?wd ;
osmm:loc ?loc .
# ?wd must be an "instance of" a list page, or an instance
# of some type, which itself is a (sub-)*subclass of a list page.
?wd wdt:P31/wdt:P279* wd:Q13406463 .
OPTIONAL { ?osmId osmt:wikipedia ?wp . }
# Pick the first available language for the wikidata entry (creates ?wdLabel value)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de,fr,it,pl,ru,es,sv,nl" . }
}
Similar to disambiguation pages, lists can be found using the ?wdId wdt:P31/wdt:P279* wd:Q13406463 query, and should be fixed to use the related:wikipedia tag, and no wikidata tag.
Links to page sections using a hash symbol
If the wikipedia tag contains a "#" (a link to a page section), the object most likely should not use the wikipedia tag either, but instead the related:wikipedia tag, and no wikidata tag.
Links to Concepts, Brands, Subjects, Networks
Brands
#defaultView:Map
SELECT ?osmId ?location ?bwd ?bwdLabel ?bwdDescription WHERE {
# Subquery finds brand:wikidata IDs used more than 10 times
{
SELECT ?bwd (count(*) as ?count) WHERE {
?o osmt:brand:wikidata ?bwd .
}
group by ?bwd
having (?count > 10)
}
# Find OSM objects where wikidata tag is one of the common brand:wikidata IDs
?osmId osmt:wikidata ?bwd ;
osmm:loc ?location .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,ru,es,de,zh,ja". }
}
#defaultView:Editor
SELECT
?id ?loc
(CONCAT('Moving ',
if(!bound(?bwdLabel), '', ?bwdLabel),
' from wikipedia to brand:wikipedia') as ?comment)
(osmt:wikipedia as ?t1)
?v1 # unbound, which means it will be deleted
(osmt:wikidata as ?t2)
?v2 # unbound, which means it will be deleted
(osmt:brand:wikipedia as ?t3)
(if(bound(?existingBwp), ?existingBwp, ?existingWp) as ?v3)
(osmt:brand:wikidata as ?t4)
(?bwd as ?v4)
WHERE {
# To restrict to just a few brands, uncomment the VALUES line below. Leave it commented out to search all
# VALUES ?bwd {wd:Q65310 wd:Q24933790 wd:Q1684639}
# Subquery finds brand:wikidata IDs used more than 10 times
{
SELECT ?bwd (count(*) as ?count) WHERE {
?o osmt:brand:wikidata ?bwd .
}
group by ?bwd
having (?count > 10)
}
# Find OSM objects where wikidata
# is one of the common brand:wikidata IDs
?id osmt:wikidata ?bwd ;
osmm:loc ?loc .
OPTIONAL { ?id osmt:wikipedia ?existingWp. }
OPTIONAL { ?id osmt:brand:wikipedia ?existingBwp. }
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,ru,es,de,zh,ja".
?bwd rdfs:label ?bwdLabel .
}
}
As described in the Wikidata proposal, there are many cases where a Wikipedia article or Wikidata entry is about a general concept, not the specific object. For example, a McDonald's restaurant should not link to the Wikipedia McDonald's article, because the article is about the brand, not this specific restaurant. Instead, the OSM object should use the brand:wikipedia and brand:wikidata tags. The brand:wiki... tags should also be used for anything brand-related, such as a supermarket or an ATM. Similarly, a statue of Einstein should use subject:wiki... tags, unless there is an article about the statue itself. subject:wiki* applies to many other cases, such as memorial boards and graves.
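The retagging performed by the Editor query above can be sketched in Python as follows. This is an illustration on a plain tag dictionary, assuming the object really is a branded outlet and not the subject of the article itself; the function name move_to_brand is hypothetical.

```python
def move_to_brand(tags):
    """Return a copy of tags with wikipedia/wikidata moved to brand:* keys."""
    fixed = dict(tags)
    wp = fixed.pop("wikipedia", None)
    wd = fixed.pop("wikidata", None)
    # Keep an existing brand:wikipedia value; otherwise reuse the old wikipedia value
    if "brand:wikipedia" not in fixed and wp is not None:
        fixed["brand:wikipedia"] = wp
    # The old wikidata value becomes brand:wikidata
    if wd is not None:
        fixed["brand:wikidata"] = wd
    return fixed


if __name__ == "__main__":
    before = {"amenity": "fast_food", "wikipedia": "en:McDonald's", "wikidata": "Q38076"}
    print(move_to_brand(before))
```

The input dictionary is left unchanged, so the old and new tags can be compared before uploading.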
Linking to Humans
#defaultView:Editor{"taskId":"wikipedia_human_links", "comment": "Fix wp/wd link to a human, instead of a more specific subject/artist/...", "labels":{"a":"subject","b":"artist","c":"name:etymology"}, "vote":true }
SELECT
?id ?loc
(osmt:wikidata as ?tag_a1) (false as ?val_a1)
(osmt:subject:wikidata as ?tag_a2) (?wd as ?val_a2)
(if(?isWpAboutWd, osmt:wikipedia, false) as ?tag_a3) (false as ?val_a3)
(if(?isWpAboutWd, osmt:subject:wikipedia, false) as ?tag_a4) (?wp as ?val_a4)
(osmt:wikidata as ?tag_b1) (false as ?val_b1)
(osmt:artist:wikidata as ?tag_b2) (?wd as ?val_b2)
(if(?isWpAboutWd, osmt:wikipedia, false) as ?tag_b3) (false as ?val_b3)
(if(?isWpAboutWd, osmt:artist:wikipedia, false) as ?tag_b4) (?wp as ?val_b4)
(osmt:wikidata as ?tag_c1) (false as ?val_c1)
(osmt:name:etymology:wikidata as ?tag_c2) (?wd as ?val_c2)
(if(?isWpAboutWd, osmt:wikipedia, false) as ?tag_c3) (false as ?val_c3)
(if(?isWpAboutWd, osmt:name:etymology:wikipedia, false) as ?tag_c4) (?wp as ?val_c4)
WHERE {
# Limit to subjects that have a tag called "wikidata"
?id osmt:wikidata ?wd ;
osmm:loc ?loc .
# ?wd must be an "instance of" a human, or an instance
# of some type, which itself is a (sub-)*subclass of a human
?wd wdt:P31/wdt:P279* wd:Q5 .
# Check if wikipedia tag exists, and if it matches the wikidata tag
OPTIONAL { ?id osmt:wikipedia ?wp }
BIND( EXISTS{ ?wp schema:about ?wd } as ?isWpAboutWd)
}
OpenStreetMap represents objects, not human beings. When an OSM object links to a human being, there is a very good chance that it is a mistake:
- statue/memorial/grave - should use subject:wikidata - Who does this feature represent?
- sculptor/painter - should use artist:wikidata - Who was the author/creator of this feature?
- named after - should use name:etymology:wikidata - Who was this feature named after?
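The three options offered by the Editor query can be sketched in Python like this. The sketch assumes the mapper has already decided which role applies and knows whether the wikipedia tag is about the same person as the wikidata tag; the names ROLE_PREFIX and retag_human_link are hypothetical.

```python
ROLE_PREFIX = {
    "subject": "subject",            # statue, memorial, grave
    "artist": "artist",              # sculptor, painter
    "etymology": "name:etymology",   # feature named after the person
}


def retag_human_link(tags, role, wikipedia_matches_wikidata):
    """Return a copy of tags with the link to a human moved to a prefixed key."""
    prefix = ROLE_PREFIX[role]
    fixed = dict(tags)
    wd = fixed.pop("wikidata", None)
    if wd is not None:
        fixed[f"{prefix}:wikidata"] = wd
    # Only move the wikipedia tag when it is about the same person
    if wikipedia_matches_wikidata and "wikipedia" in fixed:
        fixed[f"{prefix}:wikipedia"] = fixed.pop("wikipedia")
    return fixed


if __name__ == "__main__":
    statue = {"historic": "memorial", "wikidata": "Q937", "wikipedia": "en:Albert Einstein"}
    print(retag_human_link(statue, "subject", True))
```

As in the query, a wikipedia tag that does not match the wikidata item is left in place for manual review.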
Linking to Fictional Humans
#defaultView:Editor{"taskId":"wikipedia_fictional_human_links", "comment": "Fix wp/wd link to a fictional human, instead of using subject:wikidata" }
SELECT
?id ?loc
(osmt:wikidata as ?tag_1) (false as ?val_1)
(osmt:subject:wikidata as ?tag_2) (?wd as ?val_2)
(if(?isWpAboutWd, osmt:wikipedia, false) as ?tag_3) (false as ?val_3)
(if(?isWpAboutWd, osmt:subject:wikipedia, false) as ?tag_4) (?wp as ?val_4)
WHERE {
# Limit to subjects that have a tag called "wikidata"
?id osmt:wikidata ?wd ;
osmm:loc ?loc .
# ?wd must be an "instance of" a fictional human, or an instance
# of some type, which itself is a (sub-)*subclass of a fictional human
?wd wdt:P31/wdt:P279* wd:Q15632617 .
# Check if wikipedia tag exists, and if it matches the wikidata tag
OPTIONAL { ?id osmt:wikipedia ?wp }
BIND( EXISTS{ ?wp schema:about ?wd } as ?isWpAboutWd)
}
OpenStreetMap represents objects, not human beings. When an OSM object links to a human being, there is a very good chance that it is a mistake. For fictional human beings, the mapper most likely meant to use subject:wikidata - Who does this feature represent?
subject:wikidata pointing to a sculptor
#defaultView:Map
SELECT
?osmId
(SAMPLE(?wdLabel) as ?label)
(SAMPLE(?wd) as ?wd)
(GROUP_CONCAT(DISTINCT(?occupation); separator=", ") as ?occupation)
(SAMPLE(?loc) AS ?loc)
WHERE {
# Get OSM elements with "subject:wikidata" tag and its location.
?osmId osmt:subject:wikidata ?wd ;
osmm:loc ?loc .
# The subject:wikidata must have occupation="sculptor",
# or a subclass of sculptor. Get all occupations of that person.
?wd wdt:P106/wdt:P279* wd:Q1281618 ;
wdt:P106 ?occ .
# Get labels for the Wikidata entry, and for all occupations
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de,fr,it,pl,ru,es,sv,nl" .
?wd rdfs:label ?wdLabel .
?occ rdfs:label ?occupation
}
} GROUP BY ?osmId
Duplicate tags in wikipedia & brand:wikipedia
Frequently the same value is set on both wikipedia and brand:wikipedia (or subject:wikipedia, ...). Only one of them should be set. The same applies to *:wikidata.
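A simple way to spot such duplicates on a single object is sketched below in Python. It assumes tags are available as a dictionary and only reports the duplicated keys; deciding which of the two tags to keep remains a manual step. The function name duplicate_links is hypothetical.

```python
def duplicate_links(tags):
    """Return prefixed keys whose value repeats the plain wikipedia/wikidata tag."""
    duplicates = []
    for base in ("wikipedia", "wikidata"):
        value = tags.get(base)
        if value is None:
            continue
        for key, other in tags.items():
            # e.g. brand:wikipedia or subject:wikidata carrying the same value
            if key != base and key.endswith(":" + base) and other == value:
                duplicates.append(key)
    return sorted(duplicates)


if __name__ == "__main__":
    print(duplicate_links({"wikipedia": "en:X", "brand:wikipedia": "en:X"}))
```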
Duplicate tags on a relation and its members
#defaultView:Map
SELECT
?rel
(SAMPLE(?location) as ?location)
(sum(?failed) as ?failCount)
(count(?mwd) as ?memberWithWdCount)
(count(?member) as ?memberCount)
((count(?member) - count(?mwd)) as ?diffCount)
WHERE {
# Find relations with wikidata tag and at least one member
?rel osmm:type 'r';
osmt:wikidata ?wd;
osmm:loc ?location;
osmm:has ?member .
# Get member's type
?member osmm:type ?mtype .
# Get member's wikidata tag if it exists
OPTIONAL { ?member osmt:wikidata ?mwd }
# If any of the conditions are met, set ?failed to 1.
# The sum of ?failed must be 0 for the relation to be shown
BIND (if((?mtype='r' || ?mtype='n' || (bound(?mwd) && ?mwd!=?wd)), 1, 0) as ?failed)
}
GROUP BY ?rel
HAVING (?memberWithWdCount > 0 && ?failCount = 0)
ORDER BY DESC(?memberCount)
As described in Key:wikipedia, the tag should only be set on a relation, not on its members. In general, most common tags should be moved to the relation, such as multilingual and international names, wikipedia, and wikidata. The name tag should remain on each member to simplify identification.
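The selection logic of the query above can be restated in Python for readers less familiar with SPARQL aggregation. This sketch assumes the members are given as (type, wikidata) pairs with None for a missing tag; the function name relation_has_duplicate_tags is hypothetical.

```python
def relation_has_duplicate_tags(relation_wikidata, members):
    """Decide whether a relation would be listed by the query.

    members: list of (type, wikidata or None), where type is 'n', 'w' or 'r'.
    """
    tagged = 0
    for member_type, member_wikidata in members:
        if member_type in ("n", "r"):
            return False  # the query skips relations with node or relation members
        if member_wikidata is not None:
            if member_wikidata != relation_wikidata:
                return False  # the member points at a different Wikidata item
            tagged += 1
    # At least one way must repeat the relation's wikidata value
    return tagged > 0


if __name__ == "__main__":
    print(relation_has_duplicate_tags("Q1", [("w", "Q1"), ("w", None)]))
```

Relations selected this way are candidates for removing the wikidata tag from their member ways.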