Nominatim/TestCases
One of the simplest ways in which people can help with the development of Nominatim is to provide examples of searches together with the OSM item (node, way, relation) which they would expect the search term to return. It would be helpful if people could group test cases by country to make automated testing easier!
See the Name_finder:Abbreviations page for documentation of abbreviations. For general bug reporting, see [1]; this page is for test cases only.
For example:
- Sheffield -> 422162 422162 -- any comments
- Dronfield station, gb -> 370186628 370186628
- Avon Close, Dronfield -> 32845608 32845608
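Since the goal is automated testing, lines in the format shown above could be parsed mechanically. The following is only a minimal sketch: the names `TestCase` and `parse_test_case` are hypothetical, and it assumes the layout "query -> ids -- comment" used on this page (a single " - " before the comment is also accepted, since some entries use it). Repeated IDs are collapsed.

```python
import re
from typing import List, NamedTuple, Optional


class TestCase(NamedTuple):
    query: str
    expected_ids: List[int]
    comment: Optional[str]


# Optional leading "- ", then the query, then "->", then everything else.
LINE_RE = re.compile(r"^\s*-?\s*(?P<query>.+?)\s*->\s*(?P<rest>.*)$")


def parse_test_case(line: str) -> Optional[TestCase]:
    """Parse one test-case line; return None for headings and free text."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Split the expected IDs from a trailing comment (" -- " or " - ").
    parts = re.split(r"\s+--?\s+", m.group("rest"), maxsplit=1)
    comment = parts[1].strip() if len(parts) > 1 else None
    ids: List[int] = []
    for token in re.findall(r"\d+", parts[0]):
        value = int(token)
        if value not in ids:  # collapse repeated IDs, keep order
            ids.append(value)
    return TestCase(m.group("query"), ids, comment)


if __name__ == "__main__":
    print(parse_test_case("- Sheffield -> 422162 422162 -- any comments"))
```

The parser does not know whether an ID refers to a node, way or relation; that would have to come from the page markup.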
coordinates
48°42'42N, 10°42'42E and other common conversions
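For reference, the expected decimal value for a degrees/minutes/seconds query like the one above can be computed as degrees + minutes/60 + seconds/3600. This is a small sketch, not Nominatim's own parser; the function name `dms_to_decimal` is hypothetical and only the exact notation shown above is handled.

```python
import re

# Degrees, minutes, seconds (optionally followed by a double quote), hemisphere.
DMS_RE = re.compile(r"""(\d+)°\s*(\d+)'\s*(\d+(?:\.\d+)?)"?\s*([NSEW])""")


def dms_to_decimal(text: str):
    """Convert a "D°M'S N, D°M'S E" string to (lat, lon) in decimal degrees."""
    lat = lon = None
    for deg, minutes, seconds, hemi in DMS_RE.findall(text):
        value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
        if hemi in "SW":  # south and west are negative
            value = -value
        if hemi in "NS":
            lat = value
        else:
            lon = value
    if lat is None or lon is None:
        raise ValueError(f"could not find both latitude and longitude in {text!r}")
    return lat, lon


if __name__ == "__main__":
    print(dms_to_decimal("48°42'42N, 10°42'42E"))
```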
BE
- Drij Dreven 2, Heusden-Zolder -> 232427735 232427735
- De Drij Dreven 2, Heusden-Zolder -> 232427735 232427735
DA
- Vestergade, København -> 1700503 1700503 -- It seems that Nominatim ignores the "København" part of the query. I have to include "Vesterbro" in the query to get the expected result.
DE
- Delphinstr, Berlin -> 43635991 43635991
- Delphinstr, Berlin -> 32812862 32812862
- Plauenerweg, Murrhardt -> 4294911 4294911
- Plauener weg, Murrhardt -> 4294911 4294911
- Plauener, Murrhardt -> 4294911 4294911
- Berg, Lauf -> 32475118 32475118
- Bergstraße, Lauf -> 32475118 32475118
- Bergstr, Lauf -> 32475118 32475118
- Bergstr., Lauf -> 32475118 32475118
- Berg straße, Lauf -> 32475118 32475118
- Berg str, Lauf -> 32475118 32475118
- Berg str., Lauf -> 32475118 32475118
- Berg-straße, Lauf -> 32475118 32475118
- Berg-str, Lauf -> 32475118 32475118
- Berg-str., Lauf -> 32475118 32475118
- Berg, Lauf -> 33050250 33050250
- Bergstraße, Lauf -> 33050250 33050250
- Bergstr, Lauf -> 33050250 33050250
- Bergstr., Lauf -> 33050250 33050250
- Berg straße, Lauf -> 33050250 33050250
- Berg str, Lauf -> 33050250 33050250
- Berg str., Lauf -> 33050250 33050250
- Berg-straße, Lauf -> 33050250 33050250
- Berg-str, Lauf -> 33050250 33050250
- Berg-str., Lauf -> 33050250 33050250
- buschallee/suermondstr -> 98883766 98883766
- Obere, Lauf -> 33050250 33050250
- Obere Berg, Lauf -> 33050250 33050250
- Mauer, Lauf -> 35009632 35009632
- Dreiländerhalle, Passau -> 8058271 8058271
- Rheinstraße 33, München -> 31367396 31367396
- Rheinstrasse 33, München -> 31367396 31367396
- Clausthal -> 27073775 27073775
- Büchelberg -> 320945497 320945497
- Wörth -> 240052366 240052366
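The "Bergstraße" variants above (with and without abbreviation, space, hyphen, or ß) are all expected to find the same way. One way to picture that expectation is a normalisation step that maps every spelling to a single key. The sketch below is an illustration only and makes no claim about how Nominatim handles abbreviations; `normalize_street` is a hypothetical name. It does not cover the shortened queries such as "Berg, Lauf" or "Plauener, Murrhardt".

```python
import re


def normalize_street(name: str) -> str:
    """Map German "-straße" spellings to one comparison key."""
    key = name.casefold()  # casefold() also turns "ß" into "ss"
    # Join a trailing "str", "str." or "strasse" to the preceding word,
    # dropping any separating spaces or hyphens.
    return re.sub(r"[\s\-]*str(?:asse)?\.?$", "strasse", key)


if __name__ == "__main__":
    variants = [
        "Bergstraße", "Bergstr", "Bergstr.",
        "Berg straße", "Berg str", "Berg str.",
        "Berg-straße", "Berg-str", "Berg-str.",
    ]
    print({normalize_street(v) for v in variants})
```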
ES
- La Arena, Vizcaya -> 302232463 302232463
For several months, most villages in the region of Cantabria, Spain, have been associated with the Basque village of La Arena. For example, if we look up "Reinosa", we see the result:
Reinosa, La Arena, Cantabria, 39200, Spain, Europe
The correct result is:
Reinosa, Cantabria, 39200, Spain, Europe
The origin of the problem was an incorrect tagging of the locality "La Arena" as place=county, which I corrected long ago. However, the relation has not been removed from the Nominatim index, so the error keeps appearing in searches (La Arena is still the parent of most of Cantabria [2]).
I believe this mistake may also be the reason why cities such as Bilbao, in the Basque Country, appear in Cantabria [3]. --Tony Rotondas 16:12, 29 June 2011 (BST)
- Residencia Universitaria Miguel de Unamuno -> 724184816 724184816
It's tagged as building=dormitory, but in the search results it appears as "yes".
- Alfons Sola, Caldes Montbui -> 108411811 108411811 - Names should be split around the apostrophe (') character. In French and Catalan, contraction of determiners (l', n') and prepositions (d') is mandatory in most cases where the adjacent word begins (or ends) with a vowel (or h + vowel). Because tagged names are not split, a huge number of items cannot be found. This serious issue has already been reported on trac.osm.org, but the tickets fell into oblivion: https://trac.openstreetmap.org/ticket/3604 https://trac.openstreetmap.org/ticket/3961
- Carrer d'Avelí Xalabarder, Caldes Montbui -> 59311053 59311053 - Another issue with the Catalan language: http://en.wikipedia.org/wiki/Interpunct#Catalan . People often don't know the correct spelling of a word that contains an interpunct. L·L should match any of L, LL, or L·L.
- Llorens Artigas, Caldes Montbui -> 111823171 111823171 - Another issue with the Catalan language: http://en.wikipedia.org/wiki/%C3%87 . Ç can be visually confused with C and aurally confused with S. Therefore Ç should match any of Ç, C or S.
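The three Catalan requests above (split at apostrophes, let L·L match L, LL or L·L, let Ç match Ç, C or S) can be sketched as follows. This is an illustration of the requested matching only, not existing Nominatim behaviour; `tokenize` and `spelling_variants` are hypothetical names, and the tagged spellings "Avel·lí" and "Llorenç" used in the demo are assumptions inferred from the comments above.

```python
import itertools
import re


def tokenize(name: str):
    """Split a name on whitespace and on straight or typographic apostrophes."""
    return [t for t in re.split(r"[\s'’]+", name.casefold()) if t]


def spelling_variants(token: str):
    """Return every spelling a tagged token should match.

    Each "l·l" may be typed as "l·l", "ll" or "l";
    each "ç" may be typed as "ç", "c" or "s".
    """
    options = []
    # The capturing group keeps the separators in the split result.
    for piece in re.split(r"(l·l|ç)", token.casefold()):
        if piece == "l·l":
            options.append(["l·l", "ll", "l"])
        elif piece == "ç":
            options.append(["ç", "c", "s"])
        else:
            options.append([piece])
    return {"".join(combo) for combo in itertools.product(*options)}


if __name__ == "__main__":
    print(tokenize("Carrer d'Avelí Xalabarder"))
    print(spelling_variants("Avel·lí"))   # assumed tagged spelling
    print(spelling_variants("Llorenç"))   # assumed tagged spelling
```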
FI
- Itä-Pakila, Helsinki -> 189628 189628
- Itäpakila, Helsinki -> 189628 189628
- Manttaalitie, Helsinki -> 6007342 6007342, 24398212 24398212, 24398214 24398214
- Manttaalitie 32, Helsinki -> 28989986 28989986
- Manttaalitie 30A, Helsinki -> 30930931 30930931
- Manttaalitie 30, Helsinki -> 30930931 30930931
- 30B exists only in city plans, the house has a number on a different street, so the 30A should be returned
- Äijänpolku 10, Helsinki -> 446832 446832
- Äijänpolku 10 C, Helsinki -> 446832 446832, or 26121668 26121668
- flat C, should return either the multipolygon of the two buildings, or the correct building (the flat has a ref tag on the entrance)
- Oksasenkatu 1b, Helsinki -> 338322467 338322467
- Oksasenkatu 1b A, Helsinki -> 338322467 338322467
- Oksasenkatu 1b A 1, Helsinki -> 338322467 338322467
- housenumber is 1b, staircase A, flatnumber 1. Currently there's just a node for the building (the polygon is shared for multiple adjoining buildings)
FR
- Ille-et-Vilaine, Bretagne -> 7465 7465
- This département (admin_level=6) is currently placed in the wrong region (admin_level=4): "Pays de la Loire" instead of "Bretagne", even though the region itself is tagged. This in turn affects all other cities (e.g. Rennes, the main city of Bretagne and tagged as its admin_centre, is located in Pays de la Loire) and the subdivisions of this département (its 3 arrondissements, the EPCIs...).
- The bug is apparently caused by the fact that the département is not exactly included within the region. The region's stored geometry is still missing some details that need to be fixed; this is a long task that has to be checked several times, for example when new islands are added or coastal details are edited and added locally.
- In that case (no strict inclusion), Nominatim attempts to locate the region according to the "nearest" feature by computing a distance (between centroids?). It finds that the centroid of Ille-et-Vilaine (near Rennes) is nearer to the centroid of "Pays de la Loire" (near Angers) than to the centroid of "Bretagne" (about midway between Carhaix-Plouguer and Ploërmel). The two distances differ by only a few kilometres, but "Pays de la Loire" still wins.
- Nominatim would better handle the absence of a strict inclusion by computing the area of intersection between the candidate boundaries and selecting the region with the largest intersection. In this case, it would find absolutely NO intersection between Ille-et-Vilaine and Pays de la Loire, or something very near 0 km² if there are small artefacts.
- This problem could have been avoided by making sure that ALL parts of an administrative subdivision are listed in the containing subdivision. But here the missing parts are small islands (or islets and rocks), which may be added to the smaller area at any time without being reported in the larger one. Because of server load problems, it may be difficult to add them immediately, as the larger region has a lot of defining ways and nodes. The problem can also occur when a smaller subdivision maps small enclaves of another similar subdivision that does not belong to the containing one.
- But maybe Nominatim is just caching the computed centroids of areas it has already processed, to avoid having to cache all nodes and ways locally when it searches for neighbouring regions. What it does is a heuristic, not an algorithm, and this causes such errors. In any case, instead of caching centroids, it could use the bounding box as an approximation of the area, which would produce errors much less often.
- So, where there is doubt about which area contains another, the containment relations should be logged and verified manually. Such cases are often easy for a human to verify simply by reading the existing tags, even without checking the exact geometries. Checking geometries is costly in terms of server load: when all the nodes and ways are requested, the server may reject the query or require smaller requests that do not load all ways and nodes at once.
- Nantes, Loire-Inférieure -> 59874 59874
- École 42 (or "school 42" or "42, France" or just "42") -> 3957506 3957506
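The Ille-et-Vilaine comments above contrast two ways of choosing a parent region: nearest centroid versus largest intersection area. The toy sketch below shows how the two can disagree. It uses axis-aligned rectangles as stand-ins for real boundaries, all coordinates are made up, and it does not claim to reproduce what Nominatim actually computes (the comment above is itself a guess); `parent_by_centroid` and `parent_by_overlap` are hypothetical names.

```python
from math import hypot

# A box is (min_x, min_y, max_x, max_y); a crude stand-in for a boundary polygon.


def centroid(box):
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def overlap_area(a, b):
    """Area of the intersection of two boxes (0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def parent_by_centroid(child, candidates):
    """Pick the candidate whose centroid is nearest to the child's centroid."""
    cx, cy = centroid(child)

    def distance(name):
        px, py = centroid(candidates[name])
        return hypot(px - cx, py - cy)

    return min(candidates, key=distance)


def parent_by_overlap(child, candidates):
    """Pick the candidate sharing the largest area with the child."""
    return max(candidates, key=lambda name: overlap_area(child, candidates[name]))


if __name__ == "__main__":
    child = (0.0, 0.0, 4.0, 4.0)
    candidates = {
        # Large region that almost, but not strictly, contains the child.
        "large_region": (-20.0, 0.0, 3.9, 4.0),
        # Compact neighbour that only touches the child along one edge.
        "neighbour": (4.0, 0.0, 10.0, 4.0),
    }
    print(parent_by_centroid(child, candidates))
    print(parent_by_overlap(child, candidates))
```

With these made-up boxes the centroid rule picks the neighbour, while the overlap rule picks the large region, which is the kind of disagreement described above.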
GB
- Sheffield -> 422162 422162
- Dronfield station, gb -> 370186628 370186628
- Avon Close, Dronfield -> 32845608 32845608
- Channel Tunnel -> 2147197 2147197
HR
- 7 Trg Ivana Kukuljevića, Zagreb -> 538819653 538819653 -- There is no street with that name, just housenumbers with addr:street=*
IE
- Talbot's Inch -> 133104956 133104956
IT
- Via Garibaldi,Livorno -> 92241010 92241010 -- the name of the way is "Via Giuseppe Garibaldi" where "Giuseppe" is the first name and is usually omitted in a user query.
- 520, Cannaregio -> 1068080535 1068080535 -- Venice addresses should use place:neighbourhood or addr:neighbourhood, and not the street name (they are uncorrelated). Same thing with "520, Cannaregio, Venezia". Instead of "Quartiere" there should be "Sestiere" (Venice is a unique case?)
JA
- 福岡 -> 331385074 331385074 -- The English name is "Fukuoka", which does seem to be found. The name tag is "福岡 (Fukuoka)", which is the standard format for Japanese name tags.
- 親富孝通り -> 43105756 43105756 -- A road in Fukuoka city. The name tag is "親富孝通り (Oyafuko-dori)". The Japanese name consists of 2 "words" ("親富孝/oyafuko" and "通り/dori/street"), with the second "word" being a mixture of kanji ("通/do") and hiragana ("り/ri").
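Both Japanese cases above use name tags of the form "native (romanised)". If each half were indexed as its own search key, a query for the native part alone could match. The sketch below only illustrates that idea under this assumption; `split_name_tag` is a hypothetical helper, not part of Nominatim.

```python
import re

# "native text (romanised text)" with the parenthesised part at the very end.
NAME_RE = re.compile(r"^(?P<native>.+?)\s*\((?P<roman>[^()]+)\)\s*$")


def split_name_tag(name: str):
    """Return (native, romanised); romanised is None if there is no suffix."""
    m = NAME_RE.match(name)
    if m is None:
        return name.strip(), None
    return m.group("native"), m.group("roman").strip()


if __name__ == "__main__":
    print(split_name_tag("福岡 (Fukuoka)"))
    print(split_name_tag("親富孝通り (Oyafuko-dori)"))
```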
NL
- 1 Watermolenpad, Culemborg -> 52534995 52534995 -- Doesn't use addr:street because street is a Cycleway
- 8 Standerdmolenpad, Culemborg -> 68827966 68827966 -- Cycleway problem again
- 1 Oudaen, Eindhoven -> 108007140 108007140 -- Search result on openstreetmap.org points at 1 Drakenstein (108007160 108007160)
PL
- Kraków, D17 -> 181823531 181823531
- Poland, AGH -> 157902552 157902552
- Kraków, AGH -> 157902552 157902552
- AGH -> 157902552 157902552
- AGH, A0 -> {{relation|3111004}}
- Świętego Idziego, Kraków -> 201589921 201589921
RU
- 76, Большая Садовая улица, Ростов-на-Дону -> 1084709 1084709
- Тула should _not_ find Tulle, FR (104501 104501)
TT
- 13 Carlos St, Port Of Spain -> 37412953 37412953 -- A way is returned, but there is a node (1318085432 1318085432) with addr:house=13/addr:street=CARLOS ST/addr:city=PORT OF SPAIN that should have been returned instead. This consistently fails in Trinidad, while the same node structure in Canada consistently works.
US
- 2001 North Fuller Avenue, Los Angeles, CA -> 421224631 421224631
- 214 Biittig Road, Averill Park, NY US -> 1043772820 1043772820
- 1005 W Burnside -> 392913040 392913040 -- it works if you add just about any disambiguation ("St", "OR", "Portland"), but without one it gives locations in Scotland. Adding "US" as the disambiguation also suggests places in Virginia which are not named Burnside; adding "USA" or "America" gives no results at all.
- Hammond, LA -> 151337978 151337978 -- Spelling out "Louisiana" gives the correct result first; using "LA" returns a result in Indiana first