Mechanical Edits/Mateusz Konieczny - bot account/fix many obvious typos

Page content created as advised on Automated_Edits_code_of_conduct#Document_and_discuss_your_plans.

Who

I, Mateusz Konieczny, using my bot account.

Contact

Message me via OSM. I will also respond to PMs sent to the bot account. In both cases I will be notified about incoming PMs via email and via notifications in OSM editors.

Why

Why is it useful? It helps newbies avoid becoming confused and protects against such values becoming established, without the drudgery that manual cleanup would require. It also makes it easier to add missing values and easier to use OpenStreetMap data, including support in editors that explain or translate the meaning of surface values.

Why an automatic edit? I have a massive queue (thousands to tens of thousands) of automatically detectable issues that are not reported by mainstream validators and that require fixes, where the fix requires review or complete manual cleanup.

There is no point in manual drudgery here, where the values are obviously fixable.

These values do NOT require manual review. If these cases turn out to be a useful signal of invalid editing, then I will keep reviewing areas near where the bot edited.

I have already skipped edits to primary tags, except for a few blatant cases where the mistake is easy to miss (flowerbed was not rendered until recently). Typos in primary tags that cause objects to be outright missing from typical map rendering are often coupled with other serious problems, probably because they indicate mapping by newbies who are likely to be confused by other complexities as well. The same goes for access tags, which I will keep fixing manually. Typos in, for example, shop values are safe to fix automatically, probably because their effects are less noticeable. More obvious typos in rare, typically unrendered amenity tags are also often safe to fix.

Yes, the bot edit WILL cause objects to be edited. Nevertheless, map data quality will improve as a result.

These values were found automatically based on taginfo and iD presets, the latter also accessed via taginfo.

Taginfo value statistics list the values present in the OSM database, while iD presets list which values are known for given keys.

Multiple heuristics were applied to find various typos. For example, "cuisine=bubble tea" was found to match "cuisine=bubble_tea" from iD presets after the space was replaced by an underscore.

"cuisine=Thai" matched "cuisine=thai" after lowercasing the value.

"cuisine=regional1" matched "cuisine=regional" after dropping the ending.

All values were then reviewed manually to drop any dubious replacement (for example, healthcare=nursery to healthcare=nurse was skipped).

Samples were also checked in OSM, with many values simply edited by hand. Note that not every replacement was sampled: many have just a few cases, so sampling and verifying would amount to editing all of them manually.

If you see any values where the edit would be dubious, unsafe, or in any way problematic, let me know.
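The automated matching step described above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the `KNOWN_VALUES` set, the function names, and the exact rule for "dropping the ending" (here, stripping trailing digits and punctuation) are assumptions made for the example.

```python
import re

# Hypothetical subset of values known to iD presets for the "cuisine" key.
KNOWN_VALUES = {"bubble_tea", "thai", "regional"}


def candidate_fixes(value):
    """Yield simplified variants of a value, one per heuristic."""
    yield value.replace(" ", "_")            # "bubble tea" -> "bubble_tea"
    yield value.lower()                      # "Thai" -> "thai"
    yield re.sub(r"[\d;,.]+$", "", value)    # "regional1" -> "regional" (assumed rule)


def suggest_replacement(value, known_values=KNOWN_VALUES):
    """Return a known value matching a simplified variant, or None."""
    if value in known_values:
        return None  # already a known value, nothing to fix
    for candidate in candidate_fixes(value):
        if candidate in known_values:
            return candidate
    return None  # no obvious replacement; leave for manual handling


for raw in ["bubble tea", "Thai", "regional1", "pizza"]:
    print(raw, "->", suggest_replacement(raw))
```

Any suggestion produced this way is only a candidate: as noted above, every proposed replacement was still reviewed manually before being accepted.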

(BTW, one typo in iD presets was found while looking for typos, see https://github.com/openstreetmap/id-tagging-schema/pull/1063 )

(Also, a bug in the https://taginfo.openstreetmap.org/projects/historic_place#tags listing was found thanks to additional review, see https://www.facebook.com/groups/historic.place/permalink/2715780891914218/ ; it was fixed by the maintainers of that project.)

Some conversions were found manually in addition to those from iD presets; currently this is only cycleway:both.

I have also already contacted the community in some cases (such as sport values with ; and some cases of extra trailing characters) via changeset comments and notes.

Responses confirmed that these changes are a good idea, and that just editing will be better than asking more people.

Numbers

The number depends on how many new objects appear, which in turn depends on editing activity in OSM.

The #What section lists the number of affected objects for each value as of 2024-03-03. None is used more than 1000 times.

How

state before a mechanical edit (example for a tunnel value):

state after an edit:

With the changeset comment "fixing unusual tunnel values with a clear replacement". The bot changeset is also tagged with tags that mark it as automatic, link to the discussion approving the edit, link to the repository with the source code, etc.

Discussion

Repetition

This is a recurring edit and may be made as soon as new matching elements appear. At this moment, triggering a new edit requires human intervention, so the exact schedule is not predictable and the bot may stop running at any moment.

This can change in the future. If the bot is abandoned and does not run, feel free to ping me. If I am unable to run it any more, feel free to use my code. Note that this may require going through the bot approval process again and that the code is under a specific license.

https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/recurrent_bot_edits may have a more up-to-date version of the code than what is listed on this page.

What

Tags where the value has a clear and obvious replacement. None is used more than 1000 times as of 2024-03-03. The total transformation count was 4395 on that date, though note that one object may be affected by more than one rule. Counts are listed as of 2024-03-03.
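To illustrate why the transformation count can exceed the number of edited objects, here is a minimal sketch. The rules and sample objects are hypothetical and chosen only for the example; they are not taken from the actual rule list.

```python
# Hypothetical rules: (key, mistyped value, replacement).
RULES = [
    ("cuisine", "Thai", "thai"),
    ("shop", "Bakery", "bakery"),
]


def count_transformations(objects, rules):
    """Return (total rule matches, number of objects with at least one match)."""
    transformations = 0
    affected = 0
    for tags in objects:
        hits = sum(1 for key, bad, _good in rules if tags.get(key) == bad)
        transformations += hits
        if hits:
            affected += 1
    return transformations, affected


# Hypothetical objects, each represented by its tag dictionary.
objects = [
    {"cuisine": "Thai", "shop": "Bakery"},  # matched by two rules
    {"cuisine": "Thai"},                    # matched by one rule
    {"cuisine": "thai"},                    # already correct
]

print(count_transformations(objects, RULES))
```

In this sample, three transformations apply to only two objects, which is the same effect that makes the total of 4395 an upper bound on the number of affected objects.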

Removed values, after approval

;q suffixes - see https://www.openstreetmap.org/changeset/148995504

Opt-out

Please write at https://community.openstreetmap.org/ in the thread where the discussion has taken place.

See #Discussion

See also