Import/Milwaukee County, Wisconsin addresses
Goals
To add the vast majority of addresses in Milwaukee County, WI to OpenStreetMap without creating duplicates. This import is completed.
Progress/Schedule
The import should start in late 2022-early 2023 and last 1-2 months, depending on interest from local mappers.
- Data processing was done in late 2022.
- All active local mappers were messaged and asked for any comments or concerns and whether they would like to participate (corporate and StreetComplete-only users were not queried).
- Feedback was positive; the only comment was to be careful when merging in areas that already have some addressing.
- An email was sent to the imports mailing list on 2022-11-13.
- The QA process was modified based on feedback.
- The import was started on 2022-11-25 via the account popball-import.
- The import was finished on 2022-12-06.
Import Data
Background
Data source site: https://gis-mclio.opendata.arcgis.com/maps/MCLIO::address-points
Data license: Public Domain (confirmed with the municipality that this is the case)
Type of license: Public Domain
Link to permission: n/a
ODbL Compliance verified: Yes
The data is of rather good quality, and the address points are directly above the buildings they represent. Therefore conflation should yield good results.
Import Type
A one time import that will be completed in many small uploads.
Data Preparation
Data Reduction & Simplification
The data was converted to OSM XML in JOSM using the OpenData plugin.
The following fields were used:
- HOUSENO
- HOUSESX
- DIR
- STREET
- PDIR
- MUNI
- UNIT (mostly deleted)
- ZIP_CODE
- ADDR_STATUS (Only used for filtering)
Fields not relevant to OSM were deleted. These are:
- OBJECTID
- SOURCE_OID
- TAXKEY
- ALT_ID
- DATE_CHANGED
- COMMENT
- SOURCE
- SOURCE_DATE
- SOURCE_ID
- FULLADDR
- ADDR_STATUS
- STREET_LN_OID
- BLDG_POLY_ID
- MAILABLE
Using JOSM, the address points were filtered. This removed points without a house number or without a street specified (house names do not exist in Milwaukee County, at least as part of the official address). In addition, the ADDR_STATUS field was used to detect addresses unwanted for import. These include utility rights of way, freeway rights of way, railroad rights of way, waterways, and vacant lots. Parking lot addresses were also filtered out (they are rarely verifiable on the ground and usually duplicate the address of the building they serve). This data may be useful in a future import, however.
Using JOSM and Python scripts, the street types (ST, AVE, etc.), directions (N, S, E, W), and post directions (like directions, but after the street name) were expanded from their abbreviations into full words (e.g. ST -> Street). The fields were converted from all caps to title case. Then, using a Python script, the street name, type, directions, and post directions were combined to get the addr:street=* field for OSM. Housenumber extensions (HOUSESX) were added onto the end of housenumbers (123 + A => 123A).
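The expansion and combination step can be sketched roughly as follows. This is not the script used for the import; the lookup tables are deliberately short, the function names are hypothetical, and it assumes that the street type is the last word of the STREET field.

```python
# Hypothetical lookup tables; a real script would cover many more street types.
STREET_TYPES = {"ST": "Street", "AVE": "Avenue", "RD": "Road",
                "BLVD": "Boulevard", "DR": "Drive"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def build_street(direction, street, post_direction=""):
    """Combine DIR, STREET and PDIR into a value for addr:street."""
    words = street.split()
    # Assumption: the street type, if present, is the last word of STREET.
    if words and words[-1].upper() in STREET_TYPES:
        words[-1] = STREET_TYPES[words[-1].upper()]
    parts = [DIRECTIONS.get(direction.strip().upper(), "")]
    # capitalize() is used instead of title() so that "76TH" becomes "76th".
    parts += [word.capitalize() for word in words]
    parts.append(DIRECTIONS.get(post_direction.strip().upper(), ""))
    return " ".join(part for part in parts if part)


def build_housenumber(houseno, suffix=""):
    """Append the HOUSESX extension directly to HOUSENO (123 + A => 123A)."""
    return f"{str(houseno).strip()}{(suffix or '').strip()}"


print(build_street("W", "BLUEMOUND RD"))
print(build_housenumber("123", "A"))
```

Simple per-word capitalization gets names such as "McKinley" wrong, so a real run would still need a manual pass over the resulting street names.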
Using JOSM, the dataset was split into tracts of approximately 5000 addresses for easier manageability.
Zip codes often contained a Zip+4 code. A dash was added to fit the standard formatting of Zip+4.
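A minimal sketch of the ZIP formatting, assuming the raw values are either 5 digits or 9 digits without a separator (the function name is hypothetical, and anything else is left untouched for manual review):

```python
import re


def format_zip(raw):
    """Return a 5-digit ZIP or a ZIP+4 in the 12345-6789 form."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes and other non-digits
    if len(digits) == 9:
        return f"{digits[:5]}-{digits[5:]}"
    if len(digits) == 5:
        return digits
    return raw.strip()  # unexpected length: keep as-is so it can be checked by hand


print(format_zip("532261234"))
```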
Due to a bug in the JOSM OpenData import plugin, addresses "stacked" on one point were reduced to only one address in the stack. The additional addresses were added back by reading the CSV with a Python script and editing the OSM files. Stacked addresses had their housenumbers concatenated with commas unless they were on different streets, in which case a new address node was added.
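The merging rule for stacked addresses could look something like the sketch below. The row keys (lat, lon, street, housenumber) are hypothetical stand-ins for whatever the CSV reader produces, and the actual script edited the OSM files rather than returning a list.

```python
from collections import defaultdict


def merge_stacked(rows):
    """Merge rows sharing a point and street; different streets stay separate nodes."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["lat"], row["lon"], row["street"])
        if row["housenumber"] not in groups[key]:  # skip exact repeats
            groups[key].append(row["housenumber"])
    return [
        {"lat": lat, "lon": lon, "street": street,
         "housenumber": ",".join(numbers)}
        for (lat, lon, street), numbers in groups.items()
    ]


rows = [
    {"lat": 43.03, "lon": -88.0, "street": "North 76th Street", "housenumber": "170"},
    {"lat": 43.03, "lon": -88.0, "street": "North 76th Street", "housenumber": "170A"},
    {"lat": 43.03, "lon": -88.0, "street": "West Bluemound Road", "housenumber": "7600"},
]
for node in merge_stacked(rows):
    print(node["housenumber"], node["street"])
```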
Unit numbers were removed for the most part when they corresponded to units in the same building. This helps simplify the merging process and should lead to more consistent results.
Duplicate addresses were detected with JOSM and cleaned up (after units were removed).
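Duplicate detection for this import was done in JOSM. For anyone who prefers to check a tract file outside JOSM, a rough Python equivalent is sketched here; the function name is hypothetical, and it assumes that two nodes are duplicates when housenumber, street, and city all match.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_addresses(osm_xml):
    """Return (housenumber, street, city) tuples that occur on more than one node."""
    root = ET.fromstring(osm_xml)
    counts = Counter()
    for node in root.iter("node"):
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        housenumber = tags.get("addr:housenumber")
        street = tags.get("addr:street")
        if housenumber and street:
            counts[(housenumber, street, tags.get("addr:city", ""))] += 1
    return sorted(key for key, count in counts.items() if count > 1)
```

Some hits will be legitimate, for example housing complexes where several buildings share one housenumber, so the output is a list to review rather than a list to delete.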
A few fixme=* tags were added. This was done where the dataset included the same housenumber for a number of buildings (this is accurate in some housing complexes) but it would be desirable for the exact unit numbers to be captured on the ground.
In cases where a single building has a large number of addresses (more than 10, for example), the addresses were replaced with address interpolation (very long addr:housenumber=* values are undesirable).
Tagging Plans
addr:housenumber=*, addr:street=*, addr:city=*, addr:postcode=*, and addr:state=* will be used on each address point. addr:unit=* will be added where available and appropriate. No source tags will be used on the addresses.
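As an illustration of the tagging plan, the tags for one address point could be assembled as below. The function name is hypothetical, and the value "WI" for addr:state=* is an assumption based on the usual US two-letter convention rather than something stated on this page.

```python
def address_tags(housenumber, street, city, postcode, unit=""):
    """Build the OSM tags for one address point; no source tag is added."""
    tags = {
        "addr:housenumber": housenumber,
        "addr:street": street,
        "addr:city": city,
        "addr:postcode": postcode,
        "addr:state": "WI",  # assumed value for Wisconsin
    }
    if unit:  # addr:unit is only added where available and appropriate
        tags["addr:unit"] = unit
    return tags


print(address_tags("123A", "West Bluemound Road", "Milwaukee", "53226-1234"))
```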
Changeset Tags
The changeset should have source=Milwaukee County LIO
Data Transformation Results
The scripts used to process the address points, alongside a sample of the processed address points (as well as the entire compressed set of points), are available on GitHub.
Data Merge Workflow
Team Approach
While getting local consensus, active local mappers will be asked if they want to participate in merging the data. If this is the case, then the processed tracts will be assigned to the mapper to import.
Workflow
Note: Do all imports via a dedicated import account.
- Open one tract in JOSM
- Within the tract, remove any address points that do not correspond to addresses according to OSM standards. This includes addresses in freeway rights of way, utility rights of way, demolished buildings, etc. (Most of these should have been removed already, but some may still remain.)
- Run JOSM validation to find any anomalies (most importantly duplicate housenumbers).
- Manually conflate any non-building addresses with areas. This includes things like cemeteries, parks, etc. Also manually conflate any buildings which are multipolygons.
- Run conflation using the JOSM plugin to find matches. The 'subject' of the conflation should be any building=* as well as any points with addr:housenumber already filled in (to avoid duplication)
- Review the address nodes which did not match with anything
- If a building has multiple address nodes, undo the match that automatic conflation made for it and keep the nodes within the building.
- Manually match buildings which automatic conflation missed
- Delete address nodes which don't refer to objects on the ground anymore. Typically this will happen if a building was demolished.
- Pay special attention to conflations with a large distance, as these are more likely to be faulty conflations.
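The last step, looking at long-distance matches, can also be checked outside JOSM. The sketch below is only an illustration: the match structure (pairs of lat/lon tuples) and the 30 m threshold are hypothetical and not taken from the conflation plugin.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def flag_distant_matches(matches, threshold_m=30.0):
    """Return matches whose address point is far from the matched building."""
    return [
        match for match in matches
        if haversine_m(*match["address"], *match["building"]) > threshold_m
    ]


matches = [
    {"address": (43.0300, -88.0000), "building": (43.0301, -88.0000)},  # about 11 m
    {"address": (43.0300, -88.0000), "building": (43.0310, -88.0000)},  # about 111 m
]
print(len(flag_distant_matches(matches)))
```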
Conflation
The JOSM conflation tool will be used to conflate the addresses with the existing buildings (The vast majority of buildings already have outlines in the county from a prior import).
Quality Assurance
JOSM address data validation was run on the dataset, and will be run with the merged data before upload. Additionally, we will run JOSM/Plugins/FixAddresses, which scans addr:street=* names and compares them with the names of the surrounding streets.
As a "sanity test" of the data, the data was conflated locally in an area where many housenumbers were tagged already (the neighborhood bounded by Bluemound Road, 76th Street, Hawley Road, and I-94). Out of 1317 address nodes, 170 conflicts had to be resolved.
- 86 had a missing address extension in OSM (e.g. 170 vs 170,170A).
- 18 were conflicts due to multiple addresses separated by commas vs using separate nodes.
- 17 conflicts were due to the street name (Blue Mound Road vs Bluemound Road; even street signs are inconsistent on this issue).
- 14 were due to the imported data node being in the wrong place. Two neighboring houses would have their addresses swapped in this case.
- 12 were due to a missing housenumber in OSM (on a house with multiple housenumbers).
- 9 were conflation conflicts due to nodes selecting the wrong matching node in the subject.
- 7 were due to incorrect data in OSM (invariably due to small typos).
- 5 were due to a missing address extension in the import dataset.
- 2 were due to a missing address in the import dataset.
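As a quick check of the figures above, the categories can be summed and compared with the total. The short category labels are paraphrases of the list; the numbers are those reported.

```python
conflicts = {
    "missing extension in OSM": 86,
    "commas vs separate nodes": 18,
    "street name spelling": 17,
    "import node in wrong place": 14,
    "missing housenumber in OSM": 12,
    "wrong node matched": 9,
    "typos in OSM": 7,
    "missing extension in import data": 5,
    "missing address in import data": 2,
}
total_nodes = 1317

total_conflicts = sum(conflicts.values())
conflict_rate = total_conflicts / total_nodes

print(total_conflicts)          # 170, matching the reported total
print(f"{conflict_rate:.1%}")   # 12.9% of address nodes needed manual resolution
```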
Tract Status
Tract File | User | Status |
---|---|---|
addresses_tract1 | watmildon | Completed |
addresses_tract2 | popball | Completed |
addresses_tract3 | popball | Completed |
addresses_tract4 | watmildon | Completed |
addresses_tract5 | watmildon | Completed |
addresses_tract6 | popball | Completed |
addresses_tract7 | watmildon | Completed |
addresses_tract8 | watmildon | Completed |
addresses_tract9 | Popball | Completed |
addresses_tract10 | Popball | Completed |
addresses_tract11 | Popball | Completed |
addresses_tract12 | Popball | Completed |
addresses_tract13 | Popball | Completed |
addresses_tract14 | Popball | Completed |
addresses_tract15 | Popball | Completed |
addresses_tract16 | Popball | Completed |
addresses_tract17 | watmildon | Completed |
addresses_tract18 | watmildon | Completed |
addresses_tract19 | watmildon | Completed |
addresses_tract20 | watmildon | Completed |
addresses_tract21 | Popball | Completed |
addresses_tract22 | watmildon | Completed |
addresses_tract23 | watmildon | Completed |
addresses_tract24 | watmildon | Completed |
addresses_tract25 | Popball | Completed |
addresses_tract26 | watmildon | Completed |
addresses_tract27 | watmildon | Completed |
addresses_tract28 | watmildon | Completed |
addresses_tract29 | watmildon | Completed |
addresses_tract30 | watmildon | Completed |
addresses_tract31 | watmildon | Completed |
addresses_tract32 | watmildon | Completed |
addresses_tract33 | watmildon | Completed |
addresses_tract34 | watmildon | Completed |
addresses_tract35 | watmildon | Completed |
addresses_tract36 | watmildon | Completed |
addresses_tract37 | watmildon | Completed |
addresses_tract38 | watmildon | Completed |
addresses_tract39 | watmildon | Completed |
addresses_tract40 | watmildon | Completed |
addresses_tract41 | Popball | Completed |
addresses_tract42 | watmildon | Completed |
addresses_tract43 | Popball | Completed |
addresses_tract44 | Popball | Completed |
addresses_tract45 | Popball | Completed |
addresses_tract46 | Popball | Completed |
addresses_tract47 | Popball | Completed |
addresses_tract48 | Popball | Completed |
addresses_tract49 | Popball | Completed |
addresses_tract50 | Popball | Completed |
addresses_tract51 | Popball | Completed |
addresses_tract52 | Popball | Completed |
addresses_tract53 | Popball | Completed |
addresses_tract54 | Popball | Completed |
addresses_tract55 | Popball | Completed |
addresses_tract56 | Popball | Completed |
addresses_tract57 | Popball | Completed |
addresses_tract58 | Popball | Completed |
addresses_tract59 | Popball | Completed |
addresses_tract60 | Popball | Completed |
addresses_tract61 | Popball | Completed |
addresses_tract62 | Popball | Completed |
addresses_tract63 | Popball | Completed |
addresses_tract64 | Popball | Completed |
addresses_tract65 | Popball | Completed |
addresses_tract66 | Popball | Completed |
addresses_tract67 | popball | Completed |
addresses_tract68 | Popball | Completed |
addresses_tract69 | popball | Completed |
addresses_tract70 | Popball | Completed |
addresses_tract71 | Popball | Completed |
addresses_tract72 | Popball | Completed |
addresses_tract73 | Popball | Completed |
addresses_tract74 | Popball | Completed |