Title | bench4gis: Benchmarking Privacy-aware Geocoding with Open Big Data |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Harris, Daniel R., Delcher, Chris |
Conference Name | 2019 IEEE International Conference on Big Data (Big Data) |
Keywords | bench4gis, Benchmark testing, Big Data, Big Data Applications, big data privacy, data privacy, external transmission, geographic coordinates, geographic data, geographic information systems, Geospatial analysis, healthcare data, human factors, institutional regulations, Law, Medical services, Metrics, open big data sets, Open Source Software, organization, patient privacy laws, privacy concerns, privacy-aware geocoding solutions, pubcrawl, quality assurance, Resiliency, Scalability, sensitive data, surrogate data, viable geocoding strategies |
Abstract | Geocoding, the process of translating addresses to geographic coordinates, is a relatively straightforward and well-studied process, but limitations due to privacy concerns may restrict usage of geographic data. The impact of these limitations is further compounded by the scale of the data, which in turn limits viable geocoding strategies. For example, healthcare data is protected by patient privacy laws in addition to possible institutional regulations that restrict external transmission and sharing of data. This results in the implementation of "in-house" geocoding solutions where data is processed behind an organization's firewall; quality assurance for these implementations is problematic because sensitive data cannot be used to externally validate results. In this paper, we present our software framework called bench4gis, which benchmarks privacy-aware geocoding solutions by leveraging open big data as surrogate data for quality assurance; the scale of open big data sets for address data can ensure that results are geographically meaningful for the locale of the implementing institution. |
DOI | 10.1109/BigData47090.2019.9006234 |
Citation Key | harris_bench4gis_2019 |