OpenStreetMap, a crowdsourced geographic database, provides the only global-level, openly licensed source of geospatial road data, and the only national-level source in many countries. However, researchers, policy makers, and citizens who want to make use of OpenStreetMap (OSM) have little information about whether it can be relied upon in a particular geographic setting. In this paper, we use two complementary, independent methods to assess the completeness of OSM road data in each country in the world. First, we undertake a visual assessment of OSM data against satellite imagery, which provides the input for estimates based on a multilevel regression and poststratification model. Second, we fit sigmoid curves to the cumulative length of contributions, and use them to estimate the saturation level for each country. Both techniques may have more general use for assessing the development and saturation of crowd-sourced data. Our results show that in many places, researchers and policymakers can rely on the completeness of OSM, or will soon be able to do so. We find (i) that globally, OSM is ∼83% complete, and more than 40% of countries-including several in the developing world-have a fully mapped street network; (ii) that well-governed countries with good Internet access tend to be more complete, and that completeness has a U-shaped relationship with population density-both sparsely populated areas and dense cities are the best mapped; and (iii) that existing global datasets used by the World Bank undercount roads by more than 30%.
There is a growing interest in using OpenStreetMap [OSM] data in health research. We evaluate the usefulness of OSM data for researching the spatial availability of alcohol, a field which has been hampered by data access difficulties. We find OSM data is about 50% complete, which appears adequate for replicating findings from other studies using alcohol licensing data. Further, we show how OSM quality metrics can be used to select areas with more complete alcohol data. The ease of access and use may create opportunities for analysts and researchers seeking to understand broad patterns of alcohol availability.
The study of geographical systems as graphs, and networks has gained significant momentum in the academic literature as these systems possess measurable and relevant network properties. Crowd-based sources of data such as OpenStreetMaps (OSM) have created a wealth of worldwide geographic information including on transportation systems (e.g., road networks). In this work, we offer a Geographic Information Systems (GIS) protocol to transfer polyline data into a workable network format in the form of; a node layer, an edge layer, and a list of nodes/edges with relevant geographic information (e.g., length). Moreover, we have developed an ArcGIS tool to perform this protocol on OSM data, which we have applied to 80 urban areas in the world and made the results freely available. The tool accounts for crossover roads such as ramps and bridges. A separate tool is also made available for planar data and can be applied to any line features in ArcGIS.
Gaining access to inexpensive, high-resolution, up-to-date, three-dimensional road network data is a top priority beyond research, as such data would fuel applications in industry, governments, and the broader public alike. Road network data are openly available via user-generated content such as OpenStreetMap (OSM) but lack the resolution required for many tasks, e.g., emergency management. More importantly, however, few publicly available data offer information on elevation and slope. For most parts of the world, up-to-date digital elevation products with a resolution of less than 10 meters are a distant dream and, if available, those datasets have to be matched to the road network through an error-prone process. In this paper we present a radically different approach by deriving road network elevation data from massive amounts of in-situ observations extracted from user-contributed data from an online social fitness tracking application. While each individual observation may be of low-quality in terms of resolution and accuracy, taken together they form an accurate, high-resolution, up-to-date, three-dimensional road network that excels where other technologies such as LiDAR fail, e.g., in case of overpasses, overhangs, and so forth. In fact, the 1m spatial resolution dataset created in this research based on 350 million individual 3D location fixes has an RMSE of approximately 3.11m compared to a LiDAR-based ground-truth and can be used to enhance existing road network datasets where individual elevation fixes differ by up to 60m. In contrast, using interpolated data from the National Elevation Dataset (NED) results in 4.75m RMSE compared to the base line. We utilize Linked Data technologies to integrate the proposed high-resolution dataset with OpenStreetMap road geometries without requiring any changes to the OSM data model.
The concept of crowdsourcing is nowadays extensively used to refer to the collection of data and the generation of information by large groups of users/contributors. OpenStreetMap (OSM) is a very successful example of a crowd-sourced geospatial data project. Unfortunately, it is often the case that OSM contributor inputs (including geometry and attribute data inserts, deletions and updates) have been found to be inaccurate, incomplete, inconsistent or vague. This is due to several reasons which include: (1) many contributors with little experience or training in mapping and Geographic Information Systems (GIS); (2) not enough contributors familiar with the areas being mapped; (3) contributors having different interpretations of the attributes (tags) for specific features; (4) different levels of enthusiasm between mappers resulting in different number of tags for similar features and (5) the user-friendliness of the online user-interface where the underlying map can be viewed and edited. This paper suggests an automatic mechanism, which uses raw spatial data (trajectories of movements contributed by contributors to OSM) to minimise the uncertainty and impact of the above-mentioned issues. This approach takes the raw trajectory datasets as input and analyses them using data mining techniques. In addition, we extract some patterns and rules about the geometry and attributes of the recognised features for the purpose of insertion or editing of features in the OSM database. The underlying idea is that certain characteristics of user trajectories are directly linked to the geometry and the attributes of geographic features. Using these rules successfully results in the generation of new features with higher spatial quality which are subsequently automatically inserted into the OSM database.
This article describes a simple tool to display geophylogenies on web maps including Google Maps and OpenStreetMap. The tool reads a NEXUS format file that includes geographic information, and outputs a GeoJSON format file that can be displayed in a web map application.
- Gesundheitswesen (Bundesverband der Arzte des Offentlichen Gesundheitsdienstes (Germany))
- Published about 4 years ago
Background: Geocoding, the process of converting textual information (addresses) into geographic coordinates is increasingly used in public health/epidemiological research and practice. To date, little attention has been paid to geocoding quality and its impact on different types of spatially-related health studies. The primary aim of this study was to compare 2 freely available geocoding services (Google and OpenStreetMap) with regard to matching rate (percentage of address records capable of being geocoded) and positional accuracy (distance between geocodes and the ground truth locations). Methods: Residential addresses were geocoded by the NRW state office for information and technology and were considered as reference data (gold standard). The gold standard included the coordinates, the quality of the addresses (4 categories), and a binary urbanity indicator based on the CORINE land cover data. 2 500 addresses were randomly sampled after stratification for address quality and urbanity indicator (approximately 20 000 addresses). These address samples were geocoded using the geocoding services from Google and OSM. Results: In general, both geocoding services showed a decrease in the matching rate with decreasing address quality and urbanity. Google showed consistently a higher completeness than OSM (>93 vs. >82%). Also, the cartographic confounding between urban and rural regions was less distinct with Google’s geocoding API. Regarding the positional accuracy of the geo-coordinates, Google also showed the smallest deviations from the reference coordinates, with a median of <9 vs. <175.8 m. The cumulative density function derived from the positional accuracy showed for Google that nearly 95% and for OSM 50% of the addresses were geocoded within <50 m of their reference coordinates. Conclusion: The geocoding API from Google is superior to OSM regarding completeness and positional accuracy of the geocoded addresses. On the other hand, Google has several restrictions, such as the limitation of the requests to 2 500 addresses per 24 h and the presentation of the results exclusively on Google Maps, which may complicate the use for scientific purposes.
This paper investigates the influence of presenting volunteered and professionally created geographic information to 101 wheelchair users through an interactive website that included information collected by wheelchair-using volunteers. The aim of this experiment was to understand the influence that (1) knowing a map-based website contains volunteered information and (2) actually including volunteered information within an online interactive map (a mashup) have on the perceived trust of the user, described in terms of quality and authority. Analysis using Kruskal-Wallis showed that judgements of currency were influenced by including geo-information from untrained volunteers (volunteered geographic information) within the mashup, but not influenced by the participant being told that the online map contained volunteered information. The participants appeared to make judgements based on what information they saw, rather than what they were told about the source of the information.