Concept: Google Earth
Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 3 years ago
The United States spends more than $250 million each year on the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although the ACS is a comprehensive source of data, the lag between demographic changes and their appearance in the survey can exceed several years. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may become an increasingly practical supplement to the ACS. Here, we present a method that estimates socioeconomic characteristics of regions spanning 200 US cities by using 50 million images of street scenes gathered with Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22 million automobiles in total (8% of all automobiles in the United States), were used to accurately estimate income, race, education, and voting patterns at the zip code and precinct level. (The average US precinct contains approximately 1,000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographics may effectively complement labor-intensive approaches, with the potential to measure demographics with fine spatial resolution, in close to real time.
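The sedan-versus-pickup rule of thumb quoted above can be written as a one-line decision rule. The following is only an illustrative sketch, not the authors' model: the function name `predict_vote` and the example counts are hypothetical, and the fleet-size figure is simply the arithmetic implied by "22 million automobiles (8% of all automobiles)".

```python
def predict_vote(sedans: int, pickups: int) -> str:
    """Apply the rule of thumb from the abstract: more sedans than
    pickup trucks suggests a Democratic vote, otherwise Republican.
    (Hypothetical helper; the paper's actual models are more detailed.)"""
    return "Democrat" if sedans > pickups else "Republican"


# Fleet size implied by the abstract: 22 million cars are 8% of the total.
# Integer arithmetic avoids floating-point rounding.
counted_cars = 22_000_000
implied_total_cars = counted_cars * 100 // 8

# Made-up counts for one drive through a city, for illustration only.
example = predict_vote(sedans=120, pickups=80)
```

Under the abstract's figures, the 8% share implies roughly 275 million automobiles nationwide. The 88% and 82% values are the reported chances that the rule is right, not outputs of this sketch.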
BACKGROUND: A remote sensing technique was developed that combines a Geographic Information System (GIS), Google Earth, and Microsoft Excel to identify home locations for a random sample of households in rural Haiti. The method was used to select homes for ethnographic and water quality research in a region of rural Haiti located within 9 km of a local hospital and source of health education in Deschapelles, Haiti. The technique does not require access to governmental records or ground-based surveys to collect household location data and can be performed in a rapid, cost-effective manner. METHODS: The random selection of households and the location of these households during field surveys were accomplished using GIS, Google Earth, Microsoft Excel, and handheld Garmin GPSmap 76CSx GPS units. Homes were identified and mapped in Google Earth and exported to ArcMap 10.0; a random list of homes was then generated using Microsoft Excel and loaded onto handheld GPS units for field location. The development and use of a remote sensing method was essential to the selection and location of random households. RESULTS: A total of 537 homes were initially mapped, and a randomized subset of 96 was identified as potential survey locations. Over 96% of the homes mapped using Google Earth imagery were correctly identified as occupied dwellings. Only 3.6% of the occupants of mapped homes visited declined to be interviewed. Of the homes visited, 16.4% were not occupied at the time of the visit due to work away from the home or market days. A total of 55 households were located using this method during the 10 days of fieldwork in May and June of 2012. CONCLUSIONS: The method used to generate and field locate random homes for surveys and water sampling was an effective means of selecting random households in a rural environment lacking geolocation infrastructure.
The success rate for locating households using a handheld GPS was excellent and only rarely was local knowledge required to identify and locate households. This method provides an important technique that can be applied to other developing countries where a randomized study design is needed but infrastructure is lacking to implement more traditional participant selection methods.
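The random-selection step described in this abstract (537 mapped homes, a randomized subset of 96) can be sketched in a few lines. This is a minimal illustration under stated assumptions: Python's standard `random` module stands in for the Microsoft Excel step used in the study, and the placemark names and coordinates are made-up placeholders, not data from the paper.

```python
import random


def select_random_homes(homes, sample_size, seed=None):
    """Draw a simple random sample of mapped homes without replacement.
    A fixed seed makes the draw reproducible for later field checks."""
    rng = random.Random(seed)
    return rng.sample(homes, sample_size)


# Placeholder placemarks (name, latitude, longitude); values are invented
# purely to give the sketch something to sample from.
homes = [(f"home_{i:03d}", 19.0 + i * 1e-4, -72.5 + i * 1e-4) for i in range(537)]

# Sample sizes follow the abstract: 96 candidate homes out of 537 mapped.
subset = select_random_homes(homes, 96, seed=2012)
```

In a workflow like the one described, the resulting list would then be exported as waypoints for the handheld GPS units; that export step is not shown here.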
Dryland biomes cover two-fifths of Earth’s land surface, but their forest area is poorly known. Here, we report an estimate of global forest extent in dryland biomes, based on analyzing more than 210,000 0.5-hectare sample plots through a photo-interpretation approach using large databases of satellite imagery at (i) very high spatial resolution and (ii) very high temporal resolution, which are available through the Google Earth platform. We show that in 2015, 1327 million hectares of drylands had more than 10% tree-cover, and 1079 million hectares comprised forest. Our estimate is 40 to 47% higher than previous estimates, corresponding to 467 million hectares of forest that have never been reported before. This increases current estimates of global forest cover by at least 9%.
Objective. To derive and validate a model that accurately predicts ambulance arrival time that could be implemented as a Google Maps web application. Methods. This was a retrospective study of all scene transports in Multnomah County, Oregon, from January 1 through December 31, 2008. Scene and destination hospital addresses were converted to coordinates. ArcGIS Network Analyst was used to estimate transport times based on street network speed limits. We then created a linear regression model to improve the accuracy of these street network estimates using weather, patient characteristics, use of lights and sirens, daylight, and rush-hour intervals. The model was derived from a 50% sample and validated on the remainder. Significance of the covariates was determined by p < 0.05 for a t-test of the model coefficients. Accuracy was quantified by the proportion of estimates that were within 5 minutes of the actual transport times recorded by computer-aided dispatch. We then built a Google Maps-based web application to demonstrate application in real-world EMS operations. Results. There were 48,308 included transports. Street network estimates of transport time were accurate within 5 minutes of actual transport time less than 16% of the time. Actual transport times were longer during daylight and rush-hour intervals and shorter with use of lights and sirens. Age under 18 years, gender, wet weather, and trauma system entry were not significant predictors of transport time. Our model predicted arrival time within 5 minutes 73% of the time. For lights and sirens transports, accuracy was within 5 minutes 77% of the time. Accuracy was identical in the validation dataset. Lights and sirens saved an average of 3.1 minutes for transports under 8.8 minutes, and 5.3 minutes for longer transports. Conclusions. An estimate of transport time based only on a street network significantly underestimated transport times. 
A simple model incorporating few variables can predict ambulance time of arrival to the emergency department with good accuracy. This model could be linked to global positioning system data and an automated Google Maps web application to optimize emergency department resource use. Use of lights and sirens had a significant effect on transport times. Key words: emergency medical services; prehospital emergency care.
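The accuracy measure used in this abstract, the proportion of estimates within 5 minutes of the recorded transport time, is easy to compute. The sketch below is illustrative only: the function name `proportion_within` is hypothetical, treating "within 5 minutes" as an inclusive bound is an assumption, and the sample times are invented rather than taken from the study.

```python
def proportion_within(estimates, actuals, tolerance_min=5.0):
    """Return the fraction of estimated transport times that fall within
    `tolerance_min` minutes (inclusive, by assumption) of the actual times."""
    if len(estimates) != len(actuals):
        raise ValueError("estimates and actuals must have the same length")
    if not estimates:
        raise ValueError("at least one transport is required")
    hits = sum(1 for est, act in zip(estimates, actuals)
               if abs(est - act) <= tolerance_min)
    return hits / len(estimates)


# Invented transport times in minutes, for illustration only.
estimated = [10.0, 12.0, 20.0, 7.0]
actual = [14.0, 18.0, 21.0, 7.0]
accuracy = proportion_within(estimated, actual)
```

Applying the same function to both the raw street network estimates and the regression predictions would give the kind of comparison reported in the abstract (less than 16% versus 73%).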
Hi-C experiments study how genomes fold in 3D, generating contact maps containing features as small as 20 bp and as large as 200 Mb. Here we introduce Juicebox, a tool for exploring Hi-C and other contact map data. Juicebox allows users to zoom in and out of Hi-C maps interactively, just as a user of Google Earth might zoom in and out of a geographic map. Maps can be compared to one another, or to 1D tracks or 2D feature sets.
This article highlights applied understanding of classifying earth imaging data for land cover land use change (LCLUC) information. Compared to the many previous studies of LCLUC, the present study is innovative in that it applied geospatial data, tools and techniques for transdisciplinary research. It contributes to a wider discourse on practical decision making for multi-level governance. Undertaken as part of the BioDIVA project, the research adopted a multi-tiered methodical approach across three key dimensions: socioecology as the sphere of interest, a transdisciplinary approach as the disciplinary framework, and geospatial analysis as the applied methodology. The area of interest was the agroecosystem of Wayanad district in Kerala, India (South Asia). The methodology was structured to enable analysis of multi-scalar and multi-temporal data, using Wayanad as a case study. Three levels of analysis were included: District (Landsat TM-30m), Taluk or sub-district (ASTER-15m) and Village or Gram Panchayat (GeoEye-0.5m). Our hypothesis, that analyzing patterns of land use change is pertinent for up-to-date assessment of agroecosystem resources and their wise management, is supported by the outcome of the multi-tiered geospatial analysis. In addition, two examples from the project that highlight the adoption of LCLUC by different disciplinary experts are presented. A sociologist assessed the land ownership boundary for a selected tribal community. A faunal ecologist used it to assess the effect of landscape structure on arthropods and plant groups in rice fields. Furthermore, the Google Earth interface was used to support the overall validation process. Our key conclusion was that a multi-level understanding of the causes, effects, processes and mechanisms that govern agroecosystem transformation requires close attention to spatial, temporal and seasonal dynamics, for which the incorporation of local knowledge and participation of local communities is crucial.
Mapping species spatial distribution using spatial inference and prediction requires a lot of data. Occurrence data are generally not easily available from the literature and are very time-consuming to collect in the field. For that reason, we designed a survey to explore to what extent large-scale databases such as Google Maps and Google Street View could be used to derive valid occurrence data. We worked with the Pine Processionary Moth (PPM) Thaumetopoea pityocampa because the larvae of that moth build silk nests that are easily visible. The presence of the species at one location can therefore be inferred from visual records derived from the panoramic views available from Google Street View. We designed a standardized procedure for evaluating the presence of the PPM on a sampling grid covering the landscape under study. The outputs were compared to field data. We investigated two landscapes using grids of different extent and mesh size. Data derived from Google Street View were highly similar to field data in the large-scale analysis based on a square grid with a mesh of 16 km (96% of matching records). Using a 2 km mesh size led to a strong divergence between field and Google-derived data (46% of matching records). We conclude that Google databases might provide useful occurrence data for mapping the distribution of species whose presence can be visually evaluated, such as the PPM. However, the accuracy of the output strongly depends on the spatial scales considered and on the sampling grid used. Other factors, such as the coverage of the Google Street View network with regard to sampling grid size and the spatial distribution of host trees with regard to the road network, may also play a determining role.
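The "percentage of matching records" comparison between field data and Street View-derived data can be sketched as a cell-by-cell agreement count over a sampling grid. This is an assumed reading of the metric, not the authors' published procedure: the function name `percent_matching`, the dictionary layout keyed by grid cell, and the presence/absence values are all hypothetical.

```python
def percent_matching(field, street_view):
    """Percentage of grid cells, among those surveyed by both methods,
    where field and Street View records agree on presence/absence."""
    shared_cells = field.keys() & street_view.keys()
    if not shared_cells:
        raise ValueError("no grid cells were surveyed by both methods")
    matches = sum(1 for cell in shared_cells if field[cell] == street_view[cell])
    return 100.0 * matches / len(shared_cells)


# Invented presence (True) / absence (False) records keyed by grid cell.
field_records = {(0, 0): True, (0, 1): False, (1, 0): True, (1, 1): False}
gsv_records = {(0, 0): True, (0, 1): False, (1, 0): False, (1, 1): False}
agreement = percent_matching(field_records, gsv_records)
```

Restricting the comparison to cells covered by both methods is a design choice made here because, as the abstract notes, Street View coverage of the grid is itself a limiting factor.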
Air pollution affects billions of people worldwide, yet ambient pollution measurements are limited for much of the world. Urban air pollution concentrations vary sharply over short distances (≪1 km) owing to unevenly distributed emission sources, dilution, and physicochemical transformations. Accordingly, even where present, conventional fixed-site pollution monitoring methods lack the spatial resolution needed to characterize heterogeneous human exposures and localized pollution hotspots. Here, we demonstrate a measurement approach to reveal urban air pollution patterns at 4-5 orders of magnitude greater spatial precision than possible with current central-site ambient monitoring. We equipped Google Street View vehicles with a fast-response pollution measurement platform and repeatedly sampled every street in a 30-km² area of Oakland, CA, developing the largest urban air quality data set of its type. Resulting maps of annual daytime NO, NO2, and black carbon at 30 m-scale reveal stable, persistent pollution patterns with surprisingly sharp small-scale variability attributable to local sources, up to 5-8× within individual city blocks. Since local variation in air quality profoundly impacts public health and environmental equity, our results have important implications for how air pollution is measured and managed. If validated elsewhere, this readily scalable measurement approach could address major air quality data gaps worldwide.
The assessment of a species' habitat is a crucial issue in ecology and conservation. While the collection of habitat data has been boosted by the availability of remote sensing technologies, data on certain habitat types must still be collected through costly on-ground surveys, limiting study over large areas. Cliffs are ecosystems that provide habitat for a rich biodiversity, especially raptors. Because of their principally vertical structure, however, cliffs are not easy to study by remote sensing technologies, posing a challenge for many researchers and managers working with cliff-related biodiversity. We explore the feasibility of using Google Street View, a freely available on-line tool, to remotely identify and assess the nesting habitat of two cliff-nesting vultures (the griffon vulture and the globally endangered Egyptian vulture) in northwestern Spain. Two main uses of Google Street View for ecologists and conservation biologists were evaluated: i) remotely identifying a species' potential habitat and ii) extracting fine-scale habitat information. Google Street View imagery covered 49% (1,907 km) of the roads of our study area (7,000 km²). The potential visibility covered by on-ground surveys was significantly greater (mean: 97.4%) than that of Google Street View (48.1%). However, incorporating Google Street View into the vultures' habitat survey would save, on average, 36% in time and 49.5% in funds with respect to the on-ground survey only. The ability of Google Street View to identify cliffs (overall accuracy = 100%) outperformed the classification maps derived from digital elevation models (DEMs) (62-95%). Nonetheless, high-performance DEM maps may be useful to compensate for Google Street View coverage limitations. Through Google Street View we could examine 66% of the vultures' nesting cliffs existing in the study area (n = 148): 64% from griffon vultures and 65% from Egyptian vultures. It also allowed us to extract fine-scale features of cliffs. 
This World Wide Web-based methodology may be a useful, complementary tool to remotely map and assess the potential habitat of cliff-dependent biodiversity over large geographic areas, saving survey-related costs.
Although there is global growth in outdoor smokefree areas, little is known about the associated smokefree signage. We aimed to study smokefree signage at playgrounds and to compare field observations with images from Google Street View (GSV).