In an interesting report on the official Google blog, A New Landmark in Computer Vision, engineers described a research project to discover 5,000 primary landmarks in the world and then develop an algorithm to recognize and index images of those landmarks (pdf here).
Noting the “explosion of personal digital photography ... and the phenomenal growth of landmark photo sharing in many websites like Picasa,” they say “the time has come [to] build a landmark recognition engine, on the scale of the entire globe.”
This is of course what travel photographers and specialized stock agencies have been doing for years.
What do the Google engineers have in mind? First, working from 20 million images from Picasa and Panoramio, they used GPS tags to identify frequently photographed places. Then they analyzed text classifications in an online travel guide, Wikitravel, to discover well-known landmarks and matched the results with text descriptions in the image collection. Finally, they matched the images with each other, looking for visual similarity in subject matter and removing mismatches such as maps.
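To make the first step concrete, here is a minimal Python sketch of one way to find frequently photographed places from GPS-tagged photos. It is an illustration only, not the method the Google engineers used: it simply bins coordinates into a fixed grid and keeps the busiest cells. The function names, the 0.01-degree cell size, and the sample coordinates are all assumptions made up for this example.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

Coord = Tuple[float, float]   # (latitude, longitude) in degrees
Cell = Tuple[int, int]        # integer grid cell index


def grid_cell(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Map a coordinate to a grid cell cell_deg degrees on a side (assumed size)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def frequent_places(
    photos: Iterable[Coord],
    min_photos: int = 3,
    cell_deg: float = 0.01,
) -> List[Tuple[Cell, int]]:
    """Return (cell, photo_count) pairs with at least min_photos photos,
    ordered from most to least photographed."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in photos)
    return [(cell, n) for cell, n in counts.most_common() if n >= min_photos]


if __name__ == "__main__":
    # Hypothetical sample: three photos near one spot in Paris, one in London.
    sample = [
        (48.8584, 2.2945),
        (48.8583, 2.2944),
        (48.8586, 2.2946),
        (51.5007, -0.1246),
    ]
    for cell, count in frequent_places(sample):
        print(cell, count)
```

A fixed grid is the crudest possible choice: a landmark that straddles a cell boundary gets split in two, and cells cover less ground as you move away from the equator. A real system would more likely use a proper clustering algorithm, but the idea is the same: many photos in one small area suggest something worth photographing.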
The result is a mosaic of images of each landmark, using thousands of images by different photographers, taken on different days, in different weather and lighting conditions, from all points of view. Already this user-generated content is being shown in Google Maps in conjunction with Street View.
Another sample of images of Paris in Street View is at SearchEngineWatch.
Where is this headed? By automatically recognizing any image or video clip of a world landmark from visual information alone, Google can create a worldwide database to support tour guide recommendations.
“This engine is not only to visually recognize the presence of certain landmarks in an image, but also contributes to a worldwide landmark database that organizes and indexes landmarks, in terms of geographical locations, popularities, cultural values and social functions, etc. Such an earth-scale landmark recognition engine is tremendously useful for many vision and multimedia applications. First, by capturing the visual characteristics of landmarks, the engine can provide clean landmark images for building virtual tourism of a large number of landmarks. Second, by recognizing landmarks, the engine can facilitate both content understanding and geo-location detection of images and videos. Third, by geographically organizing landmarks, the engine can facilitate an intuitive geographic exploration and navigation of landmarks in a local area, so as to provide tour guide recommendation and visualization.”
Similar work is being done at the University of Washington by the team that helped develop Microsoft’s Photosynth. Their newest project, Finding Paths through the World’s Photos, focuses on navigating paths through landmarks, so that the user can get a continuous view amalgamated from a mosaic of millions of images with smooth transitions from point to point.
All this promises end users a rich visual experience of the landmarks of the world.
But where does it leave classical travel photography, the millions of carefully crafted images by professional photographers?
More on that in the next post.