Research teams at Google and Stanford University, working independently, have announced advances in artificial intelligence technology for recognizing and describing the contents of videos and photographs. The researchers from the Stanford Artificial Intelligence Laboratory and the Google Brain project used a similar technique: pairing two neural networks, one trained to recognize images and the other to recognize human language. They trained the networks by showing them a selection of images paired with descriptions written by humans. After training on these relatively small data sets, the networks were shown unfamiliar images and asked to describe them, which they did with roughly double the accuracy of previous computer-vision systems. Researchers say the most impressive aspect of the two efforts is that the computers learned to identify actions in an image, not just the objects depicted. The technologies could have near-term applications in searching and archiving digital images, and longer-term applications in robot navigation and visual aids for the blind. Although the advances are impressive, experts say the systems remain far from human perceptual capabilities. IBM researcher John R. Smith notes computer-vision technology is still a long way from a real "understanding" of the content of images.
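The pairing described above can be sketched at a toy scale: an image network reduces a picture to a feature vector, and a language network scores words against those features to produce a caption. This is only an illustrative sketch, not the researchers' actual models; the hand-made features, word list, and weight vectors below are all invented for the example.

```python
# Toy sketch of the paired-network idea (NOT the actual Google/Stanford
# models): a stand-in "image network" turns pixels into features, and a
# stand-in "language network" greedily picks the words whose (made-up)
# weight vectors best match those features.

def encode_image(pixels):
    """Stand-in vision network: reduce an image to a 2-number feature vector."""
    avg = sum(pixels) / len(pixels)                     # average brightness
    contrast = sum(abs(a - b) for a, b in zip(pixels, pixels[1:])) / (len(pixels) - 1)
    return [avg, contrast]

def decode_caption(features, vocab, max_words=2):
    """Stand-in language network: emit the best-scoring unused words (greedy)."""
    caption, used = [], set()
    for _ in range(max_words):
        word = max((w for w in vocab if w not in used),
                   key=lambda w: sum(f * wt for f, wt in zip(features, vocab[w])))
        used.add(word)
        caption.append(word)
    return " ".join(caption)

# Hypothetical word-weight vectors, as if learned from captioned images.
vocab = {"a": [0.1, 0.1], "bright": [1.0, 0.0],
         "dog": [0.2, 0.9], "runs": [0.0, 0.8]}
image = [0.9, 0.1, 0.9, 0.1, 0.9]   # fake high-contrast "image"
print(decode_caption(encode_image(image), vocab))   # → "dog runs"
```

In the real systems the encoder and decoder are trained jointly on human-written captions, so the word scores are learned rather than hand-set as here.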
Source: John Markoff, The New York Times (11/17/14)