Tuesday, December 16, 2014

A picture is worth a thousand (coherent) words: building a natural description of images

A picture is worth a thousand (coherent) words: building a natural description of images. Google Research Blog.
Google has developed a machine-learning system that automatically produces captions that accurately describe images the first time it sees them. Describing a complex scene requires a deeper representation of what’s going on in it: the system must capture how the various objects relate to one another and translate it all into natural-sounding language. The full paper is titled "Show and Tell: A Neural Image Caption Generator".
