What is Google Cloud Vision?

21st March 2022

Have you noticed how computer software is getting better at identifying the subject of an image?

Billions of images are shared online every single day, so it's fair to say that the job of categorising and tagging each of those images would be tricky to do manually. So, how do search engines and social media platforms know what images to show us when we conduct a search?

Research into 'computer vision' and image recognition technology was being conducted as early as the 1960s, but recent advances in artificial intelligence and machine learning have meant huge progress in this area, not least thanks to the Google Cloud Vision API.

Google Cloud Vision won't just identify whether the subject of an image is a person, a cat or a dog, or a boat or an aeroplane - it can even figure out if a person is happy or sad, or whether an image is suitable for Google Safe Search.

How does the Google Cloud Vision API work?

The Google Cloud Vision API uses machine learning models that have been pre-trained on huge datasets of images to work out what an image contains. It classifies each image into thousands of categories, picking up on objects, places and faces, and returns each result with a confidence value.
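
To illustrate the confidence value, here is a minimal Python sketch that keeps only the labels scoring above a threshold. The `description` and `score` field names follow the API's JSON label annotations; the sample scores and the 0.8 threshold are invented for illustration and are not real API output.

```python
# Illustrative sample only: shaped like the labelAnnotations part of a
# Vision API response, but the scores here are invented for this sketch.
sample_labels = [
    {"description": "Tree", "score": 0.97},
    {"description": "Grass", "score": 0.91},
    {"description": "Boat", "score": 0.42},
]


def confident_labels(labels, threshold=0.8):
    """Return the descriptions of labels whose score meets the threshold."""
    return [item["description"] for item in labels if item["score"] >= threshold]


print(confident_labels(sample_labels))
```

Raising the threshold gives fewer but more reliable tags, while lowering it gives broader coverage with more false positives.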

Developers can leverage the Google Cloud Vision API and easily integrate image recognition capability with their software, including:

Label and entity detection that identifies the dominant object within an image. This can be used to build metadata on your image catalogue which allows for image-based search.
Optical character recognition (OCR) for understanding text within an image. Google Cloud Vision can also automatically identify a broad range of different languages.
Safe Search detection that picks up on inappropriate content in an image. This is particularly useful for crowd-sourced content.
Facial detection that picks out faces in an image, including facial features like nose, eye and mouth position. It also estimates how likely each face is to be showing emotions such as joy, sorrow, anger or surprise.
Landmark detection, along with the identification of related latitude and longitude.
Logo detection for recognisable product and brand logos within an image.
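
As a rough sketch of how the features above are requested, the Python below builds the JSON body for the API's `images:annotate` REST method. The feature type names are the API's own; the `build_annotate_request` helper, the `max_results` default and the placeholder image bytes are assumptions made for this example.

```python
import base64
import json

# Feature type names as used by the Vision API's images:annotate method.
FEATURES = [
    "LABEL_DETECTION",
    "TEXT_DETECTION",
    "SAFE_SEARCH_DETECTION",
    "FACE_DETECTION",
    "LANDMARK_DETECTION",
    "LOGO_DETECTION",
]


def build_annotate_request(image_bytes, features=FEATURES, max_results=10):
    """Build the JSON body for a single-image annotate request (hypothetical helper)."""
    # The API expects the image content as a base64-encoded string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    {"type": name, "maxResults": max_results} for name in features
                ],
            }
        ]
    }


# Placeholder bytes stand in for the contents of a real image file.
body = build_annotate_request(b"not a real image")
print(json.dumps(body, indent=2))
```

In practice this body would be sent as a POST request to `https://vision.googleapis.com/v1/images:annotate` along with your Google Cloud credentials. Requesting only the features you need keeps responses smaller.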

How Digital Asset Management (DAM) can leverage Google Cloud Vision

Image recognition is something of a game changer for Digital Asset Management (DAM).

Metadata is at the heart of Digital Asset Management, and the Google Cloud Vision API makes it simple both to tag images automatically and to recommend suitable metadata tags.

ResourceSpace combines the state-of-the-art Google Cloud Vision API with the open source OpenCV library, making metadata tagging quick and easy.

The plugin sends your images to the API on upload, setting appropriate metadata in pre-configured fields based on the subject of the image, as well as suggesting suitable tags.

As you can see in the above example, the Google Cloud Vision API has recognised that:

The resource type is a photo
The people in the photo are happy
The people in the image are 'in nature'
There is a tree and grass in the image
There's a 't-shirt', 'shirt', and 'smile' in the image
The image depicts a leisure activity
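
The general idea of turning API results into metadata can be sketched in Python as follows. This is not the ResourceSpace plugin's actual code: the `keywords` and `emotion` field names and the `response_to_metadata` helper are hypothetical, and the sample response is invented, although `labelAnnotations`, `faceAnnotations` and `joyLikelihood` are field names from the API's JSON response.

```python
# Likelihood values the API uses for face attributes such as joyLikelihood.
POSITIVE = {"LIKELY", "VERY_LIKELY"}


def response_to_metadata(response):
    """Turn one annotate response into a dict of hypothetical DAM fields."""
    labels = [a["description"] for a in response.get("labelAnnotations", [])]
    faces = response.get("faceAnnotations", [])
    happy = any(f.get("joyLikelihood") in POSITIVE for f in faces)

    metadata = {"keywords": labels}
    # Only set an emotion field when at least one face was detected.
    if faces:
        metadata["emotion"] = "happy" if happy else "unknown"
    return metadata


# Invented sample, shaped like a single entry of the API's "responses" list.
sample_response = {
    "labelAnnotations": [
        {"description": "Tree", "score": 0.97},
        {"description": "Grass", "score": 0.91},
        {"description": "Smile", "score": 0.88},
    ],
    "faceAnnotations": [{"joyLikelihood": "VERY_LIKELY"}],
}

print(response_to_metadata(sample_response))
```

A real integration would likely also filter labels by confidence and let users review suggested tags before they are saved.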

Want to see ResourceSpace's Google Cloud Vision API plugin in action? Click below to launch your free DAM system and start uploading your images. You'll be up and running within minutes!