Feature detection: Cows, cars and red motorcycles

Automated feature detection in images is big business. Amazon offers a service, Rekognition, for businesses to identify features in large collections of images. According to the website at aws.amazon.com/rekognition, the service provides automatic labelling of elements in a picture (e.g. this is a person on a bike, here is a mountain peak, etc.). You can check whether your company’s brand happens to appear in someone’s photograph, identify inappropriate content or dangerous objects, recognise celebrities, and detect whether people are wearing PPE (personal protective equipment).
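As a sketch of what using such a service involves, the snippet below calls Rekognition’s label detection through the boto3 Python SDK. It assumes AWS credentials are already configured; the bucket and file names are placeholders.

```python
# Minimal sketch: label detection with Amazon Rekognition via boto3.
# Assumes AWS credentials are configured; bucket/file names are placeholders.
import boto3

client = boto3.client("rekognition", region_name="us-east-1")

response = client.detect_labels(
    Image={"S3Object": {"Bucket": "my-photo-bucket", "Name": "street-scene.jpg"}},
    MaxLabels=10,        # cap the number of labels returned
    MinConfidence=75.0,  # drop low-confidence guesses
)

# Each label is a word with a confidence score, e.g. "Person: 99.2%".
for label in response["Labels"]:
    print(f'{label["Name"]}: {label["Confidence"]:.1f}%')
```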

The photo-sharing platform Flickr provides an API (application programming interface) through which researchers can identify features within large numbers of publicly visible images. This has potential uses in detecting what people regard as important, attractive or interesting about a place. Several platforms offer similar capabilities. The features are words (house, tree, sky, etc.) with a confidence number attached.
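By way of illustration, here is a minimal sketch of querying the Flickr REST API with Python’s requests library for publicly visible photos and their tags. The API key is a placeholder, and any confidence-scored feature detection would be layered on top of imagery retrieved this way.

```python
# Sketch: fetch publicly visible photos and their tags from the Flickr
# REST API using the requests library. The API key is a placeholder.
import requests

params = {
    "method": "flickr.photos.search",   # a standard Flickr API method
    "api_key": "YOUR_FLICKR_API_KEY",   # placeholder
    "text": "mountain peak",
    "extras": "tags",                   # include each photo's tags
    "per_page": 10,
    "format": "json",
    "nojsoncallback": 1,
}

resp = requests.get("https://www.flickr.com/services/rest/", params=params)
for photo in resp.json()["photos"]["photo"]:
    print(photo["title"], "->", photo["tags"])
```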

Microsoft Word automatically creates “Alt Text” for images based on detected features, which screen readers can recite for people with visual impairment. The automated text for this image reads: “A white building with a tower.”
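Word’s captions are generated in the cloud, and Microsoft exposes a comparable captioning service through the Azure Computer Vision “describe” endpoint, as in this sketch. The endpoint, key and image URL below are placeholders.

```python
# Sketch: request an image caption from Azure Computer Vision's
# "describe" endpoint, the kind of service behind automatic Alt Text.
# Endpoint, key and image URL are placeholders.
import requests

endpoint = "https://YOUR_RESOURCE.cognitiveservices.azure.com"
key = "YOUR_AZURE_KEY"

resp = requests.post(
    f"{endpoint}/vision/v3.2/describe",
    headers={"Ocp-Apim-Subscription-Key": key},
    json={"url": "https://example.com/white-building.jpg"},
)

# The response includes one or more captions with confidence scores.
caption = resp.json()["description"]["captions"][0]
print(caption["text"], round(caption["confidence"], 2))
```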

Wordroom (wordroom.org) is a feature detection plugin for Adobe’s Lightroom photo manipulation software. Here are some images and the features it detects automatically. The plugin doesn’t provide confidence levels, though the order of the words in the list gives a confidence ranking. With this plugin, users can select the features they want to adopt as tags to assist in later searches, or as the basis of “Alt Text” descriptors.
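To illustrate the difference, here is a small sketch, with made-up labels, of turning a confidence-scored label list (as an API returns it) into the kind of ranked, score-free word list Wordroom shows, from which a user might pick tags or an Alt Text phrase.

```python
# Illustrative sketch with made-up data: convert confidence-scored labels
# into a ranked, score-free word list like the one Wordroom displays.
labels = [("house", 0.87), ("sky", 0.98), ("cow", 0.42), ("tree", 0.91)]

# Sort by confidence, then drop the scores, leaving only the ranking.
ranked = [word for word, _ in sorted(labels, key=lambda item: item[1], reverse=True)]
print(ranked)  # ['sky', 'tree', 'house', 'cow']

# A user-selected subset could become search tags or an Alt Text phrase.
selected = ranked[:3]
print("Alt Text: an image featuring " + ", ".join(selected))
```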

Google generates feature tags based on its access to vast stores of online imagery. These word tags are the basis of searching and filtering images, as well as matching images that have similar features. For a brief exploration of Google’s image search, see my post: Reverse image search. The current incarnation of Google’s image search facility (Google Lens) on a smartphone matches locations as well as images, and returns links to relevant websites.
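Google Lens itself has no public API, but Google exposes comparable label detection and similar-image matching through its Cloud Vision API. The sketch below assumes the google-cloud-vision client library and Google Cloud credentials are set up; the image URL is a placeholder.

```python
# Sketch: label detection and similar-image matching with Google's
# Cloud Vision API. Assumes Google Cloud credentials are configured;
# the image URL is a placeholder.
from google.cloud import vision

client = vision.ImageAnnotatorClient()
image = vision.Image(
    source=vision.ImageSource(image_uri="https://example.com/red-motorcycle.jpg")
)

# Word tags with confidence scores, the basis of search and filtering.
labels = client.label_detection(image=image)
for label in labels.label_annotations:
    print(label.description, round(label.score, 2))

# Web detection matches visually similar images and relevant web pages.
web = client.web_detection(image=image).web_detection
for entity in web.web_entities[:5]:
    print(entity.description, entity.score)
```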

The app will also have a go at identifying species and types of animals, plants and objects. The subframe around the key object is generated automatically.

So, automated feature detection is accessible to anyone with a smartphone or computer and a network connection. As shown here, there’s a bias towards products: red motorbikes are conspicuous consumer items. With the bike out of the frame, the app is able to identify the location.

This excursion is a preamble to an investigation of the ethical implications of automated feature detection.

