Finally, I wanted to consider face recognition. To avoid uses that infringe on people’s privacy, ChatGPT has been designed to refuse when asked to identify people in images. However, when I presented it with what’s known as the “famous faces doppelgangers test” and pressed it for its best “guess”, it was willing to provide answers.
A pair of faces is shown on each of 40 trials, along with a celebrity’s name, and participants are asked to identify which face is that particular celebrity (left or right). They’re also asked if they know the celebrity or not.
The task is made difficult because the other face is very similar in appearance to the celebrity – in other words, a doppelganger. People generally score around 81.5% on trials where they know the celebrity. (If they don’t know who the celebrity is, their choice would simply be a guess.)
Impressively, ChatGPT scored 100% correct across all of the trials for this test.
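As a rough illustration of how unlikely a perfect score would be from guessing alone, the short Python sketch below computes the binomial probability of getting all 40 two-choice trials right. It is not part of the testing described here. It assumes the trials are independent, and, for comparison only, that the 81.5% human average can be treated as a fixed per-trial success rate, which is a simplification. The function name `prob_at_least` is purely illustrative.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_TRIALS = 40       # number of trials in the test, as described above
CHANCE = 0.5        # two faces per trial, so a pure guess is right half the time
HUMAN_RATE = 0.815  # assumption: the reported human average used as a per-trial rate

# Probability of a perfect score under each assumption
p_chance = prob_at_least(N_TRIALS, N_TRIALS, CHANCE)
p_human = prob_at_least(N_TRIALS, N_TRIALS, HUMAN_RATE)

print(f"Perfect score by guessing: {p_chance:.2e}")
print(f"Perfect score at the human average rate: {p_human:.2e}")
```

Under these assumptions, a perfect score is vanishingly unlikely to come from guessing, and would be unusual even for a performer answering at the average human rate.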
Putting it all together
On the basis of my experience, ChatGPT seems well-equipped to perform tasks related to the recognition and identification of human faces – including their expressions. On these three tests, at least, it performed as well as or even better than people do.
Of course, these were my initial explorations rather than a peer-reviewed study, so more work is needed to firmly establish its abilities. But they do suggest that ChatGPT can handle face images.
ChatGPT is based on a type of artificial intelligence (AI) program called a large language model (LLM), which means that it has been trained on an extensive amount of text (and now image) data. This allows it to learn the structure and patterns that exist within the data, and subsequently generate sensible responses to almost any question or request by the user.
ChatGPT says that face images were also a significant part of its training data, although it doesn’t store and recall specific images. Instead, it appears to rely on the general patterns and associations it has learned during its training. Other sources seem to confirm this.
Presumably, through exposure to numerous face images alongside text that included the word “suspicious”, for example, it was able to develop a representation of that facial expression which was distinct from other expressions like “sarcastic”.
Similarly, refining its representation of a celebrity’s face through multiple exposures meant that it could subsequently differentiate them from other, similar-looking faces. However, again, this is admittedly informed speculation on my part.
Based on my results and other demonstrations of this latest version of the chatbot, it seems likely that ChatGPT’s already remarkable performance across a wide variety of tasks will continue to improve with each new version released.