Diamond Discussions: Three Problems With Current AI-Generated Photography Technology

Images generated by artificial intelligence are rapidly entering the cultural zeitgeist. As the general public is introduced to these images, a common belief is that they could replace art and photography altogether, or at least make the two difficult to tell apart. However, the technology has a few limitations that make it easier to distinguish an AI photograph from a real one:

AI has trouble with tangential context: Even though AI uses prompts to get a general idea of what should be included in a composition, it lacks the wider connections between elements within the image. “A picture is worth a thousand words” means that the artist can tell a complete story in a single image, where each element relates in time to the next. This can take the viewer on a journey through time and space. The artist can bring the viewer, through their own imagination, into the past, the present, and the future. The artist can imply certain meanings and impart wisdom or biting wit with their choices. These are aspects of art that require a relationship between the viewer and the artist. Since current AI models cannot build such complex ideas into an artwork with precision, that relationship is severed.
AI has trouble with direct context: While AI can be prompted to create photo-realistic imagery, it lacks real-world knowledge of how things work. Take the simple act of cooking, for example. Humans have been cooking meals since before the written word. It’s a basic concept, one born out of the need to survive. Raw meat is often unsafe to eat, so we cooked it over a fire to destroy bacteria that could make us sick. AI algorithms currently have a difficult time understanding this concept. To the algorithm, cooking = “meat or vegetables + fire”. You will get all kinds of bizarre results from this very reductive equation: slabs of meat completely on fire, burgers with candle flames coming out of them, or even just a massive fireball exploding right in front of a smiling human face. Until AI can understand direct context (why and how an object exists in the world), prompt results will be mixed at best.

This problem can also be observed in subtler ways, such as mismatched lighting or garments and accessories with impossible placement.
AI has trouble with spatial and temporal concepts: This can be seen in everything from impossible physics to objects interacting in impossible ways to strange inconsistencies between objects in a composition. You can find images that look somewhat believable at first glance but fall apart under scrutiny: mugs of beer held in a sticky Spider-Man grip, a man seemingly fused to a display case while shopping at the grocery store, or a group of women dining together, all with an eerily similar face.

In conclusion, AI-generated imagery is a powerful new technology that, while still plagued with problems, will continue to improve. When the technology is advanced enough to grasp context and purpose and to hide its blemishes, I foresee a time when AI-generated photos are indistinguishable from the real thing.
