The Woman in the White Dress
Updated: Apr 16
Imagine a collaboration between you and an artist whose only job is to help you realize your creative vision. He possesses a broad range of skills and is at your beck and call. But there are two unusual restrictions: you are not allowed in his studio, and the only way to communicate is by written notes that you push through a mail slot in his studio door. Each note consists of a written description of an image you'd like him to create. For each note he receives from you, the artist will make four image variations and pass them back through the mail slot for you to evaluate. If you aren’t satisfied, you can write a modified description and repeat this process as many times as you like. Generally, the simpler the description, the more freedom the artist has to express himself. If need be, you can also give the artist reference photos to help him understand what you want.
That, in a nutshell, is what it’s like to work with an image-based AI (at least for now, anyway). In a way, the process is an inversion of the old saying “a picture paints a thousand words.” The AI is not just a craftsman who dutifully follows orders. It is a collaborator in the true sense of the word: it makes conceptual contributions to your project (whether you like them or not). The output always comes with surprises. In other words, the AI cannot be corralled into recreating the exact image you have in your head. This, as you may have already discovered, can feel like herding cats. Indeed, you may get so frustrated that you find yourself writing mean prompts that denigrate the intelligence of the AI: “no, goddamnit, I want a sunset over the ocean, the ocean, not a puddle of water, the O-C-E-A-N. FUUUUCK.” But once I learned to go with the flow, the process became a more productive give-and-take.
Before I show you some examples, let me try to answer the big question: “where, exactly, do these images come from?” They are made in the same way a chef creates an original dish. His new creation didn’t come from the ether; it came from his vast knowledge of existing recipes, ingredients and cooking methods. Similarly, the images generated by AI come from its training data, which consists of existing images (from the internet, etc.) that are used to teach the AI how to generate new ones. So apparently these are not preexisting images or even a seamless collage of existing images. If you do a reverse image search on an AI-created image, you will typically find no matches.
There are several AI image generators out there and they all have their strengths and weaknesses. I'll be using Midjourney for this demo. I'm going to use a fashion photography theme for a couple of reasons. One, the human figure is a challenging test for AI (and human artists, for that matter) because we are extremely sensitive to any weirdness in the human face, body and hands. If something is off, we know it immediately, even if we can't put our finger on what it is. (Sargent famously defined the painted portrait as "a likeness with something wrong about the mouth.") Two, I've been following a young guy on Twitter, Nick St. Pierre (https://twitter.com/nickfloats?s=20), who is an AI whisperer. He used his "woman in a white dress" prompt to start a series of entertaining riffs from his followers, and I think it's a good illustration of AI's capabilities and quirks. He's a good follow if you want to go deeper into all of this.
I'll start with a simple prompt and add details as we go. Eventually we'll get to St. Pierre's prompt, as well as his followers' variations.
Prompt 1: a woman wearing a white dress
Interesting how Midjourney went the painting route. Notice the reddish ground coming through the greenish gray background, which reminds me of the background in portraits by Rembrandt and Velazquez.
Prompt 2: a woman in a white dress walking down a city street
Another painting, though now more brushy and loose. Interesting how the AI uses the softness or sharpness of the painted edges to guide our attention to her face. Sargent-like, I suppose.
Prompt 3: a color photo of woman in a white dress walking down a city street
Wow, this one went off the rails. I tried to direct the AI into the realm of photography and Midjourney seems to have put her dress on backwards. Or maybe her hair is covering her face and she's walking toward us?
Prompt 4: a color photo of woman in a white dress with classical features, soft skin, walking down a city street, medium format camera, f8
I added some details to the description. It seems that when you focus the AI's attention on specific features, it handles them better. I also added camera settings (f/8 aperture), which had the intended bokeh effect.
Prompt 5: 1960s street style fashion photo capturing a gorgeous 30-year-old woman with long brown hair, slightly blush cheeks, and a sly grin walking confidently on a bright spring morning in TriBeCa. She's wearing a stunning white lace Gucci gown with a full tulle skirt, intricate lace detailing, long lace sleeves, a high collar, and a fitted bodice adorned with delicate floral appliques. The soft lighting and careful composition emphasize the dreamy and romantic elegance of the gown.
This is St. Pierre's prompt and output. Before AI came along, I think most people would have found this totally believable. There are lots of little quirks, for sure, like the hand in the background at the end of her hair and the creepy guy walking toward us (Michael Myers?!). But traditional photography produces weird distortions too, and fashion magazines were certainly not afraid to use the airbrush. Perhaps Midjourney was trying to reproduce those effects?
And here are the variations:
The 1920s (Dante Gabriel Rossetti would have liked this one)
And just for the hell of it, here's a version of the movie Cast Away where Wilson gets slapped:
You should know that the images I prompted (1-4) took me dozens of attempts which, with the exception of the backwards-walking redhead, I did not include here. When it comes to rendering humans, AI can be an evil genetic scientist, and I wanted to spare you the nightmares. Curious? OK, here's one from DALL-E, an AI that is notoriously bad at rendering humans and hands. I told you.
I know I said I would focus on painting for this article, but photography seemed like a clearer way to introduce people to image-based AI. Next week, AI vs Cezanne...