Creating vs Sampling: No Bright Line
Is using a neural network to generate art different from using any other tool?
Over the last year, AI-generated artwork crossed the threshold from useless to useful. Of course, this has led to serious debate on whether training AI on specific artistic styles is theft, whether AI art should be copyrightable, and even whether making AI artwork deserves to be considered a creative enterprise.
It seems obvious to me that creators are at risk of getting royally fucked, and I am not surprised to see backlash. Artist communities have banned AI art (in policy; enforcement is another matter), while the various Hollywood unions have made the regulation of AI use a key part of their negotiations. For many, AI is a direct threat to their livelihood.
That said, I think most of the conversation around AI artwork is confusing. Many artists, including those I work with, believe that there is a clear distinction between AI-generated artwork and other kinds of art. The argument, as best I understand it, is that artwork in general requires some amount of intent and skill.
This seems wrong to me, but I couldn't pin down in words why. Instead, here are a few hypotheticals that I hope will share my intuition.
Hypotheticals
Mia is a programmer who is creating images with random noise. She does this with a few lines of code that can create an image, each pixel randomly selected with RGB values. She runs the code a few times and is surprised to find that her tool has, entirely randomly, produced a beautiful stylized landscape that looks like it could be in a Disney movie.
Azraf is a roboticist who is working on a new kind of printer. The printer mixes ink together to produce colors, a different random color at each spot in an image. He runs the printer repeatedly until it generates something that he thinks is aesthetically pleasing, and ends up choosing a printout that happens to look like a Picasso.
Jenny is a graphic designer who has a unique method: she starts with an image that is entirely random noise, and continues to randomly generate different parts of the image (sometimes the whole image, sometimes a subsection, sometimes a pixel) until the image looks aesthetically pleasing or interesting. During her random generations, she creates an image that looks like Starry Night.
Derek is a surrealist photographer. He takes mundane photographs and uses Photoshop's 'content aware fill' (an algorithm that uses pattern matching to fill in parts of an image) to replace large sections of the image. Sometimes, the majority of the image will be modified by content aware fill.
Isabel is a hobbyist who uses Photoshop to doodle. She starts with an image that is entirely random noise, and then edits it using Photoshop's paintbrush until it looks like what she wants. Amanda works with Isabel. Instead of starting with a random image, Amanda chooses to start with a white canvas. Isabel and Amanda both often make art in the style of other artists.
David is a logo designer who uses MS Paint, for some reason. Often, he will make use of built-in tools that allow him to draw perfectly straight lines or mathematically optimal curves. He uses various coloring tools, such as the paint bucket tool, and selection tools, such as the magic wand, to automatically fill in coloring on his logos.
Sarah is a UI wireframer who creates buttons and backgrounds for apps. When deciding on color schemes, she uses an online tool where she can describe the emotions she wants to convey, and the tool will select a color palette. Then she creates wireframe elements using Figma, akin to how David makes logos.
Austin is a fashion designer working with Louis Vuitton on designs for next spring. He uses an online tool where he describes the vague ideas floating around in his head, and the tool surfaces other images from the LV catalog that may be useful as inspiration. Hunter is also a fashion designer working with Louis Vuitton; he uses the same tool, but the tool will also sometimes generate an inspiration-image if there is nothing sufficiently useful in the catalog.
Eva is a photographer like Derek. She also uses Photoshop, but with a more advanced version of content aware fill. When she selects a portion of the image, she types in a description of what she wants that area to be filled with. Photoshop's advanced content aware fill uses the context of the rest of the image and the description to fill in the parts of the image Eva selected.
Jake is a painter, with a distinct visual style. He recently discovered a tool that takes in all of Jake's previous work, and can produce new works in a similar style using just a text description. Jake uses this tool to quickly generate new works of art, iterating on the text description until the output looks just right.
Amol is a programmer with a blog who needs a cover photo for his recent article. He uses a tool that takes in a text description and produces an image. Amol iterates over the text description, spending a long time to get the description exactly right to produce the right output image.
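For the curious, here is roughly what Mia's "few lines of code" and Jenny's re-roll-a-subsection method might look like. This is a minimal sketch under my own assumptions: the function names random_image and regenerate_region are hypothetical, images are plain nested lists of (R, G, B) tuples, and every channel is drawn uniformly from 0 to 255.

```python
import random

def random_image(width, height, rng=None):
    """Return a height x width grid of (R, G, B) tuples.

    Each channel is drawn uniformly from 0-255 (an assumption; the
    hypotheticals only say "randomly selected").
    """
    if rng is None:
        rng = random.Random()
    return [
        [tuple(rng.randrange(256) for _ in range(3)) for _ in range(width)]
        for _ in range(height)
    ]

def regenerate_region(image, left, top, right, bottom, rng=None):
    """Re-roll every pixel in the rectangle [left, right) x [top, bottom), in place.

    Covers Jenny's three cases: the whole image, a subsection, or one pixel.
    """
    if rng is None:
        rng = random.Random()
    for y in range(top, bottom):
        for x in range(left, right):
            image[y][x] = tuple(rng.randrange(256) for _ in range(3))
    return image

if __name__ == "__main__":
    img = random_image(4, 3)            # Mia: one fully random image
    regenerate_region(img, 0, 0, 2, 2)  # Jenny: re-roll the top-left 2x2 corner
    print(img[0][0])
```

Nothing in this code knows what a landscape or a Picasso is. The only aesthetic judgment in either hypothetical comes from the person deciding when to stop re-rolling.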
Where is the line?
There's a longstanding philosophical debate about mathematics: is math invented or discovered? The folks in the 'discovered' camp argue that mathematical laws are objective truths that exist independent of human observation. If every human in the universe ceased to exist tomorrow, 1 + 1 would still equal 2.
What if we extended this idea?
The Library of Babel comes from a short story of the same name. The idea is that there is a library containing every single book with every single combination of letters of a certain length. You will find a lot of absolute nonsense; most of the library is pages of illegible character strings. But within this library you will also find works from Shakespeare and Joyce, Stalin and Washington, Yeats and Shelley. In some deep mathematical sense, these words existed long before humans existed, and will continue to exist long after.
Images work the same way. Every image that has ever existed, or ever will exist, can be found in a 'library' that contains every combination of pixel values of a certain height and width. Most of the images are total noise, like polar bears in a blizzard. But every photograph, every wood-carving, every postmodern whatever image lives there too.
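To put a rough number on that library, here is a small sketch. It assumes 8-bit RGB (3 channels, 256 levels each), which is my assumption rather than anything the argument depends on, and the function names are hypothetical. An image with W x H pixels is one of 256^(3 x W x H) possibilities, and every image can be given a "shelf number" by reading its channel values as digits of one very large base-256 integer.

```python
import math

def library_size(width, height, channels=3, levels=256):
    """Number of distinct images at this size: levels ** (channels * width * height)."""
    return levels ** (channels * width * height)

def library_size_digits(width, height, channels=3, levels=256):
    """Decimal digit count of library_size, without building the huge integer.

    Uses digits = floor(k * log10(levels)) + 1 where k is the channel count.
    Fine for levels=256; floating point could misround if levels were an
    exact power of ten.
    """
    k = channels * width * height
    return math.floor(k * math.log10(levels)) + 1

def image_index(channel_values, levels=256):
    """Position of an image in the library, reading its flattened channel
    values as the digits of one base-`levels` integer."""
    index = 0
    for value in channel_values:
        index = index * levels + value
    return index

if __name__ == "__main__":
    print(library_size(1, 1))               # a single pixel: 16777216 images
    print(library_size_digits(1920, 1080))  # digit count for a 1080p frame
    print(image_index([255, 255, 255]))     # the last 1x1 image: 16777215
```

Even a single pixel has about 16.8 million entries. At any useful resolution the count is far too large to enumerate, which is why getting around the library takes tools and taste rather than brute force.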
Humans use our considerable brain power to traverse this RGB terrain of images, pulling out aesthetically pleasing works that have significance. And we use tools that help us — search engines to help us find inspiration, image editors to sharpen our lines or fill in colors, algorithms to fill in content. Not so long ago artists had to mix their own paints.
I spent a lot of time trying to figure out where I would draw the "everything past this hypothetical is theft" line in the list of scenarios described above. But I can't really bring myself to say that, for example, what Derek is doing with content aware fill is meaningfully different from what Jake is doing with AI.
All artists are sampling the RGB image space. All artists use a variety of tools to make that process easier. And AI, for all of its issues, is another such tool.
PS: I would love to hear about whether other people can cleanly delineate between AI and other art tools, and where they draw the line on what does or doesn't 'count' as art.