AI Enters the World of Art
November 3, 2022
For the last 250 years, since the Industrial Revolution allowed people to harness coal and steel to build steam-belching trains and looming skyscrapers, the mantra of humanity has been this: progress is inevitable. Even through plagues, wars, and atrocities, it has continued relentlessly, its gaze fixed on the horizon. Along with this progress has come the computer, a device that, like many innovations before it, automates complex but tedious work and increases leisure time for humans. With the advent of AI and machine learning in recent years (called the “Fourth Industrial Revolution” by the World Economic Forum), one should expect rapid advances in technology in all facets of life. Yet even the most prescient experts could not have predicted the explosion in AI-created art this year, and they certainly could not have foreseen the consequences.
AI art is the result of years of progress in the field of machine learning. In deep learning, the relevant subfield of machine learning, software engineers use models called neural networks. A neural network is composed of many simple units called neurons, each of which applies a mathematical function to transform its inputs into an output. Input data passes through layer after layer of these functions, and at the end, the network produces a result.
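To make the idea of a neuron concrete, here is a minimal sketch in Python. It assumes the most common textbook form of a neuron (a weighted sum of the inputs passed through a sigmoid function). The weights and the two-layer layout are arbitrary numbers chosen for illustration, not taken from any real system.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, squashed into (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation

def tiny_network(inputs):
    """Two hidden neurons feed one output neuron. All weights are made up."""
    h1 = neuron(inputs, [0.5, -0.3], 0.1)
    h2 = neuron(inputs, [-0.8, 0.2], 0.0)
    return neuron([h1, h2], [1.0, -1.0], 0.0)

print(tiny_network([1.0, 2.0]))  # a single number between 0 and 1
```

Real networks work the same way in principle, but with millions or billions of weights instead of a handful.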
For example, imagine making a neural network that can tell whether an image depicts a cat or a dog. The process goes like this: the computer starts with randomly chosen numbers (called weights) inside its formulas and puts the image through them, which eventually reduces the image to a set of probabilities (like 70% dog and 30% cat). The result with the highest likelihood will be the final answer. Because the weights are random at first, the result will probably not be correct, so the computer compares its result to the correct answer and “grades” itself. Then, it adjusts the weights in such a way that the result will be better next time. After many rounds of grading and correcting over thousands of different images, it should have a set of operations that can consistently tell it whether it is looking at a cat or a dog. This is known as training the network.
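The guess, grade, and correct loop described above can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not how a real image classifier is built: instead of pixels, each animal is described by two invented features (`ear_pointiness` and `body_size`), the "network" is a single neuron, and the four training examples are made up for illustration.

```python
import math
import random

def predict(weights, bias, features):
    """Return the model's estimated probability that the animal is a dog."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, lr=0.5, seed=0):
    rng = random.Random(seed)
    # Start from random weights, so the first guesses are essentially arbitrary.
    weights = [rng.uniform(-1, 1) for _ in examples[0][0]]
    bias = rng.uniform(-1, 1)
    for _ in range(epochs):
        for features, label in examples:
            guess = predict(weights, bias, features)
            error = guess - label  # "grading": how far off was the guess?
            # "correcting": nudge each weight to shrink the error next time.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

# Hypothetical data: (ear_pointiness, body_size), label 1 = dog, 0 = cat.
examples = [((0.9, 0.2), 0), ((0.7, 0.4), 0), ((0.3, 0.8), 1), ((0.1, 0.7), 1)]
weights, bias = train(examples)
print(predict(weights, bias, (0.2, 0.9)))  # probability of "dog" for a new animal
```

The update rule used here is ordinary gradient descent on a single neuron, which is an assumption of this sketch. Networks that work on real images apply the same idea across many layers.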
AI art relies on a neural network called a CLIP model, which has been trained on a huge database of images and text taken directly from the Internet, so instead of just associating images with dogs or cats, it associates text with images in general. The CLIP model encodes a piece of text into a numerical format a computer can work with. Then, the system builds up an internal representation of the image. At first, this representation consists of pure noise, but a diffusion model removes the noise step by step, guided by the encoded text, until a coherent image emerges. Finally, a decoder turns the information back into shapes and colors that we can understand. The result of all of these complex mathematical operations is that one can enter a prompt and generate a new image that corresponds to the prompt.
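The flow of data through such a system (encode the text, start from noise, denoise step by step, decode) can be sketched with toy stand-ins. None of the functions below is a real model: `encode_text`, `denoise_step`, and `decode` are hypothetical names, the "image" is only four numbers, and the denoising step simply slides the noise toward the encoded prompt, whereas a real diffusion model uses a trained network to estimate and remove noise.

```python
import random

def encode_text(prompt, size=4):
    """Stand-in for a text encoder: turn a prompt into a fixed list of numbers."""
    rng = random.Random(prompt)  # seeding with the prompt makes this repeatable
    return [rng.uniform(-1, 1) for _ in range(size)]

def denoise_step(latent, guide, strength=0.2):
    """Stand-in for one diffusion step: strip away a little of the noise
    by nudging each value toward the text-derived guide."""
    return [x + strength * (g - x) for x, g in zip(latent, guide)]

def decode(latent):
    """Stand-in for the decoder: map values in [-1, 1] to 0-255 brightness levels."""
    return [round((max(-1.0, min(1.0, x)) + 1) * 127.5) for x in latent]

def generate(prompt, steps=30, seed=0):
    guide = encode_text(prompt)
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in guide]  # begin with pure noise
    for _ in range(steps):
        latent = denoise_step(latent, guide)
    return decode(latent)

print(generate("a cat wearing a hat"))
```

The same prompt always yields the same four "pixels" here, and different prompts yield different ones, which is the only property this sketch is meant to show.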
In April of 2022, OpenAI revealed DALL-E 2, an AI art generator based on this kind of system. Thus began an arms race in AI that shows no signs of stopping anytime soon. Google soon announced a model called Imagen, followed by another named Parti. Many other developers followed, including a research lab named Midjourney, which released an eponymous system. With the release of Stability AI’s open-source Stable Diffusion (open-source meaning that the software’s code is publicly available), it became clear that the next step would be to release the technology to the public. On September 28, OpenAI took this step, opening DALL-E 2 to the general public as a paid service.
This technology has been the cause of much controversy ever since it was revealed. When one artist won first place at the Colorado State Fair with an AI-generated piece (albeit in a category for art made with digital technology), many artists on social media voiced their displeasure with a technology that seemed to be making them obsolete. The fact that the training databases include images used without the permission of their creators has also raised legal questions regarding the copyright status of AI-generated artworks.
One year ago, few could have expected this technology to go so far so quickly. Now, those same people are pondering the implications it will have for the world in the years to come.