How did AI painting take off? From its history to its technological breakthroughs, the story of AI painting in one article

21 Sep 2022

Let's go into a bit of technical detail here: why is AI painting based on deep learning models so difficult, and why did days of training on large computer clusters, already state of the art in 2012, yield such poor results?


Readers may already have a basic picture: training a deep learning model is essentially the process of feeding in a large amount of externally labeled training data and repeatedly adjusting the model's internal parameters until its outputs match the expected outputs for those inputs.
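As a rough illustration of that loop (a minimal sketch, not any particular model from that era; the network shape, data, and hyperparameters here are all placeholder assumptions):

```python
import torch
import torch.nn as nn

# Placeholder data: 256 labeled examples (inputs x, expected outputs y).
x = torch.randn(256, 64)
y = torch.randn(256, 1)

# A tiny model whose internal parameters will be adjusted.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    pred = model(x)           # run the model on the training inputs
    loss = loss_fn(pred, y)   # measure mismatch with the expected outputs
    optimizer.zero_grad()
    loss.backward()           # compute how each parameter should change
    optimizer.step()          # nudge parameters to reduce the mismatch
```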


So teaching an AI to paint means building a training set out of existing paintings and iteratively fitting the parameters of the AI model to it.


How much information does a painting contain? At the most basic level, it is a length × width grid of RGB pixels. The simplest starting point for teaching a computer to draw is to get an AI model to output a meaningful combination of those pixels.
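Concretely, a painting of this kind is just a three-dimensional array of numbers; a small sketch (the dimensions are arbitrary assumptions):

```python
import numpy as np

# A hypothetical 512 x 384 "painting": one RGB triple (0-255) per pixel.
height, width = 384, 512
painting = np.zeros((height, width, 3), dtype=np.uint8)

# A pixel-level generator's entire job is to fill in this grid of numbers.
painting[100, 200] = [210, 105, 30]  # set one pixel to a brownish color

print(painting.shape)  # (384, 512, 3)
print(painting.size)   # 589824 values the model has to get right
```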


But not every combination of RGB pixels is a painting; most are just noise. A painting with rich texture and natural brushwork contains many strokes, and each stroke has parameters such as its position, shape, and color, so the space of possible parameter combinations is enormous. The computational complexity of training a deep model grows dramatically as that input space grows... you can see why this is not simple.
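A back-of-the-envelope calculation makes the scale vivid (the image size here is an arbitrary assumption):

```python
# Each pixel has 3 channels, each channel 256 possible values.
# Even for a small 64 x 64 image, the number of distinct pixel
# combinations is 256 ** (64 * 64 * 3).
num_images = 256 ** (64 * 64 * 3)
print(len(str(num_images)))  # 29593: the count has nearly 30,000 digits
```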


After Andrew Ng and Jeff Dean's groundbreaking cat face generation model, AI scientists began to push into this new and challenging field. In 2014, AI researchers proposed a very important deep learning model: the famous Generative Adversarial Network (GAN).

As the word "adversarial" in the name suggests, the core idea of this deep learning model is to pit two internal networks, a "generator" and a "discriminator", against each other and let that contest produce the result.
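In code, the adversarial loop looks roughly like this (a minimal sketch on toy data, not a production image GAN; the network sizes and the stand-in "real" data are assumptions):

```python
import torch
import torch.nn as nn

# Hypothetical setup: "real" samples are 2-D points; sizes are arbitrary.
latent_dim, data_dim = 8, 2

# Generator: turns random noise into fake samples.
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                  nn.Linear(32, data_dim))
# Discriminator: outputs the probability that a sample is real.
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(),
                  nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, data_dim) * 0.5 + 2.0  # stand-in "real" data
    fake = G(torch.randn(64, latent_dim))

    # Discriminator's turn: learn to tell real (label 1) from fake (label 0).
    d_loss = (bce(D(real), torch.ones(64, 1)) +
              bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator's turn: learn to make the discriminator call fakes real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```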


The GAN model has been popular in AI academia since its introduction and has been widely applied in many fields. It then became the foundational framework for many AI painting models, with the generator producing images and the discriminator judging their quality. The advent of GAN greatly accelerated the development of AI painting.


However, AI painting with the basic GAN model also has obvious drawbacks. One is that it offers little control over the output and tends to produce random images, whereas the output of an AI artist should be stable. Another is the low resolution of the generated images.


The resolution problem can be worked around, but GAN hits a deadlock on the very point of "creation", and the deadlock comes from its core design: in the basic GAN architecture, the discriminator's job is to judge whether a generated image belongs to the same class as the images it has already been shown. This means that, at best, the output is an imitation of existing works, not an innovation.


In addition to the generative adversarial network GAN, researchers also began using other kinds of deep learning models to try to teach AI to draw.


A well-known example is Deep Dream, an image generation tool released by Google in 2015. The series of paintings Deep Dream produced attracted a great deal of attention at once, and Google even curated an exhibition of its work.


But on closer inspection, Deep Dream is more an advanced AI filter than an AI painter, and its characteristic filter style is easy to recognize in the works shown above.


Compared with the somewhat awkward Deep Dream, a more credible effort from Google is a model it trained in 2017 on thousands of hand-drawn sketches, with which the AI can draw simple sketch images. (Google, "A Neural Representation of Sketch Drawings")

