How to stylize an image to look like the work of a famous artist using a neural network: dealing with neural style transfer

Neural style transfer is an optimization technique that works with three images: a content picture, a style picture (such as a famous artist's painting), and an input picture. "Mixing" them means iteratively adjusting the input picture so that it keeps the composition of the content picture while adopting the look of the style picture.
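The "mixing" can be sketched as minimizing a weighted sum of a content loss and a style loss. Here is a toy sketch, with images reduced to 1-D feature vectors and squared-error losses (real style transfer compares CNN feature maps instead; the values and weights below are illustrative assumptions):

```python
import numpy as np

# Toy stand-ins for the features of the three pictures
content = np.array([1.0, 2.0, 3.0])   # content picture
style = np.array([0.0, 1.0, 0.0])     # style picture
x = content.copy()                    # input picture, initialized from the content

alpha, beta, lr = 1.0, 0.1, 0.1       # content weight, style weight, step size
for _ in range(200):
    # gradient of  alpha*||x - content||^2 + beta*||x - style||^2
    grad = 2 * alpha * (x - content) + 2 * beta * (x - style)
    x -= lr * grad                    # pull x toward both targets at once

# x settles between the content and style features, weighted by alpha and beta
print(x)
```

The weights alpha and beta control the trade-off: a larger beta drags the result further toward the style picture at the expense of the content.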

Define content and style representations
To capture the content and style of a picture, you first need to look at the model's intermediate layers. Intermediate layers produce feature maps that become increasingly abstract the deeper you go. A good choice here is the VGG19 architecture, a network pre-trained for image classification. Its intermediate layers are what define the representations: for the input image, you compute the corresponding representations at those layers.
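A minimal sketch of extracting those intermediate representations with Keras. The specific layer names below (one content layer, five style layers) are a common convention in style-transfer tutorials, not something fixed by VGG19 itself; `weights=None` is used here only to avoid downloading the ImageNet weights in a sketch, whereas real style transfer needs `weights='imagenet'`:

```python
import tensorflow as tf

# Pre-trained classifier whose intermediate layers we tap into
# (weights=None for this sketch; use weights='imagenet' in practice)
vgg = tf.keras.applications.VGG19(include_top=False, weights=None)
vgg.trainable = False

# Commonly used layer choices (an assumption, not dictated by the network)
content_layers = ['block5_conv2']
style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
                'block4_conv1', 'block5_conv1']

# A model that maps an image to the feature maps of the chosen layers
outputs = [vgg.get_layer(name).output for name in content_layers + style_layers]
extractor = tf.keras.Model(inputs=vgg.input, outputs=outputs)

# Run a dummy image through and inspect the intermediate feature maps
image = tf.random.uniform((1, 224, 224, 3))
features = extractor(image)
for name, f in zip(content_layers + style_layers, features):
    print(name, f.shape)
```

Note how the spatial resolution shrinks and the channel count grows with depth: `block1_conv1` keeps the full 224x224 grid with 64 channels, while `block5_conv2` is down to 14x14 with 512 channels, reflecting the increasingly abstract feature maps described above.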

Why intermediate layers?
You may be wondering: why do these intermediate outputs let us define the style and content of an image? For the network to classify an image (which it has already been trained to do), it has to understand that image. This means building up, from raw pixels, increasingly complex representations of the objects the image contains. This partly explains why convolutional neural networks generalize well: they learn the regularities and characteristics that define a particular class (distinguishing, say, a cat from a dog) regardless of background noise. So somewhere between the raw image going in and the class label coming out, the model acts as a feature extractor. By tapping into exactly these intermediate layers, you can read off the style and content of an image.
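To make the "feature detector" idea concrete, here is a toy illustration, unrelated to VGG19's actual learned weights: a single hand-written convolutional filter that responds to vertical edges, the kind of low-level feature an early CNN layer typically learns, and which fires on the edge regardless of the surrounding pixel values:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# An image with a vertical edge: dark left half, bright right half
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# A Sobel-like vertical-edge kernel
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

# The resulting feature map is zero on flat regions and large at the edge
feature_map = conv2d(image, kernel)
print(feature_map)
```

A trained network stacks thousands of such filters across many layers, so deeper feature maps respond to ever more abstract patterns (edges, then textures, then object parts), which is precisely what the content and style representations tap into.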