Generative Adversarial Networks (GANs): Engine and Applications




Translated by 乾树

How generative adversarial nets are used to make our lives better

Generative adversarial networks (GANs) are a class of neural networks used in unsupervised machine learning. They help with tasks such as generating images from descriptions, producing high-resolution images from low-resolution ones, predicting which drug could treat a given disease, and retrieving images that contain a given pattern.
The Statsbot team asked a data scientist, Anton Karazeev, to give an introduction to how GANs work and how they are applied in everyday life.

GANs were introduced by Ian Goodfellow and colleagues in 2014. They aren’t the only neural-network approach to unsupervised learning. There are also the Boltzmann machine (Geoffrey Hinton and Terry Sejnowski, 1985) and autoencoders (Dana H. Ballard, 1987). Both are designed to extract features from data. Autoencoders do so by learning to approximate the identity function f(x) = x through a compressed representation, while Boltzmann machines rely on Markov chains to train and to generate samples.

Generative adversarial networks were designed to avoid Markov chains because of their high computational cost. Another advantage over Boltzmann machines is that the Generator function has far fewer restrictions (only a few probability distributions admit Markov chain sampling).

In this article, we’ll tell you how generative adversarial nets work and what their most popular applications in real life are. We will also give you links to some helpful resources for getting deeper into these approaches.
The engine of generative adversarial nets
To explain the concept behind GANs, let us use an analogy.

Imagine you want to buy a good watch. If you have never bought one, you most likely can’t tell a brand-name watch from a fake. You need experience to avoid being fooled by the seller.
As you start to label most of the watches as fake (after a number of mistakes, of course), the seller starts to “generate” more convincing copies. This example demonstrates the behavior of the two parts of a generative adversarial network: the Discriminator (the watch buyer) and the Generator (the seller of fake watches).

These two networks, the Discriminator and the Generator, compete with each other. This technique makes it possible to generate realistic objects (e.g. images). The Generator is forced to generate samples that look real, and the Discriminator learns to distinguish generated samples from real data.
What’s the difference between discriminative and generative algorithms? In brief: discriminative algorithms learn the boundaries between classes (as the Discriminator does) while generative algorithms learn the distribution of classes (as the Generator does).

If you’re ready to go deeper
To learn the Generator’s distribution p_g over data x, a prior distribution p_z(z) on the input noise variables must be defined. Then G(z, θ_g) maps z from the latent space Z to data space, and D(x, θ_d) outputs a single scalar: the probability that x came from the real data rather than from p_g.
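The roles of G(z, θ_g) and D(x, θ_d) can be sketched in a few lines of Python. This is only an illustration: the standard normal prior for p_z, the single affine layer standing in for each network, and all parameter values are assumptions made for the example, not details from the article.

```python
import math
import random


def sample_noise(dim, rng):
    """Draw z from p_z(z); a standard normal prior is assumed here."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generator(z, theta_g):
    """G(z, theta_g): map a latent vector z to a point in data space.

    theta_g is a (weights, biases) pair for one affine layer. Real
    generators are deep networks; this is only a stand-in.
    """
    weights, biases = theta_g
    return [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(weights, biases)]


def discriminator(x, theta_d):
    """D(x, theta_d): a single scalar in (0, 1), read as P(x is real)."""
    weights, bias = theta_d
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
# Hypothetical parameters: a 2-D latent space mapped to a 3-D data space.
theta_g = ([[0.5, -0.2], [0.1, 0.3], [0.0, 1.0]], [0.0, 0.1, -0.1])
theta_d = ([0.4, -0.6, 0.2], 0.0)

z = sample_noise(2, rng)
fake_x = generator(z, theta_g)
p_real = discriminator(fake_x, theta_d)
print(fake_x, p_real)
```

Training would then adjust theta_d so that p_real is low for generated samples and high for real ones, and adjust theta_g to push p_real up.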

The Discriminator is trained to maximize the probability of assigning the correct label to both real data examples and generated samples, while the Generator is trained to minimize log(1 - D(G(z))), in other words, to minimize the probability that the Discriminator answers correctly.
This training task can be viewed as a minimax game with value function V(G, D):
min_G max_D V(G, D) = E_{x ~ p_data(x)}[log D(x)] + E_{z ~ p_z(z)}[log(1 - D(G(z)))]
Put differently, the Generator tries harder to fool the Discriminator, and the Discriminator becomes more critical so as not to be fooled by the Generator.
“Adversarial training is the coolest thing since sliced bread.” — Yann LeCun
Training stops when the Discriminator is unable to distinguish p_g from p_data, i.e. D(x, θ_d) = ½. At that point an equilibrium between the errors of the Generator and the Discriminator is established.
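To make the equilibrium condition concrete, here is a small sketch that estimates the value function from Discriminator outputs. It assumes the standard form from the “Generative Adversarial Nets” paper, V(G, D) = E[log D(x)] + E[log(1 - D(G(z)))], and the scores fed into it are made-up numbers rather than outputs of a trained model.

```python
import math


def value_function(d_real, d_fake):
    """Monte Carlo estimate of V(G, D).

    d_real: D(x) for samples x drawn from the real data
    d_fake: D(G(z)) for samples z drawn from p_z
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term


# A confident, correct Discriminator: high scores on real, low on fake.
v_sharp = value_function([0.9, 0.95, 0.85], [0.1, 0.05, 0.2])

# A Discriminator that can no longer tell the two apart: D = 1/2 everywhere.
v_equilibrium = value_function([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

# The Discriminator wants V large, so it prefers the first situation.
print(v_sharp > v_equilibrium)
# At D = 1/2 the value is log(1/2) + log(1/2) = -log 4.
print(math.isclose(v_equilibrium, -math.log(4.0)))
```

The Generator pulls V down towards this value while the Discriminator pushes it up, which is the tug-of-war the minimax formulation describes.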

Image retrieval for historical archives
An interesting application of GANs is retrieving visually similar marks in the “Prize Papers,” one of the most valuable archives in maritime history. Adversarial nets make it easier to work with these historically important documents, which contain information about the legitimacy of ship captures at sea.

Each query contains examples of Merchant Marks: sketch-like symbols, similar to hieroglyphs, that uniquely identify a merchant’s property.

A feature representation of every mark has to be obtained, but applying conventional machine learning and deep learning methods (including convolutional neural networks) runs into several problems:

they require a large amount of labelled images;
there are no labels for Merchant Marks;
marks are not segmented from the dataset.
This new approach shows how to extract and learn features from images of the Merchant Marks using GANs. Once a representation of each mark has been learned, a visual search over the scanned documents can be performed.

Text translation into images
Other researchers showed that the descriptive properties of natural language can be used to generate corresponding images. Translating text into images is a good way to illustrate how well generative models can mimic samples of real data.

The main problem of image generation is that the image distribution is multimodal: many different images can correctly illustrate the same description. GANs help to solve this problem.
Consider the task of mapping a blue input dot to a green output dot, where several green dots are possible outputs for the same blue dot. A red arrow indicates the prediction error; minimizing it means that after some time the blue dot will be mapped to the mean of the green dots, and this averaging is exactly what causes blurry predicted images.
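The averaging effect is easy to reproduce with a toy example. The sketch below assumes a hypothetical one-dimensional case in which a single input has two equally valid outputs, and it uses mean squared error as the prediction error; both choices are made for illustration only.

```python
def mse(prediction, targets):
    """Mean squared error of one prediction against all valid targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Hypothetical 1-D example: one input with two equally valid outputs.
valid_outputs = [-1.0, 1.0]

# Scan candidate predictions and keep the one with the lowest error.
candidates = [i / 100.0 for i in range(-200, 201)]
best = min(candidates, key=lambda p: mse(p, valid_outputs))

print(best)                   # 0.0, the mean of the valid outputs
print(best in valid_outputs)  # False: the "best" guess is not a valid output
```

For images, the same effect shows up as a blend of all plausible pictures, which looks blurry.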
Generative adversarial nets don’t directly use pairs of inputs and outputs. Instead, they learn how the inputs and outputs can be paired.
Here are the examples of generated images from text descriptions:
Datasets that were used to train GANs:
Caltech-UCSD Birds-200-2011 is an image dataset with photos of 200 bird species, 11,788 images in total;
Oxford-102 Flowers consists of 102 flower categories, with between 40 and 258 images per category.

Drug Discovery
While others apply generative adversarial networks to images and videos, researchers from Insilico Medicine proposed an approach to artificially intelligent drug discovery using GANs.
The goal is to train the Generator to sample drug candidates for a given disease that are as close as possible to existing drugs from a drug database.
After training, the Generator can be used to propose a drug candidate for a previously incurable disease, and the Discriminator to assess whether the sampled drug is likely to treat that disease.

Molecule development in oncology
Another study by Insilico Medicine presented a pipeline for generating new anticancer molecules with a defined set of parameters. The aim is to predict drug responses and to find compounds that are effective against cancer cells.
Researchers proposed an Adversarial Autoencoder (AAE) model for identifying and generating new compounds based on available biochemical data.
“To the best of our knowledge, this is the first application of GANs techniques within the field of cancer drug discovery,” say the researchers.
A large amount of biochemical data is available in databases such as the Cancer Cell Line Encyclopedia (CCLE), Genomics of Drug Sensitivity in Cancer (GDSC), and the NCI-60 cancer cell line collection. All of them contain screening data from drug experiments against cancer.
The Adversarial Autoencoder was trained using Growth Inhibition percentage data (GI, the reduction in the number of cancer cells after treatment), drug concentrations, and fingerprints as inputs.
The fingerprint of the molecule contains a fixed number of bits in which each bit represents the absence or presence of some feature.
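As a rough illustration of such a fingerprint, the sketch below encodes a molecule as a fixed-length list of bits. The five feature names are hypothetical and chosen only for the example; fingerprints used in practice are much longer and are derived from the molecular structure by cheminformatics tools.

```python
# Hypothetical structural features; real fingerprints use hundreds or
# thousands of bits derived from the molecular graph.
FEATURES = ["aromatic_ring", "hydroxyl", "carboxyl", "amine", "halogen"]


def fingerprint(present_features):
    """Return a fixed-length list of bits, one per entry in FEATURES."""
    present = set(present_features)
    unknown = present - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return [1 if name in present else 0 for name in FEATURES]


molecule_a = fingerprint(["aromatic_ring", "hydroxyl"])
molecule_b = fingerprint(["amine"])

print(molecule_a)  # [1, 1, 0, 0, 0]
print(molecule_b)  # [0, 0, 0, 1, 0]
```

Because every molecule maps to a vector of the same length, fingerprints can be fed directly into a network as inputs.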
The latent layer consists of 5 neurons: one is responsible for GI (efficacy against cancer cells), and the other four are matched against a normal distribution by the discriminator. For this reason, a regression term was added to the Encoder cost function. Furthermore, an additional manifold cost constrains the Encoder to map the same fingerprint to the same latent vector, independently of the input concentration.
After training, it is possible to generate molecules from a desired distribution and use a GI-neuron as a tuner of output compounds.
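Using the GI neuron as a tuner can be pictured as follows. This is a sketch under stated assumptions: the position of the GI neuron within the latent vector, the scale of the GI value, and the use of a standard normal for the remaining four coordinates are illustrative choices, and the trained decoder that would turn each latent vector into a fingerprint is not shown.

```python
import random

LATENT_SIZE = 5  # from the text: one GI neuron plus four others
GI_INDEX = 0     # assumption: the text does not say which position holds GI


def sample_latent(target_gi, rng):
    """Build a latent vector with a chosen GI value.

    The other four coordinates are drawn from a standard normal, the
    distribution they were matched against during training.
    """
    latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_SIZE)]
    latent[GI_INDEX] = target_gi
    return latent


rng = random.Random(42)
# Request several candidates at the same desired growth-inhibition level.
batch = [sample_latent(target_gi=0.8, rng=rng) for _ in range(3)]
for latent in batch:
    print(latent)  # each vector would be passed to a trained decoder
```

Changing target_gi while resampling the other coordinates is what lets the model produce different compounds at a requested level of activity.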
The results are as follows: the trained AAE model predicted compounds already proven to be anticancer agents, as well as new untested compounds whose anticancer activity should be validated experimentally.
“Our results suggest that the proposed AAE model significantly enhances the capacity and efficiency of development of the new molecules with specific anticancer properties using the deep generative models.”

Unsupervised learning is the next frontier in artificial intelligence, and we are moving towards it.
Generative adversarial nets can be applied in many fields from generating images to predicting drugs, so don’t be afraid of experimenting with them. We believe they help in building a better future for machine learning.
Below, we give you a few helpful resources to learn more about adversarial nets.

Taken from “Generative Adversarial Nets”:
GANs allow the model to learn that there are many correct answers (i.e. they handle multimodal data well);
semi-supervised learning: features from the Discriminator or inference net could improve performance of classifiers when limited labeled data is available;
one can use adversarial nets to implement a stochastic extension of the deterministic Multi-Prediction Deep Boltzmann Machines;
a conditional generative model p(x|c) can be obtained by adding c as the input to both the Generator and the Discriminator.
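The last point, conditioning on c, can be sketched as follows. Appending a one-hot encoding of c to the inputs is one common way to do it and is an assumption here; the helper names and example values are hypothetical.

```python
def one_hot(index, num_classes):
    """Encode a class label c as a one-hot vector."""
    if not 0 <= index < num_classes:
        raise ValueError("class index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def generator_input(z, c):
    """Input to a conditional Generator: noise z with condition c appended."""
    return list(z) + list(c)


def discriminator_input(x, c):
    """Input to a conditional Discriminator: sample x with the same c appended."""
    return list(x) + list(c)


c = one_hot(2, num_classes=4)
z = [0.3, -1.2]      # latent noise
x = [0.5, 0.1, 0.9]  # a real or generated sample

print(generator_input(z, c))      # [0.3, -1.2, 0.0, 0.0, 1.0, 0.0]
print(discriminator_input(x, c))  # [0.5, 0.1, 0.9, 0.0, 0.0, 1.0, 0.0]
```

Because the Discriminator sees c as well, it can reject samples that look realistic but do not match the condition, which pushes the Generator towards p(x|c).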

Further Reading

What is a Variational Autoencoder?
Ian Goodfellow about GANs for Text on Reddit
“StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks” by Baidu Research [github]
“Generative Visual Manipulation on the Natural Image Manifold” by Adobe Research [github]
“Unsupervised Cross-Domain Image Generation” by Facebook AI Research [github]
“Image-to-Image Translation with Conditional Adversarial Networks” by Berkeley AI Research [github]
