Generative Adversarial Networks (GANs): Engine and Applications

Original article: https://blog.statsbot.co/generative-adversarial-networks-gans-engine-and-applications-f96291965b47

ref:https://zhuanlan.zhihu.com/p/26994666

GAN原理及其应用

Translated by 乾树

How generative adversarial nets are used to make our life better

生成对抗网络如何让生活更美好?
Generative adversarial networks (GANs) are a class of neural networks used in unsupervised machine learning. They help solve tasks such as generating images from descriptions, producing high-resolution images from low-resolution ones, predicting which drug could treat a certain disease, and retrieving images that contain a given pattern.
The Statsbot team asked a data scientist, Anton Karazeev, to give an introduction to how GANs work and to their applications in everyday life.
生成对抗网络(GANs)是属于无监督学习的一种神经网络。它们有助于解决按文本生成图像,提高图片分辨率,按药方抓药,检索特定模式的图片等任务.Statsbot小组请求数据科学家Anton Karazeev 在日常生活中引入GAN原理及其应用。

GANs were introduced by Ian Goodfellow in 2014. They aren’t the only neural-network approach to unsupervised learning. There are also the Boltzmann machine (Geoffrey Hinton and Terry Sejnowski, 1985) and autoencoders (Dana H. Ballard, 1987). Both are dedicated to extracting features from data by learning the identity function f(x) = x, and both rely on Markov chains to train or to generate samples.
Gan由Ian Goodfellow在2014年提出。它们不是神经网络应用在无监督学习中的唯一途径。 还有玻尔兹曼机(Geoffrey Hinton和Terry Sejnowski,1985)和自动解码器(Dana H. Ballard,1987)。 他们都致力于通过学习恒等函数f(x)= x从数据中提取特征,并且它们都依赖马尔可夫链来训练或生成样本。

Generative adversarial networks were designed to avoid Markov chains because of their high computational cost. Another advantage relative to Boltzmann machines is that the Generator function has far fewer restrictions (only a few probability distributions admit Markov chain sampling).
生成对抗网络设计之初衷就是避免使用马尔可夫链,因为后者的计算成本很高。 相对于玻尔兹曼机的另一个优点是生成对抗网络的限制要少得多(只有几个概率分布适用于马尔可夫链抽样)。

In this article, we’ll tell you how generative adversarial nets work and what their most popular applications in real life are. We will also give you links to some helpful resources for getting deeper into these approaches.
The engine of generative adversarial nets
To explain the concept behind GANs, let us use an analogy.
在本文中,我们将告诉你生成对抗网络如何工作,以及现实生活中关于它们的最流行的应用程序是什么。我们还将向您提供一些深入了解这些方法的有用资源的链接。
生成对抗网络原理。
让我们用一个比喻解释 GAN 的原理吧。


Imagine you want to buy a good watch. If you have never bought one, you very likely can’t tell branded watches from fake ones. You need experience to avoid being fooled by the seller.
As you start to label most of the watches as fake (after a number of mistakes, of course), the seller will start to “generate” more convincing copies. This example demonstrates the behavior of generative adversarial networks: the Discriminator (the watch buyer) and the Generator (the seller of fake watches).
现在想象一下你想买块好表。 如果你从未买过表,那么你很可能难辨真假。 你必须有买表的经验,以免被奸商欺骗。当你开始将大多数手表标记为假(当然是被骗之后),卖家将开始“生产”更逼真的山寨表。 这个例子形象地解释了生成对抗网络的基本原理:图像判别模型(手表买家)和图片生成模型(假手表的卖家)。

These two networks, the Discriminator and the Generator, compete with each other. This technique allows the generation of realistic objects (e.g. images). The Generator is forced to produce samples that look real, and the Discriminator learns to distinguish generated samples from samples of real data.
图像判别模型和图片生成模型相互博弈。该技术允许生成现实对象(例如图像)。 图片生成模型强制生成看似真实的样本,图像判别模型学习分辨生成的样本和真实样本。
What’s the difference between discriminative and generative algorithms? In brief: discriminative algorithms learn the boundaries between classes (as the Discriminator does) while generative algorithms learn the distribution of classes (as the Generator does).
判别算法和生成算法有什么不同?简单地说:判别算法学习类之间的边界(就像图像判别模型做的那样),而生成算法学习类的分布(就像图片生成模型所做的那样)。

If you’re ready to go deeper
To learn the Generator’s distribution p_g over data x, a prior distribution p_z(z) over the input noise variables must be defined. Then G(z, θ_g) maps z from the latent space Z to data space, and D(x, θ_d) outputs a single scalar: the probability that x came from the real data rather than from p_g.
如果你准备好深入了解。
要想了解图片生成模型的分布,应该定义数据x的参数 p_g,以及输入噪声变量 p_z(z)的分布。 然后G(z,θ_g)将z从潜在空间Z映射到数据空间,D(x,θ_d)输出来自实数据而不是p_g的单个标量概率x。
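The roles of p_z(z), G(z, θ_g) and D(x, θ_d) can be sketched in a few lines of Python. This is only a toy illustration: the standard normal prior, the affine generator, the logistic discriminator and all parameter values are assumptions chosen for readability, not the networks used in practice.

```python
import math
import random

def sample_noise(rng):
    """Draw z from the noise prior p_z(z); assumed here to be a standard normal."""
    return rng.gauss(0.0, 1.0)

def generator(z, theta_g):
    """G(z, theta_g): map a latent point z to data space (toy affine map)."""
    scale, shift = theta_g
    return scale * z + shift

def discriminator(x, theta_d):
    """D(x, theta_d): one scalar in (0, 1), read as P(x came from real data)."""
    weight, bias = theta_d
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

rng = random.Random(0)
theta_g = (0.5, 2.0)   # hypothetical generator parameters
theta_d = (1.0, -2.0)  # hypothetical discriminator parameters

z = sample_noise(rng)
fake_x = generator(z, theta_g)
score = discriminator(fake_x, theta_d)
print(f"z={z:.3f} -> G(z)={fake_x:.3f} -> D(G(z))={score:.3f}")
```

In a real GAN both functions would be neural networks, but the interface is the same: noise in and a sample out for G, a sample in and a probability out for D.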

The Discriminator is trained to maximize the probability of assigning the correct label to both real data examples and generated samples, while the Generator is trained to minimize log(1 - D(G(z))), in other words, to minimize the probability that the Discriminator answers correctly.
This training task can be viewed as a minimax game with value function V(G, D):
min_G max_D V(G, D) = E_{x ~ p_data(x)}[log D(x)] + E_{z ~ p_z(z)}[log(1 - D(G(z)))]
训练图像判别模型以最大化正确标注实际数据和生成样本的概率。训练图片生成模型用于最小化log(1-D(G(z)))。 换句话说 – 尽量减少图像判别模型得出正确答案的概率。
可以将这样的训练任务看作具有值函数V(G,D)的极大极小博弈:
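As a rough illustration, the value function from the original GAN paper, E_x[log D(x)] + E_z[log(1 - D(G(z)))], can be estimated by averaging over samples. The data distribution, the toy generator and the two discriminators below are assumptions made up for this sketch.

```python
import math
import random

def value_function(real_samples, noise_samples, G, D):
    """Monte Carlo estimate of E_x[log D(x)] + E_z[log(1 - D(G(z)))]."""
    real_term = sum(math.log(D(x)) for x in real_samples) / len(real_samples)
    fake_term = sum(math.log(1.0 - D(G(z))) for z in noise_samples) / len(noise_samples)
    return real_term + fake_term

rng = random.Random(0)
real = [rng.gauss(2.0, 0.5) for _ in range(1000)]   # assumed real data: N(2, 0.5)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # noise prior: N(0, 1)

def G(z):
    """A poor toy generator whose samples are centered at -2, far from the data."""
    return 0.5 * z - 2.0

def sharp_D(x):
    """A discriminator that separates the two clusters well."""
    return 1.0 / (1.0 + math.exp(-4.0 * x))

def blind_D(x):
    """A discriminator that cannot tell the samples apart."""
    return 0.5

v_sharp = value_function(real, noise, G, sharp_D)
v_blind = value_function(real, noise, G, blind_D)
print(f"V with sharp D: {v_sharp:.4f}")
print(f"V with blind D: {v_blind:.4f}")  # equals -log(4)
```

The Discriminator wants V to be large, so the sharp discriminator scores higher here; a better Generator would pull that value back down toward -log 4.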
In other words, the Generator tries harder to fool the Discriminator, and the Discriminator becomes more discerning in order not to be fooled by the Generator.
“Adversarial training is the coolest thing since sliced bread.” — Yann LeCun
Training stops when the Discriminator is unable to distinguish p_g from p_data, i.e. D(x, θ_d) = ½. At that point an equilibrium between the errors of the Generator and the Discriminator is established.
换句话说–图片生成模型努力生成图像判别模型难以辨认的图片,图像判别模型也会愈加聪慧,以免被图片生成模型欺骗。
“对抗训练是继切片面包之后最酷的事情。” – Yann LeCun
当图像判别模型不能区分p_g和p_data,即D(x,θ_d)= 1/2时,训练过程停止。 达成图片生成模型和图像判别模型之间判定误差的平衡。
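Why the value ½ appears can be checked numerically. For a fixed Generator, the original GAN paper shows that the best possible Discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)). The sketch below evaluates this formula for one-dimensional Gaussians; the specific means and standard deviations are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def optimal_discriminator(x, p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)) for a fixed generator."""
    real = p_data(x)
    fake = p_g(x)
    return real / (real + fake)

p_data = lambda x: normal_pdf(x, 2.0, 0.5)         # assumed real data density
p_g_early = lambda x: normal_pdf(x, 0.0, 0.5)      # generator far from the data
p_g_converged = lambda x: normal_pdf(x, 2.0, 0.5)  # generator matches the data

points = [1.5, 2.0, 2.5]
early = [optimal_discriminator(x, p_data, p_g_early) for x in points]
converged = [optimal_discriminator(x, p_data, p_g_converged) for x in points]

print("early:    ", [round(d, 3) for d in early])
print("converged:", [round(d, 3) for d in converged])
```

While p_g is still far from p_data, the best Discriminator is confident near the real data; once the two densities coincide, it can do no better than ½ everywhere.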

Image retrieval for historical archives
An interesting application of GANs is retrieving visually similar marks in the “Prize Papers,” one of the most valuable archives in the field of maritime history. Adversarial nets make it easier to work with documents of historical importance that contain information about the legitimacy of ship captures at sea.
历史档案图像检索
GANs应用程序的一个有趣的例子是在“Prize Papers”中检索相似的标记,Prize Papers 是海洋史上最具价值的档案之一。 对抗网络使得处理这些具有历史意义的文件更加容易,这些文件还包括海上扣留船只是否合法的信息。

Each query contains examples of Merchant Marks: sketch-like symbols, similar to hieroglyphs, that uniquely identify a merchant’s property.
每个查询到的记录都包含商家标记的样例- – 商家属性的唯一标识,类似于象形文字的草图样符号。

A feature representation of every mark has to be obtained, but there are some problems with applying conventional machine learning and deep learning methods (including convolutional neural networks):
我们应该获得每个标记的特征表示,但是应用常规机器学习和深度学习方法(包括卷积神经网络)存在一些问题:

they require a large amount of labelled images;
there are no labels for Merchant Marks;
marks are not segmented from the dataset.
This new approach shows how to extract and learn features from images of the Merchant Marks using GANs. Once a representation of each mark has been learned, a visual search can be run over the scanned documents.
它们需要大量标注过的图像;
商标没有标注;
标记无法从数据集分割出去。
这种新方法显示了如何使用GANs 从商标的图像中提取和学习特征。 在学习每个标记的表示之后,就可以在扫描文档上的按图形搜索。
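Once every mark has a feature vector, the visual search step reduces to a nearest-neighbor lookup. The sketch below ranks marks by cosine similarity; the similarity measure, the four-dimensional vectors and the mark names are all hypothetical and only stand in for features that a trained network would produce.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query, index, top_k=2):
    """Return the names of the top_k marks most similar to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]

# Hypothetical learned features for marks cut from scanned pages.
index = {
    "page_012_mark_a": [0.90, 0.10, 0.00, 0.20],
    "page_047_mark_b": [0.10, 0.80, 0.30, 0.00],
    "page_105_mark_c": [0.85, 0.15, 0.05, 0.25],
}
query = [0.88, 0.12, 0.02, 0.22]  # features of the mark being searched for

results = search(query, index)
print(results)
```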

Text translation into images
Other researchers showed that it’s possible to use descriptive properties of natural language to generate corresponding images. Translating text into images is a good way to illustrate how well generative models can mimic samples of real data.
将文本翻译成图像
有研究人员表明,使用自然语言的描述属性生成相应的图像是可行的。文本转换成图像的方法可以说明生成模型模拟真实数据样本的性能。

The main problem of image generation is that the image distribution is multimodal: many different images can correctly illustrate the same description. GANs help to solve this problem.
图片生成模型的主要问题在于每张图片不止包含一个模型。例如,有太多的例子完美的契合文本描述的内容。GANs可用于解决这一问题。
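The effect of multimodality on a conventional regression loss can be seen in a toy case. Assume one input has two equally valid outputs, -1 and +1 (values invented for this sketch), and look for the single prediction with the lowest mean squared error.

```python
def mse(prediction, targets):
    """Mean squared error of one prediction against several valid targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)

valid_outputs = [-1.0, 1.0]  # two equally correct answers for the same input

# Brute-force search over candidate predictions between -1.5 and 1.5.
candidates = [i / 100 for i in range(-150, 151)]
best = min(candidates, key=lambda c: mse(c, valid_outputs))

print("MSE-optimal prediction:", best)
print("is it a valid output?  ", best in valid_outputs)
```

The error is minimized at the average of the valid outputs, which is itself not a valid output. For images, that average is a blur of all plausible pictures.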
Consider the task of mapping the blue input dot to a green output dot (the green dots are the possible outputs for the blue dot). The red arrow indicates the prediction error; it means that after some time the blue dot will be mapped to the mean of the green dots, and this averaging is exactly what makes predicted images blurry.
Generative adversarial nets don’t directly use pairs of inputs and outputs. Instead, they learn how the inputs and outputs can be paired.
Here are the examples of generated images from text descriptions:
我们来考虑以下任务:将蓝色输入点映射到绿色输出点(绿色点可能是蓝色点的输出)。这个红色箭头表示预测的误差,也意味着经过一段时间后,蓝色点将被映射到绿点的平均值——这一精确映射将会模糊我们试图预测的图像。
生成对抗网络不直接使用输入和输出对。相反地,它们学习如何给输入和输出配对。
下面是从文本描述中生成图像的示例:
Datasets that were used to train GANs:
Caltech-UCSD Birds-200-2011 is an image dataset with photos of 200 bird species; it contains 11,788 images in total;
Oxford-102 Flowers dataset consists of 102 flower categories with between 40 and 258 images per category.
用于训练GAN的数据集:
Caltech-UCSD-200-2011是一个具有200种鸟类照片的总数为11,788的图像数据集。
Oxford-102 花数据集由102个花类别组成,每个类别包含40到258张图片不等。

Drug Discovery
While others apply generative adversarial networks to images and videos, researchers from Insilico Medicine proposed an approach of artificially intelligent drug discovery using GANs.
The goal is to train the Generator to sample drug candidates for a given disease that are as close as possible to existing drugs from a Drug Database.
按方抓药
当其它研究员应用生成对抗网络处理图片和视频时,Insilico 医学的研究人员提出了一种运用GANs的人工智能的按方抓药的方法。
我们的目标是训练图片生成模型,以尽可能精确地从一个药物数据库中对现有药物进行按病取药的操作。
After training, it’s possible to use the Generator to generate a drug for a previously incurable disease, and to use the Discriminator to determine whether the sampled drug actually cures the given disease.
经过训练后,可以使用生成模型产生一种以前不可治愈的疾病的药方,并使用判别模型来确定生成的药方是否确实治愈了给定的疾病。

Molecule development in oncology
Another study by Insilico Medicine presented a pipeline for generating new anticancer molecules with a defined set of parameters. The aim is to predict drug responses and to identify compounds that are effective against cancer cells.
Researchers proposed an Adversarial Autoencoder (AAE) model for identification and generation of new compounds based on available biochemical data.
肿瘤分子生物学的应用
Insilico 医学的另一个研究表明,产生一组按参数定义的新的抗癌分子的管道。其目的是预测具有抗癌作用的药物反应和化合物。
研究人员提出了一个基于现有生化数据的用于识别和生成新化合物的对抗自编码器(AAE)模型。
“To the best of our knowledge, this is the first application of GANs techniques within the field of cancer drug discovery,” say the researchers.
A great deal of biochemical data is available in databases such as the Cancer Cell Line Encyclopedia (CCLE), Genomics of Drug Sensitivity in Cancer (GDSC), and the NCI-60 cancer cell line collection. All of them contain screening data from different drug experiments against cancer.
“据我们所知,这是GANs技术在挖掘癌症药物领域的首个应用。” – 研究人员说。
数据库中有许多可用的生物化学数据,如癌细胞系百科全书(CCLE),肿瘤药物敏感基因学(GDSC)和NCI-60癌细胞系。 所有这些都包含针对癌症的不同药物实验的筛选数据。
Adversarial Autoencoder was trained using Growth Inhibition percentage data (GI, which shows the reduction in the number of cancer cells after the treatment), drug concentrations, and fingerprints as inputs.
The fingerprint of the molecule contains a fixed number of bits in which each bit represents the absence or presence of some feature.
对抗自编码器以药物浓度,和指纹作为输入并使用生长抑制率数据进行训练(GI,显示治疗后癌细胞的数量减少情况)。
分子的指纹在计算机中一个固定的位数表示,每一位代表某些特征的保留状态。
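A fixed-length bit fingerprint is easy to picture in code. In the sketch below the eight feature names are hypothetical placeholders, and the Tanimoto comparison at the end is a common way to compare such fingerprints that is added here for illustration; the article itself does not mention it.

```python
# Hypothetical structural features; real fingerprints use far more bits.
FEATURES = [
    "aromatic_ring", "hydroxyl", "amine", "carbonyl",
    "halogen", "sulfur", "nitro", "ether",
]

def fingerprint(present_features):
    """One bit per feature: 1 if the molecule has it, 0 otherwise."""
    return [1 if name in present_features else 0 for name in FEATURES]

def tanimoto(a, b):
    """Shared set bits divided by bits set in either fingerprint."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0

fp_a = fingerprint({"aromatic_ring", "hydroxyl", "carbonyl"})
fp_b = fingerprint({"aromatic_ring", "carbonyl", "halogen"})

print(fp_a)
print(fp_b)
print("similarity:", tanimoto(fp_a, fp_b))
```

Because every molecule maps to a vector of the same length, fingerprints can be fed directly into a network's input layer.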
The latent layer consists of 5 neurons: one is responsible for GI (efficiency against cancer cells), and the other four are discriminated against a normal distribution. Accordingly, a regression term was added to the Encoder cost function. Furthermore, an additional manifold cost restricted the Encoder to map the same fingerprint to the same latent vector, independently of the input concentration.
隐藏层由5个神经元组成,其中一个负责GI(癌细胞抑制率),另外四个由正态分布判别。 因此,一个回归项被添加到编码器代价函数中。 此外,编码器只能将相同的指纹映射到相同的潜在向量,这一过程独立于通过额外的流形代价集中输入。
After training, it is possible to generate molecules from a desired distribution and use a GI-neuron as a tuner of output compounds.
Results of this work are the following: the trained AAE model predicted compounds that are already proven to be anticancer agents and new untested compounds that should be validated with experiments on anticancer activity.
“Our results suggest that the proposed AAE model significantly enhances the capacity and efficiency of development of the new molecules with specific anticancer properties using the deep generative models.”
经过训练,网络可以从期望的分布中生成分子,并使用GI神经元作为输出化合物的微调器。
这项工作的成果如下:训练过的AAE 模型预测得到的化合物已经被证明是抗癌药物和需接受抗癌活性化合物实验验证的新药,。
“我们的研究结果表明,本文提出的AAE模型使用深度生成模型显著提高了特定抗癌能力和新分子的开发效率。”

Conclusion
结论
Unsupervised learning is the next frontier in artificial intelligence, and we are moving towards it.
Generative adversarial nets can be applied in many fields from generating images to predicting drugs, so don’t be afraid of experimenting with them. We believe they help in building a better future for machine learning.
Below, we give you a few helpful resources to learn more about adversarial nets.
无监督学习是人工智能的下一个蓝海,我们正朝着这一方向迈进。
生成对抗网络可以应用于许多领域,从生成图像到预测药物,所以不要害怕失败。我们相信他们有助于建立一个更好的机器学习的未来。
下面,我们将提供一些有用的资源来了解更多有关对抗网络的信息。

Taken from “Generative Adversarial Nets”:
摘录自“生成对抗网络”:
GANs allow the model to learn that there are many correct answers (i.e. they handle multimodal data well);
semi-supervised learning: features from the Discriminator or inference net could improve performance of classifiers when limited labeled data is available;
one can use adversarial nets to implement a stochastic extension of the deterministic Multi-Prediction Deep Boltzmann Machines;
a conditional generative model p(x|c) can be obtained by adding c as the input to both the Generator and the Discriminator.
GANs允许模型有许多正确的答案(即处理多模数据);
半监督学习:当可用标记数据有限时,来自判别器或前馈网络的特征可以提高分类器的性能;
可以使用对抗网实现确定性多预测深玻尔兹曼机的随机扩展;
可以通过将c输入是生成器和判别器获得条件生成模型P(x | C)。
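The last point, conditioning on c, can be sketched at the level of network inputs. One simple way to add c to both networks is to concatenate it with their usual inputs; the concatenation, the one-hot encoding and the vector sizes below are assumptions for illustration, not a prescription from the paper.

```python
import random

def one_hot(label, num_classes):
    """Encode a class label c as a one-hot condition vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(z, c):
    """The conditional Generator sees the noise z together with the condition c."""
    return z + c

def discriminator_input(x, c):
    """The conditional Discriminator sees the sample x together with the same c."""
    return x + c

rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(4)]  # latent noise vector
x = [0.2, 0.7, 0.1]                          # a toy data sample
c = one_hot(2, num_classes=3)                # condition, e.g. a class label

g_in = generator_input(z, c)
d_in = discriminator_input(x, c)
print(len(g_in), len(d_in))
```

Because the Discriminator also receives c, it can reject samples that look realistic but do not match the condition, which is what pushes the Generator toward p(x|c).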

Further Reading
开拓视野

What is a Variational Autoencoder?
Ian Goodfellow about GANs for Text on Reddit
“StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks” by Baidu Research [github]
“Generative Visual Manipulation on the Natural Image Manifold” by Adobe Research [github]
“Unsupervised Cross-Domain Image Generation” by Facebook AI Research [github]
“Image-to-Image Translation with Conditional Adversarial Networks” by Berkeley AI Research [github]
什么是变体自动编码器?
Ian Goodfellow在Reddit上的关于GANs的文章
“StackGAN:百度研究的使用堆叠生成对抗网络实现的文本到照片合成效果”[github]
Adobe Research的“自然图像流形的生成视觉操作”[github]
Facebook AI研究所的“无监督跨域图像生成”[github]
伯克利AI研究所的“条件对抗网络的图像到图像的转换”
