-rw-r--r-- report/fig/cgan_dropout01.png bin 0 -> 19085 bytes
-rw-r--r-- report/fig/cgan_dropout01_ex.png bin 0 -> 14640 bytes
-rw-r--r-- report/fig/cgan_dropout05.png bin 0 -> 20612 bytes
-rw-r--r-- report/fig/cgan_dropout05_ex.png bin 0 -> 14018 bytes
-rw-r--r-- report/paper.md 109
5 files changed, 71 insertions, 38 deletions
diff --git a/report/fig/cgan_dropout01.png b/report/fig/cgan_dropout01.png
new file mode 100644
index 0000000..450deaf
--- /dev/null
+++ b/report/fig/cgan_dropout01.png
Binary files differ
diff --git a/report/fig/cgan_dropout01_ex.png b/report/fig/cgan_dropout01_ex.png
new file mode 100644
index 0000000..2bbf777
--- /dev/null
+++ b/report/fig/cgan_dropout01_ex.png
Binary files differ
diff --git a/report/fig/cgan_dropout05.png b/report/fig/cgan_dropout05.png
new file mode 100644
index 0000000..0fe282f
--- /dev/null
+++ b/report/fig/cgan_dropout05.png
Binary files differ
diff --git a/report/fig/cgan_dropout05_ex.png b/report/fig/cgan_dropout05_ex.png
new file mode 100644
index 0000000..b9f83fd
--- /dev/null
+++ b/report/fig/cgan_dropout05_ex.png
Binary files differ
diff --git a/report/paper.md b/report/paper.md
index 02a689b..7a26e55 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -7,29 +7,11 @@ Generative Adversarial Networks present a system of models which learn to output
GANs employ two neural networks, a *discriminator* and a *generator*, which compete in a zero-sum game. The task of the *discriminator* is to distinguish generated images from real images, while the task of the *generator* is to produce realistic images that are able to fool the discriminator.
-### Mode Collapse
-
Training a shallow GAN with no convolutional layers poses multiple problems: mode collapse and generating low quality images due to unbalanced G-D losses.
Mode collapse can be observed in figure \ref{fig:mode_collapse}, after 200.000 iterations of the GAN network presented in the appendix, figure \ref{fig:vanilla_gan}. The output of the generator represents only a few of the labels originally fed. At that point the loss function of the generator stops
improving, as shown in figure \ref{fig:vanilla_loss}. We observe the discriminator loss tending to zero as it learns to classify the fake 1's, while the generator is stuck producing 1's.
-\begin{figure}
-\begin{center}
-\includegraphics[width=24em]{fig/generic_gan_loss.png}
-\caption{Shallow GAN D-G Loss}
-\label{fig:vanilla_loss}
-\end{center}
-\end{figure}
-
-\begin{figure}
-\begin{center}
-\includegraphics[width=24em]{fig/generic_gan_mode_collapse.pdf}
-\caption{Shallow GAN mode collapse}
-\label{fig:mode_collapse}
-\end{center}
-\end{figure}
-
A significant improvement to this vanilla architecture is Deep Convolutional Generative Adversarial Networks (DCGAN).
# DCGAN
@@ -66,15 +48,6 @@ We propose 3 different architectures, varying the size of convolutional layers i
\begin{figure}
\begin{center}
-\includegraphics[width=24em]{fig/short_dcgan_ex.pdf}
-\includegraphics[width=24em]{fig/short_dcgan.png}
-\caption{Shallow DCGAN}
-\label{fig:dcshort}
-\end{center}
-\end{figure}
-
-\begin{figure}
-\begin{center}
\includegraphics[width=24em]{fig/med_dcgan_ex.pdf}
\includegraphics[width=24em]{fig/med_dcgan.png}
\caption{Medium DCGAN}
@@ -82,15 +55,6 @@ We propose 3 different architectures, varying the size of convolutional layers i
\end{center}
\end{figure}
-\begin{figure}
-\begin{center}
-\includegraphics[width=24em]{fig/long_dcgan_ex.pdf}
-\includegraphics[width=24em]{fig/long_dcgan.png}
-\caption{Deep DCGAN}
-\label{fig:dclong}
-\end{center}
-\end{figure}
-
We notice that deeper architectures make it easier to balance G-D losses. Medium DCGAN achieves very good performance,
balancing both binary cross entropy losses at around 1 after 5.000 epochs, and showing significantly lower oscillation for longer training even when compared to
Deep DCGAN.
@@ -98,7 +62,7 @@ Deep DCGAN.
Since we are training with no labels, the generator will simply try to output images that fool the discriminator, but do not directly map to one specific class.
Examples of this can be observed for all the output groups reported above as some of the shapes look very odd (but smooth enough to be labelled as real). This
specific issue is solved by training the network for more epochs or introducing a deeper architecture, as can be deduced from a qualitative comparison
-between figures \ref{fig:dcshort}, \ref{fig:dcmed} and \ref{fig:dclong}.
+between figures \ref{fig:dcmed}, \ref{fig:dcshort} and \ref{fig:dclong}.
Applying Virtual Batch Normalization on Medium DCGAN does not provide observable changes in G-D balancing, but reduces within-batch correlation. Although it
is difficult to qualitatively assess the improvements, figure \ref{fig:vbn_dc} shows results of the introduction of this technique.
@@ -111,7 +75,7 @@ is difficult to qualitatively assess the improvements, figure \ref{fig:vbn_dc} s
\end{center}
\end{figure}
-We evaluated the effect of different dropout rates (results in appendix, figures \ref{dcdrop1_1}, \ref{dcdrop1_2}, \ref{dcdrop2_1}, \ref{dcdrop2_2}) and concluded that the optimization
+We evaluated the effect of different dropout rates (results in appendix, figures \ref{fig:dcdrop1_1}, \ref{fig:dcdrop1_2}, \ref{fig:dcdrop2_1}, \ref{fig:dcdrop2_2}) and concluded that the optimization
of this parameter is essential to obtain good performance: a high dropout rate would result in DCGAN producing only artifacts that do not really match any specific class, due to the generator performing better than the discriminator. Conversely, a low dropout rate would lead to an initial stabilisation of G-D losses, but it would result in oscillation when training for a large number of epochs.
While training the different proposed DCGAN architectures, we did not observe mode collapse, confirming that the architecture used performed better than
@@ -128,6 +92,8 @@ smoothing**, **virtual batch normalization**, balancing G and D.
Please perform qualitative analyses on the generated images, and discuss, with results, what
challenge and how they are specifically addressing. Is there the **mode collapse issue?**
+Dropout does not affect the performance of the non-convolutional CGAN architecture as much as it does in DCGAN, as the images produced, together with the G-D loss, remain almost unchanged. Results are presented in figures \ref{fig:cg_drop1_1}, \ref{fig:cg_drop1_2}, \ref{fig:cg_drop2_1}, \ref{fig:cg_drop2_2}.
+
# Inception Score
@@ -225,6 +191,22 @@ architecture and loss function?
\begin{figure}
\begin{center}
+\includegraphics[width=24em]{fig/generic_gan_loss.png}
+\caption{Shallow GAN D-G Loss}
+\label{fig:vanilla_loss}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=24em]{fig/generic_gan_mode_collapse.pdf}
+\caption{Shallow GAN mode collapse}
+\label{fig:mode_collapse}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
\includegraphics[width=24em]{fig/dcgan_dropout01_gd.png}
\caption{DCGAN Dropout 0.1 G-D Losses}
\label{fig:dcdrop1_1}
@@ -254,3 +236,54 @@ architecture and loss function?
\label{fig:dcdrop2_2}
\end{center}
\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=24em]{fig/short_dcgan_ex.pdf}
+\includegraphics[width=24em]{fig/short_dcgan.png}
+\caption{Shallow DCGAN}
+\label{fig:dcshort}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=24em]{fig/long_dcgan_ex.pdf}
+\includegraphics[width=24em]{fig/long_dcgan.png}
+\caption{Deep DCGAN}
+\label{fig:dclong}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=24em]{fig/cgan_dropout01.png}
+\caption{CGAN Dropout 0.1 G-D Losses}
+\label{fig:cg_drop1_1}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=14em]{fig/cgan_dropout01_ex.png}
+\caption{CGAN Dropout 0.1 Generated Images}
+\label{fig:cg_drop1_2}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=24em]{fig/cgan_dropout05.png}
+\caption{CGAN Dropout 0.5 G-D Losses}
+\label{fig:cg_drop2_1}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=14em]{fig/cgan_dropout05_ex.png}
+\caption{CGAN Dropout 0.5 Generated Images}
+\label{fig:cg_drop2_2}
+\end{center}
+\end{figure}
+