author     nunzip <np.scarh@gmail.com>  2019-03-14 18:54:02 +0000
committer  nunzip <np.scarh@gmail.com>  2019-03-14 18:54:02 +0000
commit     5f6549eb38ba92762e7f4419deeb6a9b6284ebf3 (patch)
tree       518aab94d16139f6f1d20e7c3e9f7798db266e0f /report
parent     181a1934631dc353c3e907bcd05cade154183b68 (diff)
Add mode collapse to CDCGAN
Diffstat (limited to 'report')
-rw-r--r--  report/paper.md  42
1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 2ba5401..2177177 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -175,19 +175,34 @@ We find a good balance for 12,000 batches.
\end{center}
\end{figure}
-Oscillation on the generator loss is noticeable in figure {fig:cdcloss} due to the discriminator loss approaching zero. One possible
+Oscillation of the generator loss is noticeable in figure \ref{fig:cdcloss} due to the discriminator loss approaching zero. One possible
adjustment to tackle this issue was balancing the G-D training steps, opting for G/D=3 to give the generator some advantage over the discriminator. This
technique allowed us to smooth the oscillation while producing images of similar quality. A quantitative performance assessment will be performed in the following section.
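
The G/D balancing above amounts to running several generator updates for every discriminator update in the alternating training loop. Below is a minimal sketch of that loop; the update functions are placeholders standing in for the real Keras `train_on_batch` calls, and the batch size and latent dimension are assumptions rather than values taken from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder single-step update functions. In the actual experiments these
# would be one train_on_batch call on the discriminator and one on the
# stacked generator+discriminator model; they are assumptions, not the
# repository's code.
def discriminator_step(real_images, fake_images):
    return float(rng.random())  # stand-in for the discriminator loss

def generator_step(noise, labels):
    return float(rng.random())  # stand-in for the generator loss

G_PER_D = 3        # G/D=3: three generator updates per discriminator update
BATCH_SIZE = 64    # assumed batch size
LATENT_DIM = 100   # assumed latent dimension

for batch in range(12_000):  # the batch budget settled on above
    real = rng.standard_normal((BATCH_SIZE, 28, 28, 1))  # placeholder real batch
    fake = rng.standard_normal((BATCH_SIZE, 28, 28, 1))  # placeholder generated batch
    d_loss = discriminator_step(real, fake)

    # Extra generator steps keep the discriminator from driving its loss
    # to zero, which is what causes the oscillation discussed above.
    for _ in range(G_PER_D):
        noise = rng.standard_normal((BATCH_SIZE, LATENT_DIM))
        labels = rng.integers(0, 10, BATCH_SIZE)
        g_loss = generator_step(noise, labels)
```

The ratio is the only moving part here; everything else is the standard alternating GAN update.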
+Using G/D=6 dampens the oscillation almost completely, but leads to the vanishing discriminator gradient issue. Mode collapse occurs in this specific case, as shown in
+figure \ref{fig:cdccollapse}. Checking the embeddings extracted from a pretrained LeNet classifier (figure \ref{fig:clustcollapse}) we observe low diversity between the features of each class, which
+tend to collapse into very small regions.
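
The embedding check described above can be sketched as follows; the model path and the penultimate layer name `dense_1` are assumptions for illustration, not the repository's actual identifiers.

```python
import numpy as np
from tensorflow.keras.models import Model, load_model

# Load the pretrained LeNet classifier; the file path and layer name
# "dense_1" are assumptions, not the repository's actual identifiers.
lenet = load_model("lenet_mnist.h5")
embedder = Model(inputs=lenet.input,
                 outputs=lenet.get_layer("dense_1").output)  # penultimate layer

def per_class_spread(images, labels, num_classes=10):
    """Mean within-class variance of the LeNet embeddings. Values close to
    zero for a class indicate its samples collapsed onto a tiny region of
    feature space, which is the symptom observed at G/D=6."""
    emb = embedder.predict(images, verbose=0)
    return {c: float(emb[labels == c].var(axis=0).mean())
            for c in range(num_classes)}

# spreads = per_class_spread(generated_images, generated_labels)
```

A near-zero spread for a class is the quantitative counterpart of the collapsed clusters visible in figure \ref{fig:clustcollapse}.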
\begin{figure}
\begin{center}
-\includegraphics[width=12em]{fig/cdcloss1.png}
-\includegraphics[width=12em]{fig/cdcloss2.png}
-\caption{CDCGAN G-D loss; Left G/D=1; Right G/D=3}
+\includegraphics[width=8em]{fig/cdcloss1.png}
+\includegraphics[width=8em]{fig/cdcloss2.png}
+\includegraphics[width=8em]{fig/cdcloss3.png}
+\caption{CDCGAN G-D loss; Left G/D=1; Middle G/D=3; Right G/D=6}
\label{fig:cdcloss}
\end{center}
\end{figure}
+\begin{figure}
+\begin{center}
+\includegraphics[width=8em]{fig/cdc_collapse.png}
+\includegraphics[width=8em]{fig/cdc_collapse.png}
+\includegraphics[width=8em]{fig/cdc_collapse.png}
+\caption{CDCGAN G/D=6 mode collapse}
+\label{fig:cdccollapse}
+\end{center}
+\end{figure}
+
+
Virtual Batch Normalization was not attempted on this architecture, as it significantly
increased the training time (roughly doubling it).
Introducing one-sided label smoothing produced very similar results (figure \ref{fig:cdcsmooth}), hence a quantitative performance assessment will need to
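
One-sided label smoothing softens only the targets of the real samples when training the discriminator, leaving the fake targets at zero. A minimal sketch, assuming the common smoothing value of 0.9 (not confirmed by the repository):

```python
import numpy as np

BATCH_SIZE = 64
SMOOTH = 0.9  # assumed smoothing value; only the real side is softened

real_targets = np.full((BATCH_SIZE, 1), SMOOTH)  # 0.9 instead of 1.0 for real images
fake_targets = np.zeros((BATCH_SIZE, 1))         # fake targets stay at exactly 0

# The discriminator is then trained as usual, e.g. with Keras:
# discriminator.train_on_batch(real_images, real_targets)
# discriminator.train_on_batch(fake_images, fake_targets)
```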
@@ -209,10 +224,11 @@ calculated training the LeNet classifier under the same conditions across all ex
Shallow CGAN & 0.645 & 3.57 & 8:14 \\
Medium CGAN & 0.715 & 3.79 & 10:23 \\
Deep CGAN & 0.739 & 3.85 & 16:27 \\
-\textbf{CDCGAN} & \textbf{0.899} & \textbf{7.41} & 1:05:27 \\
+\textbf{CDCGAN} & \textbf{0.899} & \textbf{7.41} & 1:05:27 \\
Medium CGAN+LS & 0.749 & 3.643 & 10:42 \\
-CDCGAN+LS & 0.846 & 6.63 & 1:12:39 \\
-CCGAN-G/D=3 & 0.849 & 6.59 & 1:04:11 \\
+CDCGAN+LS & 0.846 & 6.63 & 1:12:39 \\
+CDCGAN-G/D=3 & 0.849 & 6.59 & 48:11 \\
+CDCGAN-G/D=6 & 0.801 & 6.06 & 36:05 \\
Medium CGAN DO=0.1 & 0.761 & 3.836 & 10:36 \\
Medium CGAN DO=0.5 & 0.725 & 3.677 & 10:36 \\
Medium CGAN+VBN & 0.735 & 3.82 & 19:38 \\
@@ -268,7 +284,7 @@ As observed in figure \ref{fig:mix1} we performed two experiments for performanc
\end{center}
\end{figure}
-Both experiments show that training the classification network with the injection of generated data (between 40% and 90%) causes on average a small increase in accuracy of up to 0.2%. In absence of original data the testing accuracy drops significantly to around 40% for both cases.
+Both experiments show that training the classification network with the injection of generated data (between 40% and 90%) causes, on average, a small increase in accuracy of up to 0.2%. In the absence of original data the testing accuracy drops significantly, to around 40% in both cases.
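
The injection experiments can be sketched as rebuilding the classifier's training set with a given fraction of generated samples while keeping the total size fixed; `mix_training_set` is a hypothetical helper, not the repository's code.

```python
import numpy as np

def mix_training_set(x_real, y_real, x_gen, y_gen, gen_fraction, rng=None):
    """Replace a fraction of the real training set with generated samples,
    keeping the total size constant. A sketch of the injection experiment,
    not the repository's exact procedure."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(x_real)
    n_gen = int(gen_fraction * n)
    idx_real = rng.choice(n, n - n_gen, replace=False)
    idx_gen = rng.choice(len(x_gen), n_gen, replace=False)
    x = np.concatenate([x_real[idx_real], x_gen[idx_gen]])
    y = np.concatenate([y_real[idx_real], y_gen[idx_gen]])
    perm = rng.permutation(len(x))
    return x[perm], y[perm]

# Sweeping the 40%-90% range from the figure:
# for f in (0.4, 0.6, 0.9):
#     x_tr, y_tr = mix_training_set(x_mnist, y_mnist, x_cdcgan, y_cdcgan, f)
```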
## Adapted Training Strategy
@@ -498,7 +514,15 @@ $$ L_{\textrm{total}} = \alpha L_{\textrm{LeNet}} + \beta L_{\textrm{generator}}
\end{center}
\end{figure}
-\begin{figure}
+\begin{figure}[H]
+\begin{center}
+\includegraphics[width=18em]{fig/clustcollapse.png}
+\caption{CDCGAN G/D=6 Embeddings through LeNet}
+\label{fig:clustcollapse}
+\end{center}
+\end{figure}
+
+\begin{figure}[H]
\begin{center}
\includegraphics[width=8em]{fig/cdcsmooth.png}
\caption{CDCGAN+LS outputs 12000 batches}