Diffstat (limited to 'report')
-rw-r--r-- | report/paper.md | 170 |
1 file changed, 158 insertions, 12 deletions
diff --git a/report/paper.md b/report/paper.md
index b76ba5b..f104cf0 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -103,7 +103,7 @@ While training the different proposed DCGAN architectures, we did not observe mo
CGAN is a conditional version of a GAN which utilises labeled data. Unlike DCGAN, CGAN is trained with explicitly provided labels which allow CGAN to associate features with specific labels. This has the intrinsic advantage of allowing us to specify the label of generated data. The baseline CGAN which we evaluate is visible in figure \ref{fig:cganarc}. The baseline CGAN architecture presents a series of blocks, each containing a Dense layer, a LeakyReLU layer (slope=0.2) and a Batch Normalisation layer. The baseline discriminator uses Dense layers, followed by LeakyReLU (slope=0.2) and a Dropout layer. The optimizer used for training is `Adam`(`learning_rate=0.002`, `beta=0.5`).
-The Convolutional CGAN analysed follows a structure similar to DCGAN and is presented in figure \ref{fig:cdcganarc}.
+The Convolutional CGAN analysed follows the structure presented in the relevant Appendix section. It uses a generator which projects the noise vector onto 7x7 feature maps and upsamples them to 28x28 through transposed convolutions, while the class label is mapped through a dense layer to 64 channels, broadcast to the same spatial size and multiplied element-wise with the convolutional feature maps of both the generator and the discriminator.
We evaluate permutations of the architecture involving:
@@ -175,13 +175,26 @@ the same classes, indicating that mode collapse still did not occur.
\end{center}
\end{figure}
-FIX THIS
-Convolutional CGAN did not achieve better results than our baseline approach for the architecture analyzed, although we believe that
-it is possible to achieve a better performance by finer tuning of the Convolutional CGAN parameters. Figure \ref{fig:cdcloss} shows a very high oscillation
-of the generator loss, hence the image quality varies a lot at each training step. Attempting LS on this architecture achieved a similar outcome
-when compared to the non-convolutional counterpart.
+The best performing architecture was the Convolutional CGAN (CDCGAN). It is difficult to assess any potential improvement at this stage, since the samples produced
+after around 10,000 batches are indistinguishable from those of the MNIST dataset (as can be seen in figure \ref{fig:cdc}). Training CDCGAN for more than
+15,000 batches is, however, not beneficial, as the discriminator keeps improving, leading the generator to produce poor samples, as shown in the reported example (figure \ref{fig:cbalance}).
+We find a good balance at around 12,000 batches.
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=8em]{fig/cdc1.png}
+\includegraphics[width=8em]{fig/cdc2.png}
+\includegraphics[width=8em]{fig/cdc3.png}
+\caption{CDCGAN outputs after 1,000, 12,000 and 20,000 batches}
+\label{fig:cbalance}
+\end{center}
+\end{figure}
-ADD PIC
+
+Virtual Batch Normalisation was not attempted on this architecture, as it approximately
+doubles the training time.
+Introducing one-sided label smoothing produced very similar results, hence a quantitative performance assessment is
+carried out in the next section through the introduction of the Inception Score.
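As a concrete illustration of the baseline CGAN blocks described above, a minimal Keras sketch follows. Only the Dense, LeakyReLU (slope 0.2) and Batch Normalisation generator blocks, the Dense, LeakyReLU and Dropout discriminator blocks and the `Adam`(`learning_rate=0.002`, `beta=0.5`) settings come from the text; the layer widths, the embedding-based label conditioning, the dropout rate and the 0.9 smoothed real-label target are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal sketch of the baseline CGAN described above (Keras functional API).
# Layer widths, the Embedding-based label conditioning, the dropout rate and
# the smoothed real-label target are assumptions for illustration only.
import numpy as np
from keras.layers import (Input, Dense, LeakyReLU, BatchNormalization,
                          Dropout, Embedding, Flatten, Concatenate)
from keras.models import Model
from keras.optimizers import Adam

LATENT_DIM, N_CLASSES, IMG_SIZE = 100, 10, 28 * 28  # assumed sizes

def build_generator():
    z = Input(shape=(LATENT_DIM,))
    label = Input(shape=(1,))
    l = Flatten()(Embedding(N_CLASSES, LATENT_DIM)(label))  # assumed label embedding
    x = Concatenate()([z, l])
    for width in (256, 512, 1024):                           # assumed widths
        x = Dense(width)(x)
        x = LeakyReLU(alpha=0.2)(x)                          # slope = 0.2 (from the text)
        x = BatchNormalization()(x)
    img = Dense(IMG_SIZE, activation='tanh')(x)
    return Model([z, label], img)

def build_discriminator():
    img = Input(shape=(IMG_SIZE,))
    label = Input(shape=(1,))
    l = Flatten()(Embedding(N_CLASSES, IMG_SIZE)(label))
    x = Concatenate()([img, l])
    for width in (512, 256):                                 # assumed widths
        x = Dense(width)(x)
        x = LeakyReLU(alpha=0.2)(x)
        x = Dropout(0.4)(x)                                  # rate assumed
    validity = Dense(1, activation='sigmoid')(x)
    model = Model([img, label], validity)
    model.compile(optimizer=Adam(0.002, 0.5),                # Adam settings from the text
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# One-sided label smoothing: train the discriminator against real targets of
# 0.9 (an assumed value) instead of 1.0, leaving the fake targets at 0.0.
real_y, fake_y = 0.9 * np.ones((64, 1)), np.zeros((64, 1))
```

The smoothed `real_y` targets correspond to the one-sided label smoothing experiment referred to above; the rest of the adversarial training loop is the usual alternation between discriminator and generator updates.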
# Inception Score
@@ -199,9 +212,9 @@ calculated training the LeNet classifier under the same conditions across all ex
Shallow CGAN & 0.645 & 3.57 & 8:14 \\
Medium CGAN & 0.715 & 3.79 & 10:23 \\
Deep CGAN & 0.739 & 3.85 & 16:27 \\
-Convolutional CGAN & 0.737 & 4 & 25:27 \\
+Convolutional CGAN & 0.899 & 7.41 & 1:05:27 \\
Medium CGAN+LS & 0.749 & 3.643 & 10:42 \\
-Convolutional CGAN+LS & 0.601 & 2.494 & 27:36 \\
+Convolutional CGAN+LS & & & 1:12:39 \\
Medium CGAN DO=0.1 & 0.761 & 3.836 & 10:36 \\
Medium CGAN DO=0.5 & 0.725 & 3.677 & 10:36 \\
Medium CGAN+VBN & 0.735 & 3.82 & 19:38 \\
@@ -222,7 +235,7 @@ One sided label smoothing involves relaxing our confidence on the labels in our
### Virtual Batch Normalisation
-Virtual Batch Noramlisation is a further optimisation technique proposed by Tim Salimans et. al. [@improved]. Virtual batch normalisation is a modification to the batch normalisation layer, which performs normalisation based on statistics from a reference batch. We observe that VBN improved the classification accuracy and the Inception score. TODO EXPLAIN WHY
+Virtual Batch Normalisation is a further optimisation technique proposed by Salimans et al. [@improved]. Virtual batch normalisation is a modification to the batch normalisation layer, which performs normalisation based on statistics from a reference batch. We observe that VBN improved the classification accuracy and the Inception score due to the reduction in intra-batch correlation it provides.
### Dropout
@@ -316,8 +329,10 @@ TODO EXPLAIN WHAT WE HAVE DONE HERE
 \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/pca-mnist.png}}\quad
 \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/tsne-mnist.png}}\\
 \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/pca-cgan.png}}\quad
- \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/tsne-cgan.png}}
- \caption{Visualisations: a)MNIST|PCA b)MNIST|TSNE c)CGAN-gen|PCA d)CGAN-gen|TSNE}
+ \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/tsne-cgan.png}}\\
+ \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/pca-cdc.png}}\quad
+ \subfloat[][]{\includegraphics[width=.2\textwidth]{fig/tsne-cdc.png}}
+ \caption{Visualisations: a)MNIST|PCA b)MNIST|TSNE c)CGAN-gen|PCA d)CGAN-gen|TSNE e)CDCGAN-gen|PCA f)CDCGAN-gen|TSNE}
 \label{fig:features}
 \end{figure}
@@ -492,6 +507,137 @@ $$ L_{\textrm{total}} = \alpha L_{\textrm{LeNet}} + \beta L_{\textrm{generator}}
\end{center}
\end{figure}
+## CDCGAN Alternative Architecture
+
+### Generator
+```
+__________________________________________________________________________________________________
+Layer (type) Output Shape Param # Connected to
+==================================================================================================
+input_1 (InputLayer) (None, 100) 0
+__________________________________________________________________________________________________
+dense_2 (Dense) (None, 3136) 316736 input_1[0][0]
+__________________________________________________________________________________________________
+reshape_2 (Reshape) (None, 7, 7, 64) 0 dense_2[0][0]
+__________________________________________________________________________________________________
+conv2d_transpose_1 (Conv2DTrans (None, 14, 14, 64) 36928 reshape_2[0][0]
+__________________________________________________________________________________________________
+batch_normalization_1 (BatchNor (None, 14, 14, 64) 256 conv2d_transpose_1[0][0]
+__________________________________________________________________________________________________
+activation_1 (Activation) (None, 14, 14, 64) 0 batch_normalization_1[0][0]
+__________________________________________________________________________________________________
+input_2 (InputLayer) (None, 1) 0
+__________________________________________________________________________________________________
+conv2d_transpose_2 (Conv2DTrans (None, 28, 28, 64) 36928 activation_1[0][0]
+__________________________________________________________________________________________________
+dense_1 (Dense) (None, 64) 128 input_2[0][0]
+__________________________________________________________________________________________________
+batch_normalization_2 (BatchNor (None, 28, 28, 64) 256 conv2d_transpose_2[0][0]
+__________________________________________________________________________________________________
+reshape_1 (Reshape) (None, 1, 1, 64) 0 dense_1[0][0]
+__________________________________________________________________________________________________
+activation_2 (Activation) (None, 28, 28, 64) 0 batch_normalization_2[0][0]
+__________________________________________________________________________________________________
+up_sampling2d_1 (UpSampling2D) (None, 28, 28, 64) 0 reshape_1[0][0]
+__________________________________________________________________________________________________
+multiply_1 (Multiply) (None, 28, 28, 64) 0 activation_2[0][0]
+ up_sampling2d_1[0][0]
+__________________________________________________________________________________________________
+conv2d_1 (Conv2D) (None, 28, 28, 64) 36928 multiply_1[0][0]
+__________________________________________________________________________________________________
+batch_normalization_3 (BatchNor (None, 28, 28, 64) 256 conv2d_1[0][0]
+__________________________________________________________________________________________________
+activation_3 (Activation) (None, 28, 28, 64) 0 batch_normalization_3[0][0]
+__________________________________________________________________________________________________
+multiply_2 (Multiply) (None, 28, 28, 64) 0 activation_3[0][0]
+ up_sampling2d_1[0][0]
+__________________________________________________________________________________________________
+conv2d_2 (Conv2D) (None, 28, 28, 64) 36928 multiply_2[0][0]
+__________________________________________________________________________________________________
+batch_normalization_4 (BatchNor (None, 28, 28, 64) 256 conv2d_2[0][0]
+__________________________________________________________________________________________________
+activation_4 (Activation) (None, 28, 28, 64) 0 batch_normalization_4[0][0]
+__________________________________________________________________________________________________
+multiply_3 (Multiply) (None, 28, 28, 64) 0 activation_4[0][0]
+ up_sampling2d_1[0][0]
+__________________________________________________________________________________________________
+conv2d_3 (Conv2D) (None, 28, 28, 1) 577 multiply_3[0][0]
+__________________________________________________________________________________________________
+activation_5 (Activation) (None, 28, 28, 1) 0 conv2d_3[0][0]
+==================================================================================================
+Total params: 466,177
+Trainable params: 465,665
+Non-trainable params: 512
+__________________________________________________________________________________________________
+```
+
+### Discriminator
+
+```
+__________________________________________________________________________________________________
+Layer (type) Output Shape Param # Connected to
+==================================================================================================
+input_3 (InputLayer) (None, 28, 28, 1) 0
+__________________________________________________________________________________________________
+input_2 (InputLayer) (None, 1) 0
+__________________________________________________________________________________________________
+conv2d_4 (Conv2D) (None, 28, 28, 64) 640 input_3[0][0]
+__________________________________________________________________________________________________
+dense_3 (Dense) (None, 64) 128 input_2[0][0]
+__________________________________________________________________________________________________
+batch_normalization_5 (BatchNor (None, 28, 28, 64) 256 conv2d_4[0][0]
+__________________________________________________________________________________________________
+reshape_3 (Reshape) (None, 1, 1, 64) 0 dense_3[0][0]
+__________________________________________________________________________________________________
+leaky_re_lu_1 (LeakyReLU) (None, 28, 28, 64) 0 batch_normalization_5[0][0]
+__________________________________________________________________________________________________
+up_sampling2d_2 (UpSampling2D) (None, 28, 28, 64) 0 reshape_3[0][0]
+__________________________________________________________________________________________________
+multiply_4 (Multiply) (None, 28, 28, 64) 0 leaky_re_lu_1[0][0]
+ up_sampling2d_2[0][0]
+__________________________________________________________________________________________________
+conv2d_5 (Conv2D) (None, 28, 28, 64) 36928 multiply_4[0][0]
+__________________________________________________________________________________________________
+batch_normalization_6 (BatchNor (None, 28, 28, 64) 256 conv2d_5[0][0]
+__________________________________________________________________________________________________
+leaky_re_lu_2 (LeakyReLU) (None, 28, 28, 64) 0 batch_normalization_6[0][0]
+__________________________________________________________________________________________________
+multiply_5 (Multiply) (None, 28, 28, 64) 0 leaky_re_lu_2[0][0]
+ up_sampling2d_2[0][0]
+__________________________________________________________________________________________________
+conv2d_6 (Conv2D) (None, 28, 28, 64) 36928 multiply_5[0][0]
+__________________________________________________________________________________________________
+batch_normalization_7 (BatchNor (None, 28, 28, 64) 256 conv2d_6[0][0]
+__________________________________________________________________________________________________
+leaky_re_lu_3 (LeakyReLU) (None, 28, 28, 64) 0 batch_normalization_7[0][0]
+__________________________________________________________________________________________________
+multiply_6 (Multiply) (None, 28, 28, 64) 0 leaky_re_lu_3[0][0]
+ up_sampling2d_2[0][0]
+__________________________________________________________________________________________________
+conv2d_7 (Conv2D) (None, 14, 14, 64) 36928 multiply_6[0][0]
+__________________________________________________________________________________________________
+batch_normalization_8 (BatchNor (None, 14, 14, 64) 256 conv2d_7[0][0]
+__________________________________________________________________________________________________
+leaky_re_lu_4 (LeakyReLU) (None, 14, 14, 64) 0 batch_normalization_8[0][0]
+__________________________________________________________________________________________________
+conv2d_8 (Conv2D) (None, 7, 7, 64) 36928 leaky_re_lu_4[0][0]
+__________________________________________________________________________________________________
+batch_normalization_9 (BatchNor (None, 7, 7, 64) 256 conv2d_8[0][0]
+__________________________________________________________________________________________________
+leaky_re_lu_5 (LeakyReLU) (None, 7, 7, 64) 0 batch_normalization_9[0][0]
+__________________________________________________________________________________________________
+flatten_1 (Flatten) (None, 3136) 0 leaky_re_lu_5[0][0]
+__________________________________________________________________________________________________
+dropout_1 (Dropout) (None, 3136) 0 flatten_1[0][0]
+__________________________________________________________________________________________________
+dense_4 (Dense) (None, 1) 3137 dropout_1[0][0]
+==================================================================================================
+Total params: 152,897
+Trainable params: 152,257
+Non-trainable params: 640
+__________________________________________________________________________________________________
+```
+
## Retrain-Appendix
\begin{figure}[H]
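For readability, the CDCGAN generator summary above can be reproduced with the short Keras sketch below. The layer graph (noise projected to 7x7x64, upsampled to 28x28 with strided transposed convolutions, and the label mapped to 64 channels, broadcast to 28x28 and multiplied element-wise before each convolution) follows the printed summary directly; the 3x3 kernels, the stride of 2 and the ReLU/tanh activations are inferred from the output shapes and parameter counts and should be read as assumptions rather than the exact code used. The discriminator applies the same label-multiplication conditioning.

```python
# Sketch reconstructing the CDCGAN generator from the summary above (Keras
# functional API). Kernel sizes, strides and the ReLU/tanh activations are
# inferred from the printed shapes and parameter counts and are assumptions.
from keras.layers import (Input, Dense, Reshape, Conv2D, Conv2DTranspose,
                          BatchNormalization, Activation, UpSampling2D, Multiply)
from keras.models import Model

def build_cdcgan_generator(latent_dim=100):
    noise = Input(shape=(latent_dim,))                        # input_1
    label = Input(shape=(1,))                                 # input_2

    # Label branch: 1 -> 64 channels, broadcast to a 28x28 conditioning mask
    l = Dense(64)(label)                                      # dense_1 (128 params)
    l = Reshape((1, 1, 64))(l)                                # reshape_1
    l = UpSampling2D(size=(28, 28))(l)                        # up_sampling2d_1

    # Noise branch: project to 7x7x64, then upsample to 28x28 with
    # strided transposed convolutions
    x = Dense(7 * 7 * 64)(noise)                              # dense_2 (316,736 params)
    x = Reshape((7, 7, 64))(x)                                # reshape_2
    x = Conv2DTranspose(64, 3, strides=2, padding='same')(x)  # -> 14x14x64
    x = Activation('relu')(BatchNormalization()(x))
    x = Conv2DTranspose(64, 3, strides=2, padding='same')(x)  # -> 28x28x64
    x = Activation('relu')(BatchNormalization()(x))

    # Three conditional blocks: gate the features with the label mask, then
    # convolve; the final block maps down to the single-channel image.
    for filters, activation in ((64, 'relu'), (64, 'relu'), (1, 'tanh')):
        x = Multiply()([x, l])                                # multiply_1..3
        x = Conv2D(filters, 3, padding='same')(x)             # conv2d_1..3
        if filters > 1:
            x = BatchNormalization()(x)
        x = Activation(activation)(x)                         # last output: 28x28x1 tanh
    return Model([noise, label], x)

generator = build_cdcgan_generator()
generator.summary()  # output shapes and parameter counts match the listing above
```

Multiplying every feature map by the broadcast label embedding lets the label gate each spatial location of the generator and discriminator features, which is what makes this convolutional architecture conditional.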