-rw-r--r-- | report/paper.md | 30
1 file changed, 15 insertions, 15 deletions
diff --git a/report/paper.md b/report/paper.md
index 8ee2eb4..7b86846 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -64,7 +64,7 @@ Our medium depth DCGAN achieves very good performance (figure \ref{fig:dcmed}),
 
 As DCGAN is trained with no labels, the generator's primary objective is to output images that fool the discriminator, but it does not intrinsically separate the classes from each other. Therefore we sometimes observe oddly shaped digits which may temporarily be labeled as real by the discriminator. This issue is solved by training the network for more batches or by introducing a deeper architecture, as can be deduced from a qualitative comparison between figures \ref{fig:dcmed}, \ref{fig:dcshort} and \ref{fig:dclong}.
 
-Applying Virtual Batch Normalization our Medium DCGAN does not provide observable changes in G-D losses, but reduces within-batch correlation. Although it is difficult to qualitatively assess the improvements, figure \ref{fig:vbn_dc} shows results of the introduction of this technique.
+Applying Virtual Batch Normalization on Medium DCGAN does not provide observable changes in G-D losses. Although it is difficult to qualitatively assess the improvements, figure \ref{fig:vbn_dc} shows results of the introduction of this technique.
 
 We evaluated the effect of different dropout rates (results in appendix figures \ref{fig:dcdrop1_1}, \ref{fig:dcdrop1_2}, \ref{fig:dcdrop2_1}, \ref{fig:dcdrop2_2}) and concluded that optimisation of the dropout hyper-parameter is essential for maximising performance. A high dropout rate results in DCGAN producing only artifacts that do not match any specific class, due to the generator performing better than the discriminator. Conversely, a low dropout rate leads to an initial stabilisation of G-D losses, but ultimately results in instability in the form of oscillation when training for a large number of batches.
 
@@ -73,19 +73,11 @@ Trying different parameters for artificial G-D balancing in the training stage d
 exclusively leading to the generation of more artifacts (figure \ref{fig:baldc}).
 We also attempted to increase the D training steps with respect to G, but no mode collapse was observed even with the shallow model.
 
-\begin{figure}
-\begin{center}
-\includegraphics[width=12em]{fig/bal4.png}
-\caption{DCGAN Balancing G-D; D/G=3}
-\label{fig:baldc}
-\end{center}
-\end{figure}
-
 # CGAN
 
 ## CGAN Architecture description
 
-CGAN is a conditional version of a GAN which utilises labeled data. Unlike DCGAN, CGAN is trained with explicitly provided labels which allow CGAN to associate features with specific classes. The baseline CGAN which we evaluate is visible in figure \ref{fig:cganarc}. The baseline CGAN architecture presents a series of blocks, each containing a dense layer, `LeakyReLU` layer (`slope=0.2`) and a Batch Normalisation layer. The baseline discriminator uses Dense layers, followed by `LeakyReLU` (`slope=0.2`) and a Droupout layer.
+CGAN is a conditional version of a GAN which utilises labeled data. Unlike DCGAN, CGAN is trained with explicitly provided labels, which allow it to associate features with specific classes. The baseline CGAN which we evaluate is visible in figure \ref{fig:cganarc}. The generator's architecture presents a series of blocks, each containing a dense layer, a `LeakyReLU` layer (`slope=0.2`) and a Batch Normalisation layer. The baseline discriminator uses Dense layers, followed by `LeakyReLU` (`slope=0.2`) and a Dropout layer.
 The optimizer used for training is `Adam` (`learning_rate=0.002`, `beta=0.5`).
 
 The architecture of the Deep Convolutional CGAN (cDCGAN) analysed is presented in the Appendix. It uses transpose convolutions with a stride of two to perform upscaling, followed by convolutional blocks with a stride of one. We find that a kernel size of 3 by 3 works well for all four convolutional blocks, which include a Batch Normalization and an Activation layer (`ReLU` for the generator and `LeakyReLU` for the discriminator). The architecture assessed in this paper uses multiplying layers between the label embedding and the output `ReLU` blocks, as we found this more robust than adding the label embedding via concatenation. Label embedding
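The baseline blocks described in this hunk translate almost directly into Keras. The sketch below is a minimal illustration, not the paper's implementation: the layer widths, helper names and dropout rate are assumptions; only the block structure (Dense, `LeakyReLU` with slope 0.2, Batch Normalisation for the generator; Dense, `LeakyReLU`, Dropout for the discriminator), the multiplicative label conditioning and the quoted `Adam` settings come from the text above.

```python
# Illustrative sketch of the baseline CGAN blocks; widths, helper names and the
# dropout rate are assumptions, only the block structure and optimiser settings
# are taken from the text.
from tensorflow import keras
from tensorflow.keras import layers

def generator_block(x, units):
    # Dense -> LeakyReLU (slope 0.2) -> Batch Normalisation
    x = layers.Dense(units)(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    return layers.BatchNormalization()(x)

def discriminator_block(x, units, dropout_rate=0.4):
    # Dense -> LeakyReLU (slope 0.2) -> Dropout
    x = layers.Dense(units)(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    return layers.Dropout(dropout_rate)(x)

# Optimiser settings quoted above
optimizer = keras.optimizers.Adam(learning_rate=0.002, beta_1=0.5)

# cDCGAN label conditioning as described: the label embedding is combined with
# the feature maps by element-wise multiplication rather than concatenation, e.g.
#   conditioned = layers.Multiply()([features, label_embedding])
```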
@@ -143,9 +135,9 @@ Performance results for one-sided labels smoothing with `true_labels = 0.9` are
 \end{figure}
 
 Virtual Batch normalization provides results that are difficult to qualitatively assess when compared to the ones obtained through the baseline.
-Applying this technique to both the CGAN architectures used keeps G-D losses
-mostly unchanged. The biggest change we expect to see is a lower correlation between images in the same batch. We expect this aspect to mostly affect
-performance when training a classifier with the generated images from CGAN, as we will obtain more diverse images. Training with a larger batch size
+Applying this technique to Medium CGAN keeps G-D losses
+mostly unchanged. The biggest change we expect to see is a lower dependence of the output on the individual batches. We expect this aspect to mostly affect
+performance when training a classifier with the generated images from CGAN, as we will generate more robust output samples. Training with a larger batch size
 would result in changes that are even more difficult to observe, but since we set `batch_size=128` we expect to see clearer results when performing quantitative measurements.
 
 Similarly to DCGAN, changing the G-D steps did not lead to good quality results, as can be seen in figure \ref{fig:cbalance}, in which we tried to train
@@ -209,7 +201,7 @@ tend to collapse to very small regions.
 
 Virtual Batch Normalization on this architecture was not attempted as it significantly increased the training time (roughly doubling it).
 Introducing one-sided label smoothing produced very similar results (figure \ref{fig:cdcsmooth}), hence a quantitative performance assessment will need to
-be performed in the next section to state which ones are better(through Inception Scores).
+be performed in the next section to state which ones are better (through Inception Scores).
 
 # Inception Score
 
@@ -252,7 +244,7 @@ One sided label smoothing involves relaxing our confidence on the labels in our
 
 ### Virtual Batch Normalisation
 
-Virtual Batch Normalisation is a further optimisation technique proposed by Tim Salimans et. al. [@improved]. Virtual batch normalisation is a modification to the batch normalisation layer, which performs normalisation based on statistics from a reference batch. We observe that VBN improved the classification accuracy and the Inception score due to the provided reduction in intra-batch correlation.
+Virtual Batch Normalisation is a further optimisation technique proposed by Tim Salimans et al. [@improved]. Virtual batch normalisation is a modification to the batch normalisation layer, which performs normalisation based on statistics from a reference batch. We observe that VBN improved the classification accuracy and the Inception score due to the reduced dependency of the output on the individual batches, ultimately resulting in higher sample quality.
 
 ### Dropout
 
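For reference, the reference-batch idea behind Virtual Batch Normalisation can be written in a few lines. This is a simplified sketch of the description given in this hunk (statistics taken from a batch fixed at the start of training); the full formulation of Salimans et al. also folds the current example into those statistics, and all names and shapes below are illustrative assumptions.

```python
# Simplified reference-batch normalisation (the core idea behind VBN).
import numpy as np

def reference_statistics(reference_batch, eps=1e-5):
    # Computed once, from a batch chosen and fixed at the start of training.
    return reference_batch.mean(axis=0), reference_batch.var(axis=0) + eps

def virtual_batch_norm(activations, ref_mean, ref_var, gamma=1.0, beta=0.0):
    # Each new batch is normalised with the reference statistics, so the output
    # depends far less on the other samples that happen to share its batch.
    return gamma * (activations - ref_mean) / np.sqrt(ref_var) + beta

reference_batch = np.random.randn(128, 64)   # e.g. batch_size=128, 64 features
ref_mean, ref_var = reference_statistics(reference_batch)
normalised = virtual_batch_norm(np.random.randn(128, 64), ref_mean, ref_var)
```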
@@ -461,6 +453,14 @@ $$ L_{\textrm{total}} = \alpha L_{\textrm{LeNet}} + \beta L_{\textrm{generator}}
 \end{center}
 \end{figure}
 
+\begin{figure}
+\begin{center}
+\includegraphics[width=12em]{fig/bal4.png}
+\caption{DCGAN Balancing G-D; D/G=3}
+\label{fig:baldc}
+\end{center}
+\end{figure}
+
 ## CGAN-Appendix
 
 \begin{figure}[H]
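One-sided label smoothing with `true_labels = 0.9`, referenced in the hunks above, only relaxes the targets for real samples; the fake targets are left at zero. A minimal sketch, assuming a Keras-style discriminator trained with `train_on_batch` (variable names and the commented training calls are illustrative assumptions):

```python
# One-sided label smoothing: real targets relaxed to 0.9, fake targets kept at 0.
import numpy as np

batch_size = 128                                # value quoted in the text
real_targets = np.full((batch_size, 1), 0.9)    # smoothed "true" labels
fake_targets = np.zeros((batch_size, 1))        # one-sided: fakes are not smoothed

# discriminator.train_on_batch(real_images, real_targets)
# discriminator.train_on_batch(generated_images, fake_targets)
```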