From 9b614419410971920b439be141c1ebfebb8c10fd Mon Sep 17 00:00:00 2001
From: nunzip
Date: Fri, 15 Mar 2019 00:21:13 +0000
Subject: Fix Architecture part and VBN part

---
 report/paper.md | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

(limited to 'report')

diff --git a/report/paper.md b/report/paper.md
index b4e812e..3917bd9 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -88,7 +88,8 @@ but no mode collapse was observed even with the shallow model.
 
 CGAN is a conditional version of a GAN which utilises labeled data. Unlike DCGAN, CGAN is trained with explicitly provided labels which allow CGAN to associate features with specific classes. The baseline CGAN which we evaluate is visible in figure \ref{fig:cganarc}. The baseline CGAN architecture presents a series of blocks, each containing a dense layer, `LeakyReLu` layer (`slope=0.2`) and a Batch Normalisation layer. The baseline discriminator uses Dense layers, followed by `LeakyReLu` (`slope=0.2`) and a Droupout layer. The optimizer used for training is `Adam`(`learning_rate=0.002`, `beta=0.5`).
 
-The architecture of the Deep Convolutional CGAN (cDCGAN) analysed is presented in the Appendix. It uses transpose convolutions with a stride of two to perform upscaling followed by convolutional bloks with singular stride. We find that kernel size of 3 by 3 worked well for all four convolutional blocks which include a Batch Normalization and an Activation layer. The architecture assessed in this paper uses multiplying layers to multiply the label embedding with the output `ReLu` blocks, as we found that it was more robust compared to addition of the label embedding via concatenation.
+The architecture of the Deep Convolutional CGAN (cDCGAN) analysed is presented in the Appendix. It uses transpose convolutions with a stride of two to perform upscaling followed by convolutional blocks with singular stride. We find that kernel size of 3 by 3 worked well for all four convolutional blocks which include a Batch Normalization and an Activation layer (`ReLu` for generator and `LeakyReLu` for discriminator). The architecture assessed in this paper uses multiplying layers between the label embedding and the output `ReLu` blocks, as we found that it was more robust compared to the addition of the label embedding via concatenation. Label embedding
+is performed with a `Dense+Tanh+Upsampling` block, both in the discriminator and the generator, feeding a 64x28x28 input for the multiplication layers.
 
 The list of the architecture we evaluate in this report:
 
@@ -141,10 +142,11 @@ Performance results for one-sided labels smoothing with `true_labels = 0.9` are
 \end{center}
 \end{figure}
 
-Virtual Batch normalization does not affect performance significantly. Applying this technique to both the CGAN architectures used keeps G-D losses
-mostly unchanged. The biggest change we expect to see is a lower correlation between images in the same batch. This aspect will mostly affect
+Virtual Batch normalization provides results that are difficult to qualitatively assess when compared to the ones obtained through the baseline.
+Applying this technique to both the CGAN architectures used keeps G-D losses
+mostly unchanged. The biggest change we expect to see is a lower correlation between images in the same batch. We expect this aspect to mostly affect
 performance when training a classifier with the generated images from CGAN, as we will obtain more diverse images. Training with a larger batch size
-would show more significant results, but since we set this parameter to 128 the issue of within-batch correlation is limited.
+would result in even more difficult changes to observe, but since we set `batch_size=128` we expect to see clearer results when performing quantitative measurements.
 
 Similarly to DCGAN, changing the G-D steps did not lead to good quality results as it can be seen in figure \ref{fig:cbalance}, in which we tried to train with D/G=15 for 10,000 batches, trying to initialize good discriminator weights, to then revert to a D/G=1, aiming to balance the losses of the two networks.
 
@@ -228,8 +230,8 @@ Deep CGAN & 0.739 & 3.85 & 16:27 \\
 \textbf{cDCGAN} & \textbf{0.899} & \textbf{7.41} & 1:05:27 \\
 Medium CGAN+LS & 0.749 & 3.643 & 10:42 \\
 cDCGAN+LS & 0.846 & 6.63 & 1:12:39 \\
-CCGAN-G/D=3 & 0.849 & 6.59 & 48:11 \\
-CCGAN-G/D=6 & 0.801 & 6.06 & 36:05 \\
+cDCGAN-G/D=3 & 0.849 & 6.59 & 48:11 \\
+cDCGAN-G/D=6 & 0.801 & 6.06 & 36:05 \\
 Medium CGAN DO=0.1 & 0.761 & 3.836 & 10:36 \\
 Medium CGAN DO=0.5 & 0.725 & 3.677 & 10:36 \\
 Medium CGAN+VBN & 0.735 & 3.82 & 19:38 \\
-- 
cgit v1.2.3-54-g00ecf
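
The cDCGAN change introduced in the first hunk (a `Dense+Tanh+Upsampling` label-embedding block whose output multiplies the `ReLu` feature maps) can be sketched as below. This is a minimal illustration assuming Keras with channels-last tensors, so the 64x28x28 embedding appears as 28x28x64; the `Embedding(10, 50)` lookup, the latent dimension, and the layer widths are assumptions for illustration, not the report's exact implementation.

```python
# Minimal sketch (not the report's exact code) of the cDCGAN generator's
# label-conditioning path: Dense + Tanh + Upsampling producing a 28x28x64
# map that multiplies the ReLU feature maps before the output convolution.
from tensorflow.keras.layers import (Input, Dense, Reshape, Flatten, Embedding,
                                     UpSampling2D, Conv2D, Conv2DTranspose,
                                     BatchNormalization, Activation, Multiply)
from tensorflow.keras.models import Model


def label_branch(label, channels=64):
    # Assumed embedding sizes; only the Dense+Tanh+Upsampling structure and
    # the 28x28x64 output shape come from the report.
    e = Embedding(10, 50)(label)
    e = Flatten()(e)
    e = Dense(7 * 7 * channels, activation="tanh")(e)
    e = Reshape((7, 7, channels))(e)
    return UpSampling2D(size=(4, 4))(e)  # 7x7 -> 28x28


def build_generator(latent_dim=100):
    noise = Input(shape=(latent_dim,))
    label = Input(shape=(1,), dtype="int32")

    # Noise path: dense projection, then stride-2 transpose convolutions for
    # upscaling, each followed by a stride-1 convolutional block (3x3 kernels,
    # Batch Normalization, ReLU activation).
    g = Dense(7 * 7 * 128)(noise)
    g = Reshape((7, 7, 128))(g)
    for filters in (128, 64):
        g = Conv2DTranspose(filters, 3, strides=2, padding="same")(g)
        g = BatchNormalization()(g)
        g = Activation("relu")(g)
        g = Conv2D(filters, 3, strides=1, padding="same")(g)
        g = BatchNormalization()(g)
        g = Activation("relu")(g)  # ends at 28x28x64

    # Multiply the label embedding with the output ReLU feature maps.
    g = Multiply()([g, label_branch(label, channels=64)])

    img = Conv2D(1, 3, padding="same", activation="tanh")(g)
    return Model([noise, label], img, name="cdcgan_generator")
```

Per the commit text, the discriminator side would mirror this conditioning: its own `Dense+Tanh+Upsampling` branch multiplied with its (`LeakyReLu`) feature maps.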