From d93a0336889147cbfe2f83720fa950cec61ac94b Mon Sep 17 00:00:00 2001
From: Vasil Zlatanov
Date: Fri, 15 Mar 2019 22:06:16 +0000
Subject: Improve page 3

---
 report/paper.md | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/report/paper.md b/report/paper.md
index afcb418..fd8512f 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -119,9 +119,8 @@ The image quality is better than the two examples reported earlier, proving that
 Unlike DCGAN, the three levels of dropout rate attempted do not affect the performance significantly, and as we can see in figures \ref{fig:cg_drop1_1} (0.1), \ref{fig:cmed}(0.3) and \ref{fig:cg_drop2_1}(0.5), both image quality and G-D losses are comparable.
-The biggest improvement in performance is obtained through one-sided label smoothing, shifting the true labels form 1 to 0.9 to reinforce discriminator behaviour.
-Using 0.1 instead of zero for the fake labels does not improve performance, as the discriminator loses incentive to do better (generator behaviour is reinforced).
-Performance results for one-sided labels smoothing with `true_labels = 0.9` are shown in figure \ref{fig:smooth}.
+The biggest improvement in performance is obtained through one-sided label smoothing, shifting the true labels from 1 to 0.9.
+Using 0.1 instead of zero for the fake labels does not improve performance, as the discriminator loses incentive to do better (generator behaviour is reinforced) [@improved]. Performance results for one-sided label smoothing with `true_labels = 0.9` are shown in figure \ref{fig:smooth}.
 \begin{figure}
 \begin{center}
@@ -131,16 +130,17 @@ Performance results for one-sided labels smoothing with `true_labels = 0.9` are
 \end{center}
 \end{figure}
-Virtual Batch Normalization provides results that are difficult to qualitatively assess when compared to the ones obtained through the baseline.
-Applying this technique to Medium CGAN keeps G-D losses
-mostly unchanged. The biggest change we expect to see is a lower dependence of the output on the individual batches. We expect this aspect to mostly affect
-performance when training a classifier with the generated images from CGAN, as we will generate more robust output samples. Training with a larger batch size
-would result in even more difficult changes to observe, but since we set `batch_size=128` we expect to see clearer results when performing quantitative measurements.
+Virtual Batch Normalization provides results that are
+difficult to assess qualitatively when compared to the ones obtained with the baseline.
+Applying VBN does not significantly affect the G-D loss curves.
+We expect it to affect
+performance most when training a classifier with the generated images from CGAN, as we
+will generate more robust output samples. Training with a larger batch size
+may make the effect even more difficult to observe, but since we ran with a `batch_size` of 128 we see clear effects in the quantitative measurements.
 Similarly to DCGAN, changing the G-D steps did not lead to good quality results as it can be seen in figure \ref{fig:cbalance}, in which we tried to train with D/G=15 for 10,000 batches, trying to initialize good discriminator weights, to then revert to a D/G=1, aiming to balance the losses of the two networks.
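As a concrete illustration of the one-sided label smoothing change above, here is a minimal Keras-style sketch. The toy discriminator, the random image batches and the layer sizes are placeholders rather than the report's actual CGAN code; the point is only that real targets are shifted to 0.9 while fake targets stay at 0.

```python
import numpy as np
from tensorflow.keras import layers, models

# Toy stand-in discriminator with a single sigmoid real/fake output
# (an assumed simplification of the report's CGAN discriminator).
discriminator = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

batch_size = 128
real_images = np.random.rand(batch_size, 28, 28)  # placeholder for an MNIST batch
fake_images = np.random.rand(batch_size, 28, 28)  # placeholder for generator output

# One-sided label smoothing: real targets are shifted from 1 to 0.9,
# fake targets are left at 0 so the discriminator keeps its incentive.
real_labels = np.full((batch_size, 1), 0.9)
fake_labels = np.zeros((batch_size, 1))

d_loss_real = discriminator.train_on_batch(real_images, real_labels)
d_loss_fake = discriminator.train_on_batch(fake_images, fake_labels)
print(d_loss_real, d_loss_fake)
```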
-Even in the case of a shallow network, in which mode collapse should have been more likely, we observed diversity between the samples produced for
-the same classes, indicating that mode collapse still did not occur.
+Even for a shallow network, where we initially expected mode collapse, we found diversity between the samples produced for the same classes, showing that it did not occur.
 \begin{figure}
 \begin{center}
 \end{center}
 \end{figure}
@@ -168,9 +168,9 @@ We find a good balance for 12,000 batches.
 \end{figure}
 Oscillation on the generator loss is noticeable in figure \ref{fig:cdcloss} due to the discriminator loss approaching zero. One possible
-adjustment to tackle this issue was balancing G-D training steps, opting for G/D=3, allowing the generator to gain some advantage over the discriminator. This
+adjustment to tackle this issue is to use an unbalanced proportion of G-D training steps, such as $G/D=3$, allowing the generator to gain some advantage over the discriminator. This
 technique allowed to smooth oscillation while producing images of similar quality.
-Using G/D=6 dampens oscillation almost completely leading to the vanishing discriminator's gradient issue. Mode collapse occurs in this specific case as shown on
+Using $G/D=6$ dampens the oscillation almost completely, leading to the discriminator's vanishing gradient issue. Mode collapse occurs in this specific case as shown in
 figure \ref{fig:cdccollapse}. Checking the PCA embeddings extracted from a pretrained LeNet classifier (figure \ref{fig:clustcollapse}) we observe low diversity between features of each class, that tend to collapse to very small regions.
@@ -197,18 +197,18 @@ tend to collapse to very small regions.
 Virtual Batch Normalization on this architecture was not attempted as it significantly increased the training time (about twice more).
-Introducing one-sided label smoothing produced very similar results (figure \ref{fig:cdcsmooth}), hence a quantitative performance assessment will need to
-be performed in the next section to state which ones are better (through Inception Scores).
+Introducing one-sided label smoothing produced very similar results (figure \ref{fig:cdcsmooth}), hence a quantitative performance assessment using the Inception Score is presented in the next section.
 # Inception Score
-Inception score is calculated as introduced by Tim Salimans et. al [@improved]. However as we are evaluating MNIST, we use LeNet-5 [@lenet] as the basis of the Inception score.
-We use the logits extracted from LeNet:
+The Inception Score is calculated as introduced by Tim Salimans et al. [@improved], who used it to evaluate models trained on the CIFAR-10 dataset. However, as we are evaluating MNIST, we use LeNet-5 [@lenet] as the basis of the Inception Score, instead of the original Inception network.
+
+To calculate the score we use the logits extracted from LeNet:
 $$ \textrm{IS}(x) = \exp(\mathbb{E}_x \left( \textrm{KL} ( p(y\mid x) \| p(y) ) \right) ) $$
 We further report the classification accuracy as found with LeNet. For coherence purposes the Inception Scores were
-calculated training the LeNet classifier under the same conditions across all experiments (100 epochs with `SGD`, `learning rate=0.001`).
+calculated by training the LeNet classifier under the same conditions across all experiments (100 epochs of SGD with a learning rate of 0.001).
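To make the score above concrete, a minimal sketch of the computation under the stated setup: `probs` stands in for the softmax of the LeNet-5 logits evaluated on generated samples, and the random logits below are placeholders rather than experimental results.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Inception-style score from class probabilities p(y|x) of generated samples."""
    p_y = probs.mean(axis=0, keepdims=True)                 # marginal distribution p(y)
    kl = probs * (np.log(probs + eps) - np.log(p_y + eps))  # per-sample KL(p(y|x) || p(y))
    return float(np.exp(kl.sum(axis=1).mean()))             # exp of the expected KL

# Placeholder predictions over the 10 MNIST classes; in the report these
# would come from softmax(LeNet logits) on generated images.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(inception_score(probs))
```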
 \begin{table}[H]
 \begin{tabular}{llll}
@@ -233,15 +233,15 @@ Medium CGAN+VBN+LS & 0.763 & 3.91 & 19:43 \\
 ### Architecture
-We observe increased accruacy as we increase the depth of the GAN arhitecture at the cost of training time. There appears to be diminishing returns with the deeper networks, and larger improvements are achievable with specific optimisation techniques. cDCGAN achieves improved performance in comparison to the other cases analysed as we expected from the results obtained in the previous section, since the samples produced are almost identical to the ones of the original MNIST dataset.
+We observe increased accuracy as we increase the depth of the GAN architecture, at the cost of training time. There appear to be diminishing returns with the deeper networks, and larger improvements are achievable with specific optimisation techniques. cDCGAN achieves improved performance in comparison to the other networks, as expected from the qualitative observations, where we found the samples produced to be almost indistinguishable from the ones of the original MNIST dataset.
 ### One Side Label Smoothing
-One sided label smoothing involves relaxing our confidence on data labels. Tim Salimans et. al. [@improved] show smoothing of the positive labels reduces the vulnerability of the neural network to adversarial examples. We observe significant improvements to the Inception Score and classification accuracy in the case of our baseline (Medium CGAN). This technique however did not improve the performance of cDCGAN any further, suggesting that reinforcing discriminator behaviour does not benefit the system in this case.
+One-sided label smoothing involves relaxing our confidence in the data labels. Tim Salimans et al. [@improved] show that smoothing the positive labels reduces the vulnerability of the neural network to adversarial examples. We observe significant improvements to the Inception Score and classification accuracy in the case of our baseline (Medium CGAN). This technique, however, did not improve the performance of cDCGAN any further, suggesting that the shifted discriminator target does not benefit the system in this case.
 ### Virtual Batch Normalization
-Virtual Batch Normalization is a further optimisation technique proposed by Tim Salimans et. al. [@improved]. Virtual batch normalization is a modification to the batch normalization layer, which performs normalization based on statistics from a reference batch. We observe that VBN improved the classification accuracy and the Inception Score due to the provided reduction in output dependency from the individual batches, ultimately resulting in a higher samples' quality.
+Virtual Batch Normalization is a further optimisation technique proposed by Tim Salimans et al. [@improved]. It is a modification to the batch normalization layer which performs normalization based on statistics from a fixed reference batch, reducing the dependency of each output on the other inputs in the same minibatch [@improved]. We observe that VBN improved the classification accuracy and the Inception Score.
 ### Dropout
--
cgit v1.2.3-54-g00ecf
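Finally, a simplified, parameter-free sketch of the virtual batch normalization idea discussed in the patch: a reference batch is fixed once and its statistics are reused to normalise later minibatches. The full method of Salimans et al. also combines each example with the reference statistics and learns scale and offset parameters; the arrays below are placeholders, not the report's code.

```python
import numpy as np

def virtual_batch_norm(x, ref_batch, eps=1e-5):
    """Normalise activations x with statistics of a fixed reference batch."""
    mu = ref_batch.mean(axis=0)    # per-feature mean of the reference batch
    var = ref_batch.var(axis=0)    # per-feature variance of the reference batch
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
reference = rng.normal(size=(128, 64))  # reference batch, chosen once before training
batch = rng.normal(size=(128, 64))      # a later minibatch of activations
print(virtual_batch_norm(batch, reference).mean())
```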