Diffstat (limited to 'report')
 report/paper.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 6ec9f57..e6894bb 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -154,6 +154,8 @@ Medium CGAN+VBN+LS & 0.783 & 4.31 & 10:38 \\
### Architecture
+We observe increased accuracy as we increase the depth of the architecture, at the cost of longer training time. There appear to be diminishing returns with the deeper networks, and larger improvements are achievable with specific optimisation techniques.
+
### One-Sided Label Smoothing
\begin{figure}
@@ -165,17 +167,15 @@ Medium CGAN+VBN+LS & 0.783 & 4.31 & 10:38 \\
\end{center}
\end{figure}
-
+One-sided label smoothing involves relaxing our confidence in the labels of our data, lowering the discriminator's target for real samples from 1 to a value slightly below 1. Salimans et al. [@improved] show that smoothing only the positive labels reduces the vulnerability of the neural network to adversarial examples. We observe significant improvements in both the Inception score and classification accuracy.
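+
+As an illustrative sketch of how this could be applied (assuming a discriminator trained with binary cross-entropy; the 0.9 target and the helper name are our own choices, not part of our actual training code):
+
+```python
+import numpy as np
+
+# One-sided label smoothing: real samples are trained against a target of
+# 0.9 instead of 1.0, while fake samples keep a target of 0.0.
+def discriminator_targets(batch_size, smoothing=0.9):
+    real_targets = np.full((batch_size, 1), smoothing)  # smoothed positives
+    fake_targets = np.zeros((batch_size, 1))            # negatives untouched
+    return real_targets, fake_targets
+
+# Inside the training loop (discriminator assumed to use binary cross-entropy):
+#   real_y, fake_y = discriminator_targets(len(real_batch))
+#   discriminator.train_on_batch(real_batch, real_y)
+#   discriminator.train_on_batch(fake_batch, fake_y)
+```
+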
### Virtual Batch Normalisation
+Virtual batch normalisation is a further optimisation technique proposed by Salimans et al. [@improved]. It modifies the batch normalisation layer so that normalisation is performed using statistics collected from a fixed reference batch chosen at the start of training, rather than from the current minibatch. We observe that VBN improves both the classification accuracy and the Inception score.
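+
+A minimal sketch of the idea (a simplified version that normalises with the reference statistics only; the function and parameter names are our own, assuming NumPy arrays):
+
+```python
+import numpy as np
+
+def virtual_batch_norm(batch, reference_batch, gamma=1.0, beta=0.0, eps=1e-5):
+    # Normalise `batch` using the mean/variance of a fixed reference batch,
+    # so the result does not depend on the other examples in `batch`.
+    mu = reference_batch.mean(axis=0)
+    var = reference_batch.var(axis=0)
+    return gamma * (batch - mu) / np.sqrt(var + eps) + beta
+```
+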
### Dropout
-The effect of dropout for the non-convolutional CGAN architecture does not affect performance as much as in DCGAN, nor does it seem to affect the quality of images produced, together with the G-D loss remain almost unchanged. Results are presented in figures \ref{fig:cg_drop1_1}, \ref{fig:cg_drop1_2}, \ref{fig:cg_drop2_1}, \ref{fig:cg_drop2_2}.
-
-
-**Please measure and discuss the inception scores for the different hyper-parameters/tricks and/or
+Dropout in the non-convolutional CGAN architecture does not affect performance as much as it does in DCGAN, nor does it seem to affect the quality of the images produced; the G-D loss also remains almost unchanged. Results are presented in figures \ref{fig:cg_drop1_1}, \ref{fig:cg_drop1_2}, \ref{fig:cg_drop2_1}, \ref{fig:cg_drop2_2}.
# Re-training the handwritten digit classifier
@@ -252,11 +252,11 @@ boosted to 92%, making this technique the most successfull attempt of improvemen
Examples of failed classifications are displayed in figure \ref{fig:retrain_fail}. The results shown indicate that the network we trained is actually performing quite well,
as most of the testing images that were misclassified (mainly nines and fours) show ambiguities.
-# Bonus
+# Bonus Questions
## Relation to PCA
-Similarly to GAN's, PCA can be used to formulate **generative** models of a system. While GAN's are trained neural networks, PCA is a definite statistical procedure which perform orthogonal transformations of the data. While both attempt to identify the most important or *variant* features of the data (which we may then use to generate new data), PCA by itself is only able to extract linearly related features. In a purely linear system, a GAN would be converging to PCA. In a more complicated system, we would indeed to identify relevant kernels in order to extract relevant features with PCA, while a GAN is able to leverage dense and convolutional neural network layers which may be trained to perform relevant transformations.
+Similarly to GANs, PCA can be used to formulate **generative** models of a system. While GANs are trained neural networks, PCA is a deterministic statistical procedure which performs orthogonal transformations of the data. Both attempt to identify the most important or most *variant* features of the data (which we may then use to generate new data), but PCA by itself is only able to extract linearly related features. In a purely linear system, a GAN would converge towards PCA. In a more complicated system, we would need to identify relevant kernels in order to extract relevant features with PCA, while a GAN is able to leverage dense and convolutional neural network layers which may be trained to perform the relevant transformations.
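+
+As a minimal sketch of what generating from PCA can look like (assuming scikit-learn and a matrix `X` of flattened training images; the Gaussian model of the component scores is our own illustrative choice):
+
+```python
+import numpy as np
+from sklearn.decomposition import PCA
+
+def fit_pca_generator(X, n_components=50):
+    # Project the training data onto its principal components.
+    pca = PCA(n_components=n_components)
+    scores = pca.fit_transform(X)
+    # Model each component score with an independent Gaussian.
+    return pca, scores.mean(axis=0), scores.std(axis=0)
+
+def sample_pca(pca, mu, sigma, n=16):
+    # Draw new score vectors and map them back to pixel space.
+    z = np.random.normal(mu, sigma, size=(n, len(mu)))
+    return pca.inverse_transform(z)
+```
+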
* This is an open question. Do you have any other ideas to improve GANs or
have more insightful and comparative evaluations of GANs? Ideas are not limited. For instance,