author     Vasil Zlatanov <v@skozl.com>  2018-11-19 17:54:22 +0000
committer  Vasil Zlatanov <v@skozl.com>  2018-11-19 17:54:22 +0000
commit     b8495065f086c784d37215704b49a71b3eefc5cb (patch)
tree       8c52bc34ab62a43d2589f1cbdfd5dec8072d0aa5
parent     59888faaa22d4af43a69707a865ab0264abd3afd (diff)
Add more plots for part 3
-rwxr-xr-x  report/paper.md  7
1 file changed, 5 insertions, 2 deletions
diff --git a/report/paper.md b/report/paper.md
index b8f444e..b1f72f0 100755
--- a/report/paper.md
+++ b/report/paper.md
@@ -403,7 +403,7 @@ This technique is not biased towards statistically better models and values a
Given that the model can output confidence about the label it is able to predict, we can factor the confidence of the model into the final output of the committee machine. For instance, if a specialised model says with 95% confidence that the label for the test case is "A", and two other models only classify it as "B" with 40% confidence, we would be inclined to trust the first model and classify the result as "A".
-This technique is reliant on the model producing a confidence score for the label(s) it guesses. For K-Nearest neighbours where $K \gt 1$ we may produce a confidence based on the proportion of the K nearest neighbours which are the same class. For instance if $K = 5$ and 3 out of the 5 nearest neighbours are of class "C" and the other two are class "B" and "D", then we may say that the predictions are classes C, B and D, with confidence of 60%, 20% and 20% respectively.
+This technique relies on the model producing a confidence score for the label(s) it predicts. For K-nearest neighbours with $K > 1$ we may produce a confidence based on the proportion of the $K$ nearest neighbours which share the same class. For instance, if $K = 5$ and 3 out of the 5 nearest neighbours are of class "C" while the other two are of classes "B" and "D", then we may say that the predictions are classes C, B and D, with confidences of 60%, 20% and 20% respectively.
In our testing we have elected to use a committee machine employing majority voting, as we identified that a nearest neighbour strategy with only **one** neighbour ($K=1$) performed best.
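
A minimal sketch of this neighbour-proportion confidence, assuming numpy arrays for the projected training data and labels (the helper name `knn_confidence` and its signature are illustrative, not the report's actual code):

```python
import numpy as np
from collections import Counter

def knn_confidence(train_data, train_labels, test_point, k=5):
    # Distance from the test point to every training sample (one row per sample)
    distances = np.linalg.norm(train_data - test_point, axis=1)
    # Labels of the k closest training samples
    nearest = train_labels[np.argsort(distances)[:k]]
    counts = Counter(nearest.tolist())
    # Confidence of a candidate label = fraction of the k neighbours carrying it,
    # e.g. 3 of 5 neighbours "C", one "B", one "D" -> [(C, 0.6), (B, 0.2), (D, 0.2)]
    return [(label, count / k) for label, count in counts.most_common()]
```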
@@ -433,7 +433,10 @@ The randomness hyper-parameter regarding feature space randomisation can be defi
The optimal number of constant and random eigenvectors to use is therefore an interesting question.
-The optimal randomness after doing an exhaustive search peaks at 95 randomised eigenvalues out of 155 total eigenvalues, or 60 static and 95 random eigenvalues.
+![Optimal M and Randomness Hyperparameter\label{fig:opti-rand}](fig/vaskplot1.pdf)
+![Optimal M and Randomness Hyperparameter\label{fig:opti-rand2}](fig/vaskplot3.pdf)
+
+The optimal randomness, after an exhaustive search, peaks at 95 randomised eigenvectors out of 155 total eigenvectors, that is, 60 static and 95 random eigenvectors, as seen in figure \ref{fig:opti-rand}. The value of $M_{\textrm{lda}}$ in the figures is the maximum of 51.
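
As an illustrative sketch of this feature-space randomisation (not our actual implementation), each committee member could keep the leading static eigenvectors and draw the rest at random from the remainder; the helper name, column layout and default counts below are assumptions:

```python
import numpy as np

def random_eigenvector_subset(eigvecs, n_static=60, n_random=95, rng=None):
    # Columns of eigvecs are eigenvectors sorted by decreasing eigenvalue
    rng = np.random.default_rng() if rng is None else rng
    static = eigvecs[:, :n_static]                    # kept by every committee member
    pool = np.arange(n_static, eigvecs.shape[1])      # candidates for the random draw
    chosen = rng.choice(pool, size=n_random, replace=False)
    return np.hstack([static, eigvecs[:, chosen]])    # 60 + 95 = 155 eigenvectors per member
```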
## Comparison