From 17528eb77d8f7c888186a85f17fa60ad0ffd6cf8 Mon Sep 17 00:00:00 2001
From: Vasil Zlatanov
Date: Tue, 12 Feb 2019 19:56:57 +0000
Subject: Remove repetitive line

---
 report/paper.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/report/paper.md b/report/paper.md
index 07ced70..d5d5d37 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -50,7 +50,7 @@ for K-means 100 cluster centers.
 \end{center}
 \end{figure}
 
-Random forests will select a random number of features on which to apply a weak learner (such as axis aligned split) and then select the best feature of the selected ones to perform the split on, based on some criteria (our results use the *Gini index*). The fewer features that are compared for each split the quicker the trees are built and the more random they are. Therefore the randomness parameter can be considered the number of features used when making splits. We evaluate accuracy given different randomness when using a K-means vocabulary in figure \ref{fig:kmeanrandom}. The results in figure \ref{fig:kmeanrandom} keep the number forest size static at 100 trees. We use forest size of 100 as we found that this estimatator count the improvement for $\sqrt{n}$ performance gains tend to plateau.
+Random forests select a random subset of features on which to apply a weak learner (such as an axis-aligned split) and then pick the best of those features to split on, based on some criterion (our results use the *Gini index*). The fewer features compared at each split, the quicker the trees are built and the more random they are; the randomness parameter can therefore be taken as the number of features considered per split. We evaluate accuracy for different randomness when using a K-means vocabulary in figure \ref{fig:kmeanrandom}. The results in the figure use a forest size of 100, as we found that beyond this estimator count the $\sqrt{n}$ performance gains tend to plateau.
 
 \begin{figure}[H]
 \begin{center}
--
cgit v1.2.3
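In scikit-learn terms, the randomness parameter described in the patched paragraph corresponds to `max_features` of a `RandomForestClassifier`. The following is a minimal sketch of that sweep, assuming scikit-learn and using synthetic placeholder data in place of the paper's K-means bag-of-words histograms (neither the library choice nor the data pipeline appears in this commit):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data; the paper instead uses histograms over a
# 100-center K-means visual vocabulary.
X, y = make_classification(n_samples=500, n_features=100, random_state=0)

for max_features in (1, 2, 5, int(np.sqrt(X.shape[1])), 25, 50, 100):
    forest = RandomForestClassifier(
        n_estimators=100,           # forest size kept static at 100 trees
        criterion="gini",           # split quality scored with the Gini index
        max_features=max_features,  # features compared at each split
        random_state=0,
    )
    acc = cross_val_score(forest, X, y, cv=5).mean()
    print(f"max_features={max_features:3d}  accuracy={acc:.3f}")

Smaller `max_features` values build faster, more decorrelated trees, at the cost of weaker individual splits; the square root of the feature count is the usual compromise between the two.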