-rw-r--r-- | report/paper.md | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/report/paper.md b/report/paper.md
index 07ced70..d5d5d37 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -50,7 +50,7 @@ for K-means 100 cluster centers.
 \end{center}
 \end{figure}
 
-Random forests will select a random number of features on which to apply a weak learner (such as axis aligned split) and then select the best feature of the selected ones to perform the split on, based on some criteria (our results use the *Gini index*). The fewer features that are compared for each split the quicker the trees are built and the more random they are. Therefore the randomness parameter can be considered the number of features used when making splits. We evaluate accuracy given different randomness when using a K-means vocabulary in figure \ref{fig:kmeanrandom}. The results in figure \ref{fig:kmeanrandom} keep the number forest size static at 100 trees. We use forest size of 100 as we found that this estimatator count the improvement for $\sqrt{n}$ performance gains tend to plateau.
+Random forests will select a random number of features on which to apply a weak learner (such as axis aligned split) and then select the best feature of the selected ones to perform the split on, based on some criteria (our results use the *Gini index*). The fewer features that are compared for each split the quicker the trees are built and the more random they are. Therefore the randomness parameter can be considered the number of features used when making splits. We evaluate accuracy given different randomness when using a K-means vocabulary in figure \ref{fig:kmeanrandom}. The results in the figure use a forest size of 100 as we found that this estimatator count the improvement for $\sqrt{n}$ performance gains tend to plateau.
 
 \begin{figure}[H]
 \begin{center}
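
Reviewer note: the split-selection step described in the changed paragraph can be sketched as below. This is a minimal pure-Python illustration, not the report's implementation, which this diff does not show. The names `gini`, `split_score`, `best_random_split` and `max_features` are hypothetical, and the exhaustive threshold search is an assumption made for brevity. Each split compares only `max_features` randomly chosen features and keeps the axis-aligned split with the lowest weighted Gini impurity, so a smaller `max_features` gives cheaper and more random trees.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_score(X, y, feature, threshold):
    """Weighted Gini impurity of the two children of an axis-aligned split."""
    left = [label for row, label in zip(X, y) if row[feature] <= threshold]
    right = [label for row, label in zip(X, y) if row[feature] > threshold]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_random_split(X, y, max_features, rng):
    """Compare only `max_features` randomly chosen features; return the best
    (score, feature, threshold) triple among them (hypothetical helper)."""
    candidates = rng.sample(range(len(X[0])), max_features)
    best = None
    for feature in candidates:
        # Assumption: try every observed value of the feature as a threshold.
        for threshold in sorted({row[feature] for row in X}):
            score = split_score(X, y, feature, threshold)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


# Toy data: feature 0 separates the classes, feature 1 is constant.
X = [[0, 5], [1, 5], [2, 5], [3, 5]]
y = [0, 0, 1, 1]
score, feature, threshold = best_random_split(X, y, max_features=2, rng=random.Random(0))
```

With `max_features` equal to the total number of features, every split is the greedy optimum. Lowering it (for example towards the $\sqrt{n}$ setting the paragraph mentions) restricts each split to fewer candidates, which is the "randomness" being varied in the figure.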