 report/paper.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 4c146c8..9c7e5bc 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -6,7 +6,9 @@ A common technique for codebook generation involves utilising K-means clustering
image descriptors. In this way descriptors may be mapped to *visual* words which lend themselves to
binning and therefore the creation of bag-of-words histograms for the use of classification.
-In this courseworok 100-thousand random SIFT descriptors of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+In this coursework 100,000 random SIFT descriptors (128-dimensional) of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+
+Both training and testing use 15 randomly selected images from the 10 available classes.
## Vocabulary size
@@ -168,7 +170,7 @@ and in many cases the increase in training time would not justify the minimum in
For Caltech_101 RF-codebook seems to be the most suitable method to perform RF-classification.
It is observable that for the particular dataset we are analysing the class *water_lilly*
-is the one that gets misclassified the most, both in K-means and RF codebooks (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}. This means that the features obtained
+is the one that gets misclassified the most, both in the K-means and RF codebooks (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}). This means that the features obtained
from this class do not guarantee very discriminative splits, hence the first splits in the trees
will prioritize features taken from other classes.
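The patch describes the codebook pipeline (random SIFT descriptors clustered with K-means, then images binned into bag-of-words histograms). The following minimal sketch illustrates that pipeline; it is not part of the patch, and the library choices (OpenCV, scikit-learn) and parameter values such as `vocab_size=256` are assumptions rather than the coursework's actual settings.

```python
# Illustrative sketch of the K-means visual vocabulary / bag-of-words pipeline.
# OpenCV and scikit-learn are assumed; parameters are placeholders.
import numpy as np
import cv2
from sklearn.cluster import KMeans

def sample_descriptors(image_paths, n_samples=100_000, seed=0):
    """Collect 128-D SIFT descriptors from the images and keep a random subset."""
    sift = cv2.SIFT_create()
    pool = []
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(img, None)
        if desc is not None:
            pool.append(desc)
    pool = np.vstack(pool)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pool), size=min(n_samples, len(pool)), replace=False)
    return pool[idx]

def build_vocabulary(descriptors, vocab_size=256):
    """Cluster descriptors with K-means; the cluster centres act as visual words."""
    return KMeans(n_clusters=vocab_size, n_init=4, random_state=0).fit(descriptors)

def bow_histogram(descriptors, kmeans):
    """Assign each descriptor to its nearest visual word and bin the counts."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / hist.sum()  # normalise so images with different keypoint counts are comparable
```

In this sketch each image is reduced to a normalised histogram over the visual words, which is the feature vector later fed to the random-forest classifier discussed in the report.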