From 2503cfe68a47131c364edec96ac66dd00ef928f7 Mon Sep 17 00:00:00 2001
From: nunzip
Date: Tue, 12 Feb 2019 20:49:28 +0000
Subject: Small corrections

---
 report/paper.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/report/paper.md b/report/paper.md
index 4ab5924..f7d623a 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -6,7 +6,9 @@
 A common technique for codebook generation involves utilising K-means clustering
 image descriptors. In this way descriptors may be mapped to *visual* words which lend
 themselves to binning and therefore the creation of bag-of-words histograms for the use of classification.
 
-In this courseworok 100-thousand random SIFT descriptors of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+In this coursework 100-thousand random SIFT descriptors (size=128) of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+
+Both training and testing use 15 randomly selected images from the 10 available classes.
 
 ## Vocabulary size
 
@@ -172,7 +174,7 @@
 and in many cases the increase in training time would not justify the minimum increase
 in classification performance.
 For Caltech_101 RF-codebook seems to be the most suitable method to perform RF-classification.
 It is observable that for the particular dataset we are analysing the class *water_lilly*
-is the one that gets misclassified the most, both in k-means and RF-codebook (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}. This means that the features obtained
+is the one that gets misclassified the most, both in k-means and RF-codebook (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}). This means that the features obtained
 from this class do not guarantee very discriminative splits, hence the first splits
 in the trees will prioritize features taken from other classes.
-- 
cgit v1.2.3
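
For context on the technique the amended sentence describes — clustering SIFT descriptors into a K-means visual vocabulary and binning an image's descriptors into a bag-of-words histogram — here is a minimal sketch. All names are hypothetical, plain NumPy stands in for the coursework's actual SIFT/K-means pipeline, and the toy data is far smaller than the 100-thousand 128-dimensional descriptors mentioned in the patch:

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Toy K-means: cluster descriptors into k visual words (cluster centres)."""
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest centre (Euclidean distance)
        d = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update each centre as the mean of its assigned descriptors
        for j in range(k):
            if np.any(labels == j):
                centres[j] = descriptors[labels == j].mean(axis=0)
    return centres

def bow_histogram(image_descriptors, centres):
    """Map an image's descriptors to visual words and bin them into a
    normalised bag-of-words histogram."""
    d = np.linalg.norm(image_descriptors[:, None, :] - centres[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centres)).astype(float)
    return hist / hist.sum()

# stand-in data: random vectors in place of real SIFT descriptors (size=128)
rng = np.random.default_rng(1)
descs = rng.normal(size=(1000, 128))
vocab = build_vocabulary(descs, k=32)
h = bow_histogram(rng.normal(size=(50, 128)), vocab)
print(vocab.shape, h.shape)  # (32, 128) (32,)
```

The resulting per-image histograms are what the report's classifiers (k-means codebook vs. RF-codebook) consume; only the vocabulary-building step differs between the two approaches.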