From 2503cfe68a47131c364edec96ac66dd00ef928f7 Mon Sep 17 00:00:00 2001
From: nunzip
Date: Tue, 12 Feb 2019 20:49:28 +0000
Subject: Small corrections

---
 report/paper.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/report/paper.md b/report/paper.md
index 4ab5924..f7d623a 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -6,7 +6,9 @@ A common technique for codebook generation involves utilising K-means clustering
 image descriptors. In this way descriptors may be mapped to *visual* words which lend themselves
 to binning and therefore the creation of bag-of-words histograms for the use of classification.
 
-In this courseworok 100-thousand random SIFT descriptors of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+In this coursework 100,000 random 128-dimensional SIFT descriptors from the Caltech_101 dataset are used to build the K-means visual vocabulary.
+
+Both training and testing use 15 randomly selected images from each of the 10 available classes.
 
 ## Vocabulary size
 
@@ -172,7 +174,7 @@ and in many cases the increase in training time would not justify the minimum in
 
 For Caltech_101 RF-codebook seems to be the most suitable method to perform RF-classification.
 It is observable that for the particular dataset we are analysing the class *water_lilly*
-is the one that gets misclassified the most, both in k-means and RF-codebook (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}. This means that the features obtained
+is the one that gets misclassified the most, both in k-means and RF-codebook (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}). This means that the features obtained
 from this class do not guarantee very discriminative splits, hence the first splits in the trees
 will prioritize features taken from other classes.
--
cgit v1.2.3-54-g00ecf
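
For reference, a minimal sketch of the codebook pipeline the patched paragraph describes: K-means over randomly sampled 128-dimensional SIFT descriptors, then per-image bag-of-words histograms over the resulting visual words. The scikit-learn usage, the vocabulary size `k`, and the synthetic stand-in descriptors are illustrative assumptions, not the coursework's actual code.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for real SIFT descriptors: the coursework samples 100,000
# 128-dimensional descriptors from Caltech_101 images.
rng = np.random.default_rng(0)
descriptors = rng.random((100_000, 128)).astype(np.float32)

# Build the visual vocabulary (codebook) with K-means.
k = 256  # vocabulary size -- a tunable assumption, not a value from the paper
codebook = KMeans(n_clusters=k, n_init=1, random_state=0).fit(descriptors)

def bow_histogram(image_descriptors: np.ndarray) -> np.ndarray:
    """Map one image's descriptors to visual words and bin them
    into a normalised bag-of-words histogram."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=k).astype(np.float64)
    return hist / hist.sum()

# Example: histogram for one image with ~300 detected keypoints.
example = bow_histogram(rng.random((300, 128)).astype(np.float32))
```

The vocabulary size `k` is the parameter the paper's "Vocabulary size" section explores: a larger vocabulary yields finer-grained histograms at the cost of heavier quantization and clustering time.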
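Likewise, a hedged sketch of how the per-class misclassification claim (that *water_lilly* is the weakest class in figures \ref{fig:km_cm} and \ref{fig:p3_cm}) can be read off a confusion matrix; the synthetic labels and accuracy level below are hypothetical, only the 10-class, 15-images-per-split setup comes from the patch.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical predictions: 10 classes, 15 test images each, matching
# the split described in the patch; the error rate here is made up.
rng = np.random.default_rng(0)
y_true = np.repeat(np.arange(10), 15)
y_pred = np.where(rng.random(150) < 0.7, y_true, rng.integers(0, 10, size=150))

cm = confusion_matrix(y_true, y_pred)

# Per-class recall = diagonal / row sums; the minimum marks the most
# misclassified class (water_lilly in the paper's figures).
recall = np.diag(cm) / cm.sum(axis=1)
print("most misclassified class index:", int(np.argmin(recall)))
```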