-rw-r--r-- | report/paper.md | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/report/paper.md b/report/paper.md
index 4c146c8..9c7e5bc 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -6,7 +6,9 @@ A common technique for codebook generation involves utilising K-means clustering
 image descriptors. In this way descriptors may be mapped to *visual* words which lend themselves to binning and therefore the creation of bag-of-words histograms for the use of classification.

-In this courseworok 100-thousand random SIFT descriptors of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+In this courseworok 100-thousand random SIFT descriptors (size=128) of the Caltech_101 dataset are used to build the K-means visual vocabulary.
+
+Both training and testing use 15 randomly selected images from the 10 available classess.

 ## Vocabulary size

@@ -168,7 +170,7 @@ and in many cases the increase in training time would not justify the minimum in
 For Caltech_101 RF-codebook seems to be the most suitable method to perform RF-classification. It is observable that for the particular dataset we are analysing the class *water_lilly*
-is the one that gets misclassified the most, both in K-means and RF codebooks (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}. This means that the features obtained
+is the one that gets misclassified the most, both in k-means and RF-codebook (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}). This means that the features obtained
 from this class do not guarantee very discriminative splits, hence the first splits in the trees will prioritize features taken from other classes.