author    Vasil Zlatanov <v@skozl.com>    2019-02-15 17:41:43 +0000
committer Vasil Zlatanov <v@skozl.com>    2019-02-15 17:41:43 +0000
commit    404d54d233e6d1b3616a9a38a9421a0f06513be3 (patch)
tree      2893f7cb5260c281eb8e42674968b8082fd2f4e7
parent    4f9214360fffadd86fb767f3b2322d657567851d (diff)
Small changes
-rw-r--r--  report/paper.md  8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index e44444b..6c7c0ed 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -4,7 +4,7 @@ A common technique for codebook generation involves utilising K-means clustering
image descriptors. In this way descriptors may be mapped to *visual* words, which lend themselves to
binning and therefore to the creation of bag-of-words histograms for use in classification.
-In this coursework 100,000 random SIFT descriptors (each with 128 dimensions) from the Caltech_101 dataset are used to build the K-means visual vocabulary.
+In this coursework 100,000 random SIFT descriptors (each with 128 dimensions) from the `Caltech_101` dataset are used to build the K-means visual vocabulary.
Both training and testing use 15 randomly selected images from the 10 available classes.
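
A rough sketch of this vocabulary-building step is given below (illustrative only; it assumes `scikit-learn` and a pre-computed descriptor array saved as `sift_descriptors.npy`, neither of which is part of this report):

```python
import numpy as np
from sklearn.cluster import KMeans

# Pooled SIFT descriptors from the training images, shape (M, 128).
# The file name is illustrative; any (M, 128) array works here.
all_descriptors = np.load("sift_descriptors.npy")

# Subsample 100,000 descriptors to keep the clustering tractable.
rng = np.random.default_rng(0)
sample = all_descriptors[rng.choice(len(all_descriptors), size=100_000, replace=False)]

# The K centroids of the fitted model form the visual vocabulary (codebook).
kmeans = KMeans(n_clusters=100, n_init=1, random_state=0).fit(sample)
codebook = kmeans.cluster_centers_   # shape (100, 128)
```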
@@ -14,7 +14,7 @@ The number of clusters or the number of centroids determines the vocabulary size
## Bag-of-words histogram of descriptor vectors
-An example of histograms for training and testing images is shown in figure \ref{fig:histo_tr}, computed with a vocabulary size of 100. The histograms of the same class appear to have comparable magnitudes for their respective keywords, demonstrating that they have a similar number of descriptors mapping to each of the clusters. The effect of the vocabulary size (as determined by the number of K-means centroids) on the classification accuracy is shown in figure \ref{fig:km_vocsize}. We find that a small vocabulary size tends to misrepresent the information contained in the different patches, resulting in poor classification accuracy. Conversely, a large vocabulary size (many K-means centroids) may lead to overfitting. In our tests, we begin to observe a plateau after a cluster count of 60 in figure \ref{fig:km_vocsize}. The process of partitioning the input space into K distinct clusters is a form of **vector quantisation**.
+An example of histograms for training and testing images is shown in figure \ref{fig:histo_tr}, computed with a vocabulary size of 100. The histograms of the same class appear to have comparable magnitudes for their respective keywords, demonstrating that they have a similar number of descriptors mapping to each of the clusters. The effect of the vocabulary size (as determined by the number of K-means centroids) on the classification accuracy is shown in figure \ref{fig:km_vocsize}. We find that a small vocabulary size tends to misrepresent the information contained in the different patches, resulting in poor classification accuracy. Conversely, a large vocabulary size (many K-means centroids) may lead to overfitting. In our tests, we begin to observe a plateau after a cluster count of 60 in figure \ref{fig:km_vocsize}. The process of partitioning the input space into K distinct clusters is a form of **vector quantization**.
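
A minimal sketch of this quantization step is shown below (hypothetical helper, reusing the `kmeans` model fitted in the earlier snippet):

```python
import numpy as np

def bow_histogram(descriptors, kmeans):
    """Assign each descriptor of one image to its nearest centroid and bin the counts."""
    words = kmeans.predict(descriptors)                    # nearest visual word per descriptor
    hist, _ = np.histogram(words, bins=np.arange(kmeans.n_clusters + 1))
    return hist / max(hist.sum(), 1)                       # normalised bag-of-words histogram
```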
\begin{figure}
\begin{center}
@@ -26,7 +26,7 @@ An example of histograms for training and testing images is shown on figure \ref
\end{figure}
-The time complexity of quantisation with a K-means codebook is $O(DNK)$, where $N$ is the number of entities to be clustered (descriptors), $D$ is their dimension and $K$ is the cluster count [@km-complexity]. As the computation time is high, the tests use a subsample of descriptors to compute the centroids (a random selection of 100,000 descriptors). An alternative method we tried is applying PCA to the descriptor vectors to improve time performance. However, the descriptor dimension of 128 is relatively small and as such we found PCA to be unnecessary.
+The time complexity of quantization with a K-means codebook is $O(DNK)$, where $N$ is the number of entities to be clustered (descriptors), $D$ is their dimension and $K$ is the cluster count [@km-complexity]. As the computation time is high, the tests use a subsample of descriptors to compute the centroids (a random selection of 100,000 descriptors). An alternative method we tried is applying PCA to the descriptor vectors to improve time performance. However, the descriptor dimension of 128 is relatively small and as such we found PCA to be unnecessary.
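
For reference, the PCA variant we experimented with amounts to something like the following sketch (the choice of 64 components is illustrative):

```python
from sklearn.decomposition import PCA

# Project the 128-D SIFT descriptors onto a lower-dimensional subspace before clustering.
# With D = 128 the reduction in clustering time was marginal, so this step was dropped.
pca = PCA(n_components=64)
reduced_sample = pca.fit_transform(sample)   # 'sample' is the descriptor subsample from above
```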
K-means is a process that converges to local optima and depends heavily on the initialization values of the centroids.
Initializing K-means is an expensive process, based on sequential attempts at centroid placement. Running multiple initializations significantly increases the computational cost, with execution time growing linearly in the number of runs. We did not observe an increase in accuracy with more than one K-means initialization, and therefore present accuracy and execution time results for a single K-means initialization.
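
A simple way to illustrate this trade-off (sketch only, reusing the subsampled `sample` array from above) is to time the fit for different numbers of initializations:

```python
import time
from sklearn.cluster import KMeans

for n_init in (1, 5, 10):
    start = time.perf_counter()
    KMeans(n_clusters=100, n_init=n_init, random_state=0).fit(sample)
    # Each initialization runs a full K-means pass, so execution time grows
    # roughly linearly with n_init; downstream accuracy was unchanged in our tests.
    print(f"n_init={n_init}: {time.perf_counter() - start:.1f} s")
```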
@@ -146,7 +146,7 @@ As discussed in section I, due to the initialization process for optimal centroi
descriptor counts (and in the absence of dimensionality-reduction methods).
In many applications the increase in training time would not justify the small gain in classification performance.
-For the Caltech_101 dataset, an RF codebook seems to be the most suitable method for performing RF classification.
+For the `Caltech_101` dataset, an RF codebook seems to be the most suitable method for performing RF classification.
`water_lilly` is the most misclassified class for both the K-means and RF codebooks (refer to figures \ref{fig:km_cm} and \ref{fig:p3_cm}). This indicates that the features obtained from this class do not yield very discriminative splits, resulting in the prioritisation of other features in the first nodes of the decision trees.
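
The per-class behaviour described here can be read off the confusion matrices; a minimal sketch is given below (assuming `y_test`, `y_pred` and `class_names` arrays from the evaluation code, which are not shown in this report):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)
per_class_recall = cm.diagonal() / cm.sum(axis=1)   # fraction of each class predicted correctly
worst = int(np.argmin(per_class_recall))
print(f"Most misclassified class: {class_names[worst]} ({per_class_recall[worst]:.2f})")
```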