Properly reference

author: Vasil Zlatanov <v@skozl.com> 2019-02-12 20:13:27 +0000
committer: Vasil Zlatanov <v@skozl.com> 2019-02-12 20:13:27 +0000
commit: 288e28070d27c496d6ac4af5676734451f8430e9 (patch)
tree: 370a9c789aed36693f882e2652e1ebf45112d957
parent: 5ebf5cafe3e6b5ab711ddb3b95299f04c0314333 (diff)
download: e4-vision-288e28070d27c496d6ac4af5676734451f8430e9.tar.gz
e4-vision-288e28070d27c496d6ac4af5676734451f8430e9.tar.bz2
e4-vision-288e28070d27c496d6ac4af5676734451f8430e9.zip
1 files changed, 1 insertions, 1 deletions
diff --git a/report/paper.md b/report/paper.md
index af3f8d3..06d8357 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -16,7 +16,7 @@ The number of clusters or the number of centroids determines the vocabulary size
 
 An example histograms for training and testing images is shown on figure \ref{fig:histo_tr}, computed with a vocubulary size of 100. The histograms of the same class appear to have comparable magnitudes for their respective keywords, demonstrating they had a similar number of descriptors which mapped to each of the clusters. The effect of the vocubalary size (as determined by the number of K-means centroids) on the classificaiton accuracy is shown in figure \ref{fig:km_vocsize}. A small vocabulary size tends to misrepresent the information contained in the different patches, resulting in poor classification accuracy. Conversly a large vocabulary size (many K-mean centroids), may display overfitting. In our tests, we observe a plateau after a cluster count of 60 on figure \ref{fig:km_vocsize}.
 
-The time complexity of quantisation with a K-means codebooks is $O(DNK)$, where N is the number of entities to be clustered (descriptors), D is the dimension (of the descriptors) and K is the cluster count @cite[km-complexity]. As the computation time is high, the tests we use a subsample of descriptors to compute the centroids (a random selection of 100 thousand descriptors). An alternative method we tried is applying PCA to the descriptors vectors to improve time performance. However in this case the descriptors' size is relatively small, and for such reason we opted to avoid PCA for further training. 
+The time complexity of quantisation with a K-means codebooks is $O(DNK)$, where N is the number of entities to be clustered (descriptors), D is the dimension (of the descriptors) and K is the cluster count [@km-complexity]. As the computation time is high, the tests we use a subsample of descriptors to compute the centroids (a random selection of 100 thousand descriptors). An alternative method we tried is applying PCA to the descriptors vectors to improve time performance. However in this case the descriptors' size is relatively small, and for such reason we opted to avoid PCA for further training. 
 
 K-means is a process that converges to local optima and heavilly depends on the initialization values of the centroids.
 Initializing k-means is an expensive process, based on sequential attempts of centroids placement.
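
Note on the changed paragraph: the $O(DNK)$ cost and the descriptor subsampling it describes can be sketched as below. This is a minimal illustration, not the code used in the repository. The function names `assignment_cost` and `subsample` are hypothetical, and the dimension of 128 in the usage note is an assumed value that the paragraph does not state.

```python
import random


def assignment_cost(n_descriptors, dim, n_clusters):
    """Count the per-pass distance terms: each of the N descriptors is
    compared with each of the K centroids across D dimensions."""
    return dim * n_descriptors * n_clusters


def subsample(descriptors, max_count, seed=0):
    """Return at most max_count descriptors, drawn uniformly at random
    without replacement. Hypothetical helper, seeded for repeatability."""
    descriptors = list(descriptors)
    if len(descriptors) <= max_count:
        return descriptors
    rng = random.Random(seed)
    return rng.sample(descriptors, max_count)
```

For example, `assignment_cost(100_000, 128, 100)` counts the terms for one pass over a 100 thousand descriptor subsample with 100 clusters, assuming 128-dimensional descriptors. Since the cost is linear in N, fitting centroids on a subsample reduces the work in proportion to the fraction of descriptors kept.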