# Codebooks

## K-means codebook

A common technique for codebook generation is K-means clustering on a sample of the image descriptors. Each descriptor is mapped to its nearest cluster centre, so the centroids act as *visual* words: descriptors can be binned against them, which in turn allows bag-of-words histograms to be built for classification. In this coursework, 100,000 descriptors sampled from the Caltech dataset were used to build the visual vocabulary.

## Vocabulary size

The vocabulary size is determined by the number of clusters, i.e. the number of K-means centroids.

## Bag-of-words histograms of example training/testing images

Figures \ref{fig:histo_tr} and \ref{fig:histo_te} show the bag-of-words histograms of an example training image and an example testing image respectively.

\begin{figure}[H]
\begin{center}
\includegraphics[height=4em]{fig/hist_test.jpg}
\includegraphics[width=20em]{fig/km-histogram.pdf}
\caption{Bag-of-words histogram of a training image}
\label{fig:histo_tr}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[height=4em]{fig/hist_train.jpg}
\includegraphics[width=20em]{fig/km-histtest.pdf}
\caption{Bag-of-words histogram of a testing image}
\label{fig:histo_te}
\end{center}
\end{figure}

## Vector quantisation process

Each image descriptor is assigned to its nearest cluster centre; accumulating these assignments over all of an image's descriptors and normalising the counts yields the image's bag-of-words histogram. A minimal sketch of the whole pipeline is given below.
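The following sketch is illustrative only, not the coursework implementation: it uses scikit-learn, the helper names `build_codebook` and `bow_histogram` are ours, and random stand-in data replaces the real 128-dimensional SIFT descriptors.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def build_codebook(train_descriptors, vocab_size=100):
    """Cluster sampled descriptors; the centroids become the visual words."""
    return MiniBatchKMeans(n_clusters=vocab_size, random_state=0).fit(train_descriptors)

def bow_histogram(image_descriptors, codebook):
    """Vector-quantise one image's descriptors into a normalised
    bag-of-words histogram over the visual vocabulary."""
    words = codebook.predict(image_descriptors)  # nearest centroid per descriptor
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()

# Stand-in data: 100,000 sampled descriptors, then one image's descriptors.
rng = np.random.default_rng(0)
codebook = build_codebook(rng.normal(size=(100_000, 128)))
histogram = bow_histogram(rng.normal(size=(300, 128)), codebook)
```

Normalising the histogram keeps images with different numbers of descriptors comparable; `vocab_size` is the vocabulary size discussed above.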
# RF classifier

## Hyperparameter tuning

Figure \ref{fig:km-tree-param} shows the effect of tree depth and of the number of trees for a K-means vocabulary of 100 cluster centres.

\begin{figure}[H]
\begin{center}
\includegraphics[width=12em]{fig/error_depth_kmean100.pdf}
\includegraphics[width=12em]{fig/trees_kmean.pdf}
\caption{Classification error as a function of tree depth (left) and number of trees (right)}
\label{fig:km-tree-param}
\end{center}
\end{figure}

Figure \ref{fig:kmeanrandom} shows the effect of the randomness parameter for the same 100-word vocabulary.

\begin{figure}[H]
\begin{center}
\includegraphics[width=18em]{fig/new_kmean_random.pdf}
\caption{Effect of the randomness parameter on classification error (K-means, 100 clusters)}
\label{fig:kmeanrandom}
\end{center}
\end{figure}

## Weak learners comparison

Figure \ref{fig:2pt} shows that the two-pixel test improves recognition accuracy by 1% over its axis-aligned counterpart. The two-pixel test, however, brings a slight decrease in time performance, measured at **3 seconds more on average**, due to the extra complexity it introduces: the test adds one dimension to the computation.

\begin{figure}[H]
\begin{center}
\includegraphics[width=18em]{fig/2pixels_kmean.pdf}
\caption{K-means classification accuracy for different types of weak learner}
\label{fig:2pt}
\end{center}
\end{figure}

## Impact of the vocabulary size on classification accuracy

\begin{figure}[H]
\begin{center}
\includegraphics[width=12em]{fig/kmeans_vocsize.pdf}
\includegraphics[width=12em]{fig/time_kmeans.pdf}
\caption{Effect of vocabulary size: classification error (left) and run time (right)}
\label{fig:km_vocsize}
\end{center}
\end{figure}

## Confusion matrix for case e100k256d5, with examples of failure and success

\begin{figure}[H]
\begin{center}
\includegraphics[width=18em]{fig/e100k256d5_cm.pdf}
\caption{K-means confusion matrix (configuration e100k256d5)}
\label{fig:km_cm}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[width=10em]{fig/success_km.pdf}
\includegraphics[width=10em]{fig/fail_km.pdf}
\caption{K-means: example of success (left) and failure (right)}
\label{fig:km_succ}
\end{center}
\end{figure}

# RF codebook

Here the K-means quantiser of Q1 is replaced with a random-forest codebook: a random forest is fitted to the 128-dimensional descriptor vectors together with their image category labels, and its leaves are used as the visual vocabulary. The bag-of-words representations obtained with this codebook are then used to train and test a random forest classifier as in Q2. Different parameters of the RF codebook and of the RF classifier are explored below, and the results are compared with those of Q2, including the vector quantisation complexity (a sketch contrasting the cost of the two quantisers is given in the comparison section at the end).

\begin{figure}[H]
\begin{center}
\includegraphics[width=18em]{fig/256t1_e200D5_cm.pdf}
\caption{Part 3 confusion matrix (configuration e100k256d5)}
\label{fig:p3_cm}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[width=10em]{fig/success_3.pdf}
\includegraphics[width=10em]{fig/fail_3.pdf}
\caption{Part 3: example of success (left) and failure (right)}
\label{fig:p3_succ}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[width=12em]{fig/error_depth_p3.pdf}
\includegraphics[width=12em]{fig/trees_p3.pdf}
\caption{Classification error as a function of tree depth (left) and number of trees (right)}
\label{fig:p3_trees}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[width=18em]{fig/p3_rand.pdf}
\caption{Effect of the randomness parameter on classification error}
\label{fig:p3_rand}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[width=12em]{fig/p3_vocsize.pdf}
\includegraphics[width=12em]{fig/p3_time.pdf}
\caption{Effect of vocabulary size: classification error (left) and run time (right)}
\label{fig:p3_voc}
\end{center}
\end{figure}

\begin{figure}[H]
\begin{center}
\includegraphics[width=18em]{fig/p3_colormap.pdf}
\caption{Effect of the number of leaves and estimators on accuracy}
\label{fig:p3_colormap}
\end{center}
\end{figure}

# Comparison of methods and conclusions
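As a starting point for the comparison, the sketch below contrasts the vector quantisation step of the two codebooks. It is a minimal illustration under stated assumptions, not the coursework code: random stand-in data replaces the SIFT descriptors, the parameter values are placeholders, and `rf_bow` is our own helper. It relies on `RandomForestClassifier.apply`, which returns the leaf reached in each tree, so every descriptor contributes one visual word per tree. Per descriptor, exact K-means assignment costs O(K·D) distance computations for K centroids in D dimensions, while the RF codebook needs only O(T·d) threshold comparisons for T trees of depth d, which is why the forest quantiser tends to scale better to large vocabularies.

```python
import time
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(5000, 128))  # stand-in for sampled SIFT descriptors
labels = rng.integers(0, 10, size=5000)     # stand-in image-category labels
query = rng.normal(size=(300, 128))         # descriptors of one query image

# K-means codebook: the visual words are the cluster centres.
kmeans = MiniBatchKMeans(n_clusters=256, random_state=0).fit(descriptors)

# RF codebook: the visual words are the leaves of each tree.
forest = RandomForestClassifier(n_estimators=10, max_depth=5,
                                random_state=0).fit(descriptors, labels)

def rf_bow(image_descriptors, forest):
    """Concatenate per-tree leaf histograms into one bag-of-words vector."""
    leaves = forest.apply(image_descriptors)  # (n_descriptors, n_trees) leaf indices
    hists = [np.bincount(leaves[:, t], minlength=est.tree_.node_count)
             for t, est in enumerate(forest.estimators_)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

# Time the quantisation of one image under each codebook.
t0 = time.perf_counter(); kmeans.predict(query); t_km = time.perf_counter() - t0
t0 = time.perf_counter(); rf_bow(query, forest); t_rf = time.perf_counter() - t0
print(f"K-means VQ: {t_km:.4f}s   RF-codebook VQ: {t_rf:.4f}s")
```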