# Summary

In this report we analyse how distance metric learning affects classification accuracy on the CUHK03 dataset. The baseline method used for classification is Nearest Neighbors based on Euclidean distance. The improved approach we propose combines Jaccard and Mahalanobis distances to obtain a ranklist that also takes reciprocal neighbors into account. This approach is computationally more expensive, since the distance matrices are effectively calculated twice. However, it yields a significant accuracy improvement of around 10% for the $@rank1$ case. Accuracy improves overall, especially for $@rankn$ cases with low $n$.

# Formulation of the Addressed Machine Learning Problem

## CUHK03

The CUHK03 dataset contains 14096 pictures of people captured by two different cameras. The feature vectors are obtained by passing the rescaled images through ResNet50. Each feature vector contains 2048 features, which we use for classification. The pictures represent 1467 different people, and each of them appears between 9 and 10 times.

The separation into train_idx, query_idx and gallery_idx allows training and validation to be performed on a training set (train_idx, suitably split into test, train and validation subsets while keeping the same number of identities). This prevents overfitting the algorithm to the specific data associated with query_idx and gallery_idx.

## Problem to solve

The problem is to create a ranklist for each image of the query set by finding its nearest neighbor(s) within a gallery set. However, gallery images that have the same label and were taken by the same camera as the query image should not be considered when forming the ranklist.

## Nearest Neighbor ranklist

Nearest Neighbor aims to find the gallery image whose features are closest to those of a query image, predicting the class of the query image to be that of its nearest neighbor(s).
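
The ranklist construction described above can be sketched as follows. This is a minimal illustration and not the code used for the reported results: the function name `euclidean_ranklist`, its arguments and the array layout are assumptions made for the example, and Euclidean distance is used because it is the baseline metric named in the summary. Gallery images sharing both label and camera with the query are given infinite distance, so they fall to the end of the ordering.

```python
import numpy as np

def euclidean_ranklist(query_feats, gallery_feats,
                       query_labels, gallery_labels,
                       query_cams, gallery_cams, k=10):
    """Return, for each query, the indices of its k nearest gallery images.

    Hypothetical helper: names and signature are illustrative only.
    Features are arrays of shape (n_query, d) and (n_gallery, d).
    """
    # Squared Euclidean distances via ||q||^2 + ||g||^2 - 2 q.g,
    # which avoids building an (n_query, n_gallery, d) array.
    q2 = (query_feats ** 2).sum(axis=1)[:, None]
    g2 = (gallery_feats ** 2).sum(axis=1)[None, :]
    sq = q2 + g2 - 2.0 * (query_feats @ gallery_feats.T)
    # Clip tiny negative values caused by floating-point rounding.
    dist = np.sqrt(np.maximum(sq, 0.0))

    # Exclude gallery images with the same label AND the same camera
    # as the query by pushing them to infinite distance.
    same_label = query_labels[:, None] == gallery_labels[None, :]
    same_cam = query_cams[:, None] == gallery_cams[None, :]
    dist[same_label & same_cam] = np.inf

    # Sort each row by increasing distance and keep the first k indices.
    # Note: if k exceeds the number of valid gallery images for a query,
    # excluded images reappear at the tail of its ranklist.
    return np.argsort(dist, axis=1, kind="stable")[:, :k]
```

Calling the function with `k=1` gives the plain Nearest Neighbor prediction; larger `k` gives the longer ranklists needed for rank-based evaluation.
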
The distance between images can be calculated with different distance metrics; one of the most commonly used is the Euclidean distance, $d(x,y)=\sqrt{\sum_{i} (x_{i}-y_{i})^{2}}$, where $x$ and $y$ are the feature vectors of the two images.

k-Nearest Neighbors (kNN) extends this idea: instead of only the single closest gallery image, the $k$ closest gallery images are retrieved for each query and ordered by increasing distance.

# Baseline Evaluation

\begin{figure}
\begin{center}
\includegraphics[width=17em]{fig/baseline.pdf}
\caption{Recognition accuracy of baseline Nearest Neighbor @rank k}
\label{fig:baselineacc}
\end{center}
\end{figure}

# Suggested Improvement

# Conclusion

# References

# Appendix
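
As a supplementary sketch, the @rank k recognition accuracy plotted for the baseline could be computed from a ranklist as shown below. It assumes the common re-identification convention that a query counts as correctly recognised at rank k if at least one of its first k ranklist entries shares its label; the report does not state its exact scoring rule, so this convention, the function name `rank_k_accuracy` and the array layout are all assumptions.

```python
import numpy as np

def rank_k_accuracy(ranklist, query_labels, gallery_labels, k):
    """Fraction of queries with a correct label among their top-k entries.

    Hypothetical helper: `ranklist` is an integer array of shape
    (n_query, >= k) holding gallery indices ordered by increasing distance.
    """
    # Labels of the first k gallery images in each query's ranklist.
    top_k_labels = gallery_labels[ranklist[:, :k]]
    # A query is a hit if any of those labels equals its own label.
    hits = (top_k_labels == query_labels[:, None]).any(axis=1)
    return float(hits.mean())
```

Evaluating this function for increasing values of `k` would give one accuracy value per rank, which is the kind of curve a rank-accuracy plot displays.
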