 report/paper.md | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/report/paper.md b/report/paper.md
index 2a9add2..cba59bb 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -6,7 +6,7 @@ The person re-identification problem presented in this paper requires matching
 pedestrian images from disjoint cameras by pedestrian detectors. This problem
 is challenging, as identities captured in photos are subject to various
 lighting, pose, blur, background and occlusion from various camera views. This report considers
-features extracted from the CUHK03 dataset, following a 50 layer Residual network
+features extracted from the CUHK03 dataset, following a 50 layer Residual Network
 (ResNet-50). Different distance metric techniques can be used to perform person
 re-identification across the *disjoint* cameras.
@@ -31,11 +31,11 @@ gallery sets, with the knowledge that we are not comparing features produced
 by a neural network which was specifically (over-)fitted on them, as they were
 extracted based on the model derived from the training set.
 
-## Nearest Neighbor rank-list
+## Nearest Neighbour rank-list
 
-Nearest Neighbor can be used to find gallery images with features close to
+Nearest Neighbour can be used to find gallery images with features close to
 those of a query image, predicting the class or identity of the query image as
-the same of its nearest neighbor(s), based on distance.
+the same as its nearest neighbour(s), based on distance.
 The distance between images can be calculated through different distance
 metrics. The most commonly used is Euclidean distance:
@@ -55,7 +55,7 @@ classification and identification technique which is effective with good feature
 \begin{figure}
 \begin{center}
 \includegraphics[width=20em]{fig/baseline.pdf}
-\caption{Top k identification accuracy of baseline Nearest Neighbor}
+\caption{Top k identification accuracy of baseline Nearest Neighbour}
 \label{fig:baselineacc}
 \end{center}
 \end{figure}
@@ -119,7 +119,7 @@ improve identification accuracy, we consider it an additional baseline.
 Mahalanobis distance is an extremely useful measure of distance between a point
 (such as a query image feature vector) and a distribution (such as the gallery),
 which accounts for how many standard deviations away the point is from the mean
 of the distribution. It is unitless and scale invariant, in a way standardising the data.
-We used Mahalanobis distance as an altenrative metric to Euclidean distance:
+We used Mahalanobis distance as an alternative metric to Euclidean distance:
 $$ d_M(p,g_i) = (p-g_i)^TM(p-g_i). $$
@@ -155,7 +155,7 @@ introduced in reference @rerank-paper.
 We define $N(p,k)$ as the top $k$ elements of the rank-list generated through
 NN, where $p$ is a query image. The $k$-reciprocal rank-list, $R(p,k)$, is
 defined as the intersection $R(p,k)=\{g_i|(g_i \in N(p,k))\land(p \in N(g_i,k))\}$. Adding
-$\frac{1}{2}k$ reciprocal nearest neighbors of each element in the rank-list
+$\frac{1}{2}k$ reciprocal nearest neighbours of each element in the rank-list
 $R(p,k)$, it is possible to form a more reliable set
 $R^*(p,k) \longleftarrow R(p,k) \cup R(q,\frac{1}{2}k)$
 that aims to overcome the problem of query and gallery images being affected by factors such
@@ -165,10 +165,10 @@ recalculate the distance between query and gallery images.
 A Jaccard metric of the $k$-reciprocal sets is used to calculate the distance
 between $p$ and $g_i$ as:
 $$d_J(p,g_i)=1-\frac{|R^*(p,k)\cap R^*(g_i,k)|}{|R^*(p,k)\cup R^*(g_i,k)|}.$$
-However, since the neighbors of the query $p$ are close to $g_i$ as well,
+However, since the neighbours of the query $p$ are close to $g_i$ as well,
 they would be more likely to be identified as true positives. This
 implies the need for a more discriminative method, which is achieved by
-encoding the $k$-reciprocal neighbors into an $N$-dimensional vector as a function
+encoding the $k$-reciprocal neighbours into an $N$-dimensional vector as a function
 of the original distance (in our case squared Euclidean, $d(p,g_i) = \|p-g_i\|^2$)
 through the Gaussian kernel:
@@ -184,11 +184,11 @@ Through this transformation it is possible to reformulate the distance with the
 $$ d_J(p,g_i)=1-\frac{\sum\limits_{j=1}^N \min(V_{p,g_j},V_{g_i,g_j})}{\sum\limits_{j=1}^N \max(V_{p,g_j},V_{g_i,g_j})}. $$
-It is then possible to perform a local query expansion using the neighbors $g_i$ of $p$,
+It is then possible to perform a local query expansion using the neighbours $g_i$ of $p$,
 defined as:
 $$ V_p=\frac{1}{|N(p,k_2)|}\sum\limits_{g_i\in N(p,k_2)}V_{g_i}. $$
-We refer to $k_2$ since it allows restricting the size of the neighborhood to prevent noise
-from the $k_2$ neighbors. The dimension $k$ of the $R^*$ set will instead
+We refer to $k_2$ since it allows restricting the size of the neighbourhood to prevent noise
+from the $k_2$ neighbours. The dimension $k$ of the $R^*$ set will instead
 be defined as $k_1$: $R^*(g_i,k_1)$.
 The distances obtained are then combined, obtaining a final
 distance $d^*(p,g_i)$ that is used to obtain the
@@ -202,7 +202,7 @@ This is done using a multi-direction search algorithm to estimate $k_{1_{opt}}$
 and an exhaustive search for $\lambda$ from $\lambda = 0$ (exclusively Jaccard
 distance) to $\lambda = 1$ (only original distance) in steps of 0.1.
 The results obtained through this approach suggest: $k_{1_{opt}}=9, k_{2_{opt}}=3, 0.1\leq\lambda_{opt}\leq 0.3$.
-To verify optimisation of $k_{1_{opt}}$, $k_{2_{opt}}$ heat plots were performed heat on
+To verify the optimisation of $k_{1_{opt}}$ and $k_{2_{opt}}$, heat plots are shown in
 figures \ref{fig:pqvals}, showing that the optimal values obtained from training
 are close to the ones for the local maximum of gallery and query. The $\lambda$
 search is plotted on figure \ref{fig:lambda}.
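For readers skimming the diff, the nearest-neighbour baseline the text describes reduces to sorting gallery images by distance to each query feature vector. Below is a minimal NumPy sketch of that rank-list and of the top-k identification accuracy plotted in `fig/baseline.pdf`; it is illustrative only, and the function and array names are our assumptions, not code from this repository.

```python
import numpy as np

def rank_gallery(query_feats, gallery_feats):
    """Rank-list per query: gallery indices sorted by squared Euclidean
    distance to the query's feature vector (e.g. ResNet-50 features).

    query_feats: (Q, D) array; gallery_feats: (G, D) array.
    Returns a (Q, G) array of gallery indices, nearest first.
    """
    diff = query_feats[:, None, :] - gallery_feats[None, :, :]   # (Q, G, D)
    dists = np.einsum('qgd,qgd->qg', diff, diff)                 # ||p - g_i||^2
    return np.argsort(dists, axis=1)

def top_k_accuracy(rank_lists, query_labels, gallery_labels, k):
    """Top-k identification accuracy: a query counts as correct if any of
    its k nearest gallery images shares its identity label."""
    top_k = gallery_labels[rank_lists[:, :k]]            # (Q, k) predicted identities
    return (top_k == query_labels[:, None]).any(axis=1).mean()
```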
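The Mahalanobis hunk replaces the Euclidean metric with $d_M(p,g_i) = (p-g_i)^T M (p-g_i)$. A sketch of one common instantiation follows, with $M$ taken as the (regularised) inverse covariance of the gallery features; the report may estimate or learn $M$ differently.

```python
import numpy as np

def mahalanobis_distances(query, gallery_feats):
    """d_M(p, g_i) = (p - g_i)^T M (p - g_i) for every gallery image.

    query: (D,) feature vector; gallery_feats: (G, D) feature matrix.
    M is estimated here as the inverse gallery covariance, with a small
    ridge term so high-dimensional covariances stay invertible.
    """
    cov = np.cov(gallery_feats, rowvar=False)            # (D, D)
    M = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    diff = gallery_feats - query                         # rows are g_i - p; sign cancels
    return np.einsum('gd,de,ge->g', diff, M, diff)       # one quadratic form per row
```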
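The re-ranking hunks summarise the $k$-reciprocal method of @rerank-paper: build $R^*(p,k_1)$, encode it with the Gaussian kernel, apply local query expansion with $k_2$, and blend the Jaccard and original distances with $\lambda$. The sketch below follows those definitions as stated in the paper text, assuming a single precomputed pairwise distance matrix over the union of query and gallery images; the reference implementation adds refinements (such as an overlap test during set expansion) that are omitted here.

```python
import numpy as np

def k_reciprocal_rerank(dist, k1=9, k2=3, lam=0.3):
    """Sketch of k-reciprocal re-ranking over an (N, N) pairwise
    squared-distance matrix `dist`. Returns the combined distance
    d* = (1 - lam) * d_J + lam * d, so lam = 0 is pure Jaccard and
    lam = 1 is the original distance, as in the paper text."""
    N = dist.shape[0]
    ranks = np.argsort(dist, axis=1)  # row i holds N(i, k) for any prefix k

    def R(i, k):
        # k-reciprocal set: j is in i's top-k AND i is in j's top-k.
        return np.array([j for j in ranks[i, :k] if i in ranks[j, :k]])

    # R*(i, k1): union of R(i, k1) with R(j, k1/2) for each member j.
    R_star = []
    for i in range(N):
        Ri = R(i, k1)
        expanded = set(Ri.tolist())
        for j in Ri:
            expanded |= set(R(j, max(1, k1 // 2)).tolist())
        R_star.append(np.fromiter(expanded, dtype=int))

    # Gaussian-kernel encoding: V[i, j] = exp(-d(i, j)) for j in R*(i, k1).
    V = np.zeros((N, N))
    for i in range(N):
        V[i, R_star[i]] = np.exp(-dist[i, R_star[i]])

    # Local query expansion: average V over the k2 nearest neighbours
    # (N(i, k2) includes i itself, since d(i, i) = 0 ranks first).
    V = np.stack([V[ranks[i, :k2]].mean(axis=0) for i in range(N)])

    # Jaccard distance via the min/max reformulation, row by row.
    d_J = np.empty((N, N))
    for i in range(N):
        mins = np.minimum(V[i], V).sum(axis=1)
        maxs = np.maximum(V[i], V).sum(axis=1)
        d_J[i] = 1.0 - mins / np.maximum(maxs, 1e-12)

    return (1.0 - lam) * d_J + lam * dist
```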
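Finally, the last hunk describes estimating $k_{1_{opt}}$, $k_{2_{opt}}$ and sweeping $\lambda$ in steps of 0.1. A simplified, fully exhaustive version is sketched below, reusing `k_reciprocal_rerank` from the previous block; the report instead uses a multi-direction search for $k_1, k_2$, and `eval_top1` is an assumed callback that scores a re-ranked distance matrix on a validation split.

```python
import numpy as np
from itertools import product

def search_hyperparams(dist, eval_top1, k1_grid=range(5, 15), k2_grid=range(2, 6)):
    """Grid search over (k1, k2, lambda), with lambda swept from 0.0
    (pure Jaccard distance) to 1.0 (original distance) in steps of 0.1.
    Returns the best (k1, k2, lambda) triple and its top-1 accuracy."""
    best_params, best_acc = None, -1.0
    for k1, k2 in product(k1_grid, k2_grid):
        for lam in np.arange(0.0, 1.01, 0.1):
            acc = eval_top1(k_reciprocal_rerank(dist, k1, k2, lam))
            if acc > best_acc:
                best_params, best_acc = (k1, k2, round(float(lam), 1)), acc
    return best_params, best_acc
```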