author nunzip <np.scarh@gmail.com> 2018-12-10 17:45:53 +0000
committer nunzip <np.scarh@gmail.com> 2018-12-10 17:45:53 +0000
commit 30583c4ce19bc77e48810894b277857429fbc201
tree 4099d18333de6958ca87bef27f7e0bcbbf41d336
parent ed44a6f432cf9e1051edd58e146a54124345adcd
parent 2a5c62f9ea50971ba25c3e8f519e224093ec0090
Merge branch 'master' of git.skozl.com:e4-pattern
-rwxr-xr-x report2/metadata.yaml 2
-rwxr-xr-x report2/paper.md 52
-rwxr-xr-x test-k.sh 13
3 files changed, 45 insertions, 22 deletions
diff --git a/report2/metadata.yaml b/report2/metadata.yaml
index f35d6aa..20375f9 100755
--- a/report2/metadata.yaml
+++ b/report2/metadata.yaml
@@ -10,7 +10,7 @@ abstract: |
This report analyses distance metrics learning techniques with regards to
identification accuracy for the dataset CUHK03. The baseline method used for
identification is Nearest Neighbors based on Euclidean distance.
- The improved approach we propose utilises Jaccardian metrics to rearrange the NN
+ The improved approach evaluated utilises Jaccardian metrics to rearrange the NN
ranklist based on reciprocal neighbours. While this approach is more complex and introduces new hyperparameters, significant accuracy improvements are observed:
approximately a 10% increase in Top-1 identifications, and good improvements in Top-$N$ accuracy for low $N$.
...
diff --git a/report2/paper.md b/report2/paper.md
index d9bfd10..2e0bb0a 100755
--- a/report2/paper.md
+++ b/report2/paper.md
@@ -1,4 +1,5 @@
# Summary
+
In this report we analysed how distance metrics learning affects classification
accuracy for the dataset CUHK03. The baseline method used for classification is
Nearest Neighbors based on Euclidean distance. The improved approach we propose
@@ -9,19 +10,28 @@ twice. However it is possible to observe a significant accuracy improvement of
around 10% for the $@rank1$ case. Accuracy improves overall, especially for
$@rankn$ cases with low n.
-# Formulation of the Addressed Machine Learning Problem
+The person re-identification problem presented in this paper requires matching
+pedestrian images obtained from disjoint cameras by pedestrian detectors. This problem
+is challenging, as identities captured in photos are subject to various
+lighting, pose, blur, background and occlusion from various camera views. This
+report considers features extracted from the CUHK03 dataset using a 50-layer
+Residual Network (ResNet50). This paper considers distance metric techniques which
+can be used to perform person re-identification across **disjoint** cameras, using
+these features.
## CUHK03
The dataset CUHK03 contains 14096 pictures of people captured from two
-different cameras. The feature vectors used come from passing the
-rescaled images through ResNet50. Each feature vector contains 2048
-features that we use for classification. The pictures represent 1467 different
-people and each of them appears between 9 and 10 times. The separation of
-train_idx, query_idx and gallery_idx allows to perform training and validation
-on a training set (train_idx, adequately split between test, train and
-validation keeping the same number of identities). This prevents overfitting
-the algorithm to the specific data associated with query_idx and gallery_idx.
+different cameras. The feature vectors used, extracted from a trained ResNet50 model,
+contain 2048 features that are used for identification.
+
+The pictures represent 1467 different identities, each of which appears 9 to 10
+times. Data is separated into train, query and gallery sets with `train_idx`,
+`query_idx` and `gallery_idx` respectively, where the training set has been used
+to develop the ResNet50 model used for feature extraction. This procedure has
+allowed the evaluation of distance metric learning techniques on the query and
+gallery sets without the feature extractor having overfit to them, as it was
+explicitly trained on the training set.
## Problem to solve
@@ -36,13 +46,21 @@ Nearest Neighbor aims to find the gallery image whose feature are the closest to
the ones of a query image, predicting the class of the query image as the same
as that of its nearest neighbor(s). The distance between images can be calculated through
different distance metrics, however one of the most commonly used is Euclidean
-distance, represented as $d=\sqrt{\sum (x-y)^{2}}$.
+distance:
+
+$$ \textrm{NN}(x) = \operatorname*{argmin}_{i\in[m]} \|x-x_i\|^2 $$
-EXPLAIN KNN BRIEFLY
+*The square root is omitted when calculating Euclidean distance, as it does not
+affect ranking by distance.*
+Alternative distance metrics such as Jaccardian and Mahalanobis exist, which can
+be used in place of Euclidean distance.
# Baseline Evaluation
+To evaluate improvements brought by alternative distance metric learning techniques, a baseline
+is established through nearest neighbour identification as previously described.
+
\begin{figure}
\begin{center}
\includegraphics[width=20em]{fig/baseline.pdf}
@@ -62,6 +80,11 @@ EXPLAIN KNN BRIEFLY
# Suggested Improvement
+## k-Means Clustering
+
+
+## k-reciprocal Reranking
+
\begin{figure}
\begin{center}
\includegraphics[width=24em]{fig/ranklist.png}
@@ -150,6 +173,13 @@ EXPLAIN KNN BRIEFLY
\end{center}
\end{figure}
+# Comment on Mahalanobis Distance as a metric
+
+We were not able to achieve significant improvements using Mahalanobis for
+original distance ranking compared to the squared Euclidean metric. Results can
+be observed using the `-m|--mahalanobis` flag when running evaluation with the
+repository complementing this paper.
+
# Conclusion
# References
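
The nearest-neighbour rule added to `report2/paper.md` in this commit ranks gallery items by squared Euclidean distance and notes that dropping the square root leaves the ranking unchanged. A minimal sketch of that claim follows. It is not the repository's `evaluate.py`; the function names `squared_euclidean` and `rank_gallery` and the toy 2-dimensional vectors are assumptions made purely for illustration (the real features have 2048 dimensions).

```python
import math


def squared_euclidean(x, y):
    """Squared Euclidean distance ||x - y||^2 between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rank_gallery(query, gallery):
    """Return gallery indices ordered from nearest to farthest (hypothetical helper)."""
    distances = [squared_euclidean(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: distances[i])


# Toy data standing in for the feature vectors; values are illustrative only.
gallery = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
query = [0.9, 1.2]

ranking = rank_gallery(query, gallery)

# Ranking with the true Euclidean distance (square root applied) for comparison.
ranking_sqrt = sorted(
    range(len(gallery)),
    key=lambda i: math.sqrt(squared_euclidean(query, gallery[i])),
)

nearest = ranking[0]  # index of the argmin, i.e. NN(x) in the formula
```

Because the square root is monotonically increasing on non-negative values, `ranking` and `ranking_sqrt` contain the same order, which is why the cheaper squared form is sufficient for ranking.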
diff --git a/test-k.sh b/test-k.sh
index 9001047..3713127 100755
--- a/test-k.sh
+++ b/test-k.sh
@@ -1,17 +1,10 @@
#!/bin/bash
-for p in $(seq 3 2 31);do
- for q in $(seq 1 2 5);do
- ((python3 evaluate.py -p $p -q $q -r;echo "p: $p | q: $q";) >> ~/pq-vals.txt) &
+for l in $(seq 0 5 11);do
+ for q in $(seq 1 1 5);do
+ ((python3 evaluate.py -l 0.$((q+l)) -M 10 -r -p 9 -q 5;echo "l: 0.$((q+l)) ";) >> ~/l-vals.txt) &
pid=$!
done
while kill -0 $pid;do
sleep 5
done
- for q in $(seq 7 2 11);do
- ((python3 evaluate.py -p $p -q $q -r;echo "p: $p | q: $q";) >> ~/pq-vals.txt) &
- pid=$!
- done
- while ps -p $pid > /dev/null;do
- sleep 5
done
-done
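
The new loop in `test-k.sh` builds the `-l` argument as the string `0.$((q+l))`. As a sketch for checking which values that produces, the snippet below reproduces the same string construction in Python. The helper name `lambda_arguments` is hypothetical, and how `evaluate.py` parses `-l` is not shown in this diff, so the float conversion is an assumption.

```python
def lambda_arguments():
    """Rebuild the -l strings produced by the two nested seq loops in test-k.sh."""
    args = []
    for l in range(0, 12, 5):      # seq 0 5 11 yields 0, 5, 10
        for q in range(1, 6):      # seq 1 1 5 yields 1, 2, 3, 4, 5
            args.append(f"0.{q + l}")
    return args


args = lambda_arguments()

# Assumption: evaluate.py reads -l as a float.
values = [float(a) for a in args]

# Distinct numeric values actually covered by the sweep.
distinct = sorted(set(values))
```

Under that assumption, the string `0.10` parses to the same number as `0.1`, and `0.11` to `0.15` fall between 0.1 and 0.2 rather than above 0.9, so the sweep covers 14 distinct values with a maximum of 0.9. Whether that spacing is intended is not stated in the commit.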