authorVasil Zlatanov <v@skozl.com>2019-02-22 01:04:09 +0000
committerVasil Zlatanov <v@skozl.com>2019-02-22 01:04:09 +0000
commit45047c8f6573e115278e23d4b21943be34f3dd50 (patch)
tree0c0fccea08d5a1d247e11cb837818b56929fdfc0
parent92ea2898811d9e40b68f75a22a5da32b1c38de78 (diff)
downloade3-deep-45047c8f6573e115278e23d4b21943be34f3dd50.tar.gz
e3-deep-45047c8f6573e115278e23d4b21943be34f3dd50.tar.bz2
e3-deep-45047c8f6573e115278e23d4b21943be34f3dd50.zip
Start writing interim report
-rw-r--r--report/bibliography.bib51
-rw-r--r--report/build/cw2_vz215_np1915.pdfbin169304 -> 266964 bytes
-rw-r--r--report/fig/allbaselines.pdfbin15002 -> 0 bytes
-rw-r--r--report/fig/baseline.pdfbin15560 -> 0 bytes
-rw-r--r--report/fig/cdist.pdfbin10783 -> 0 bytes
-rw-r--r--report/fig/clusteracc.pdfbin14898 -> 0 bytes
-rw-r--r--report/fig/comparison.pdfbin14877 -> 0 bytes
-rw-r--r--report/fig/eucranklist.pngbin3137716 -> 0 bytes
-rw-r--r--report/fig/jaccard.pdfbin12026 -> 0 bytes
-rw-r--r--report/fig/kmeanacc.pdfbin13948 -> 0 bytes
-rw-r--r--report/fig/lambda_acc.pdfbin13808 -> 0 bytes
-rw-r--r--report/fig/lambda_acc_tr.pdfbin13822 -> 0 bytes
-rw-r--r--report/fig/mAP.pdfbin14023 -> 0 bytes
-rw-r--r--report/fig/mahalanobis.pdfbin41611 -> 0 bytes
-rw-r--r--report/fig/pqvals.pdfbin14356 -> 0 bytes
-rw-r--r--report/fig/ranklist.pngbin2836148 -> 0 bytes
-rw-r--r--report/fig/rerank.pdfbin375482 -> 0 bytes
-rw-r--r--report/fig/subspace.pdfbin10339 -> 0 bytes
-rw-r--r--report/fig/train_subspace.pdfbin10108 -> 0 bytes
-rw-r--r--report/fig/trainpqvals.pdfbin14355 -> 0 bytes
-rw-r--r--report/paper.md50
21 files changed, 54 insertions, 47 deletions
diff --git a/report/bibliography.bib b/report/bibliography.bib
index 8890439..0617fac 100644
--- a/report/bibliography.bib
+++ b/report/bibliography.bib
@@ -1,42 +1,13 @@
-@article{rerank-paper,
- author = {Zhun Zhong and
- Liang Zheng and
- Donglin Cao and
- Shaozi Li},
- title = {Re-ranking Person Re-identification with k-reciprocal Encoding},
- journal = {CoRR},
- volume = {abs/1701.08398},
- year = {2017},
- url = {http://arxiv.org/abs/1701.08398},
- archivePrefix = {arXiv},
- eprint = {1701.08398},
- timestamp = {Mon, 13 Aug 2018 16:47:43 +0200},
- biburl = {https://dblp.org/rec/bib/journals/corr/ZhongZCL17},
- bibsource = {dblp computer science bibliography, https://dblp.org}
-}
-
-@article{mAP,
- author = {Jonathan Hui},
- title = {mAP (mean Average Precision) for Object Detection},
- year = {2018},
- url = {https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173},
-}
-
-@inproceedings{deepreid,
-title={DeepReID: Deep Filter Pairing Neural Network for Person Re-identification},
-author={Li, Wei and Zhao, Rui and Xiao, Tong and Wang, Xiaogang},
-booktitle={CVPR},
-year={2014}
-}
+@InProceedings{patches,
+author = {Vassileios Balntas and Karel Lenc and Andrea Vedaldi and Krystian Mikolajczyk},
+title = {HPatches: A benchmark and evaluation of handcrafted and learned local descriptors},
+booktitle = {CVPR},
+year = {2017}
+}
-@article{sklearn,
- title={Scikit-learn: Machine Learning in {P}ython},
- author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
- and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
- and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
- Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
- journal={Journal of Machine Learning Research},
- volume={12},
- pages={2825--2830},
- year={2011}
+@InProceedings{l2net,
+author = {Tian, Yurun and Fan, Bin and Wu, Fuchao},
+title = {L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space},
+booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+month = {July},
+year = {2017}
}
diff --git a/report/build/cw2_vz215_np1915.pdf b/report/build/cw2_vz215_np1915.pdf
index 89e349d..7eebb9d 100644
--- a/report/build/cw2_vz215_np1915.pdf
+++ b/report/build/cw2_vz215_np1915.pdf
Binary files differ
diff --git a/report/fig/allbaselines.pdf b/report/fig/allbaselines.pdf
deleted file mode 100644
index 15b1aff..0000000
--- a/report/fig/allbaselines.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/baseline.pdf b/report/fig/baseline.pdf
deleted file mode 100644
index b8ebbe4..0000000
--- a/report/fig/baseline.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/cdist.pdf b/report/fig/cdist.pdf
deleted file mode 100644
index 81d712c..0000000
--- a/report/fig/cdist.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/clusteracc.pdf b/report/fig/clusteracc.pdf
deleted file mode 100644
index 09459c4..0000000
--- a/report/fig/clusteracc.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/comparison.pdf b/report/fig/comparison.pdf
deleted file mode 100644
index e755c10..0000000
--- a/report/fig/comparison.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/eucranklist.png b/report/fig/eucranklist.png
deleted file mode 100644
index 3f03a73..0000000
--- a/report/fig/eucranklist.png
+++ /dev/null
Binary files differ
diff --git a/report/fig/jaccard.pdf b/report/fig/jaccard.pdf
deleted file mode 100644
index 53cf407..0000000
--- a/report/fig/jaccard.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/kmeanacc.pdf b/report/fig/kmeanacc.pdf
deleted file mode 100644
index 4b6e33b..0000000
--- a/report/fig/kmeanacc.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/lambda_acc.pdf b/report/fig/lambda_acc.pdf
deleted file mode 100644
index 9c3f749..0000000
--- a/report/fig/lambda_acc.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/lambda_acc_tr.pdf b/report/fig/lambda_acc_tr.pdf
deleted file mode 100644
index 8041977..0000000
--- a/report/fig/lambda_acc_tr.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/mAP.pdf b/report/fig/mAP.pdf
deleted file mode 100644
index 0cf7156..0000000
--- a/report/fig/mAP.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/mahalanobis.pdf b/report/fig/mahalanobis.pdf
deleted file mode 100644
index 2362cfa..0000000
--- a/report/fig/mahalanobis.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/pqvals.pdf b/report/fig/pqvals.pdf
deleted file mode 100644
index 43cccdb..0000000
--- a/report/fig/pqvals.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/ranklist.png b/report/fig/ranklist.png
deleted file mode 100644
index 1cb2647..0000000
--- a/report/fig/ranklist.png
+++ /dev/null
Binary files differ
diff --git a/report/fig/rerank.pdf b/report/fig/rerank.pdf
deleted file mode 100644
index a016189..0000000
--- a/report/fig/rerank.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/subspace.pdf b/report/fig/subspace.pdf
deleted file mode 100644
index 2ab1f5b..0000000
--- a/report/fig/subspace.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/train_subspace.pdf b/report/fig/train_subspace.pdf
deleted file mode 100644
index 5ccdd36..0000000
--- a/report/fig/train_subspace.pdf
+++ /dev/null
Binary files differ
diff --git a/report/fig/trainpqvals.pdf b/report/fig/trainpqvals.pdf
deleted file mode 100644
index 210fea4..0000000
--- a/report/fig/trainpqvals.pdf
+++ /dev/null
Binary files differ
diff --git a/report/paper.md b/report/paper.md
index 74b56e5..3b05cf4 100644
--- a/report/paper.md
+++ b/report/paper.md
@@ -1,11 +1,47 @@
-# Probelm Definition
+# Problem Definition
-# Denoise Network
+This coursework's goal is to develop an image representation for measuring similarity between patches from the `HPatches` dataset. The `HPatches` dataset contains patches sampled from image sequences, where each sequence contains images of the same scene. Sequences are separated into `i_X` sequences, which have undergone illumination changes, and `v_X` sequences, which have undergone viewpoint changes. For each image sequence there is a reference image with corresponding reference patches, and two more files `eX.png` and `hX.png` containing corresponding patches from the other images in the sequence with altered illumination or viewpoint. Corresponding patches are extracted by adding geometric noise: easy `e_X` patches have a small amount of jitter while `h_X` patches have more [@patches]. The patches as processed by our networks are monochrome 32 by 32 images.
-## Descriptor L2 Network
+## Tasks
-* Say where it is form
-* How it trians
-* How it performs
+The task is to train a network which, given a patch, is able to produce a descriptor vector with a dimension of 128. The descriptors are evaluated based on their performance across three tasks:
-# Performance and Evaluation
+* Retrieval: Use a given image's descriptor to find similar images in a large gallery
+* Matching: Use a given image's descriptor to find similar images in a small gallery containing difficult distractors
+* Verification: Given two images, use the descriptors to determine their similarity
+
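All three tasks reduce to comparing descriptor vectors by distance. The following numpy sketch illustrates this with randomly generated stand-in descriptors (the gallery size, threshold value, and index are arbitrary illustrative choices, not values from the coursework):

```python
import numpy as np

# Hypothetical 128-D descriptors standing in for real network output.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 128))             # descriptor gallery
query = gallery[42] + 0.01 * rng.normal(size=128)  # near-duplicate of entry 42

# Retrieval: rank the gallery by Euclidean distance to the query descriptor.
dists = np.linalg.norm(gallery - query, axis=1)
ranking = np.argsort(dists)

# Verification: declare a match when the distance falls below a threshold
# (the threshold here is illustrative; in practice it is tuned on validation data).
is_match = dists[42] < 1.0
```

Matching works the same way as retrieval, just over a small gallery where the distractors lie close to the query in descriptor space.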
+# Baseline Model
+
+The baseline model provided in the given IPython notebook approaches the problem by using two networks for the task.
+
+## Shallow U-Net
+
+A shallow version of the U-Net network is used to denoise the noisy patches. The shallow U-Net has the same output size as its input size; it is fed a noisy image and its loss is computed as the Euclidean distance to a clean reference patch. This effectively teaches the U-Net autoencoder to perform a denoising operation on the input images.
+
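A minimal numpy sketch of this pixel-level reconstruction loss (the function name is ours, and the exact reduction used by the notebook is assumed):

```python
import numpy as np

def pixel_loss(denoised, clean):
    """Mean absolute pixel error between the network output and the clean
    reference patch; the reported error metric is of this per-pixel form."""
    return np.mean(np.abs(denoised - clean))

# Toy batch of four 32x32 monochrome patches in [0, 255] grey levels.
clean = np.zeros((4, 32, 32))
output = clean + 5.0  # output off by 5 grey levels at every pixel
loss = pixel_loss(output, clean)
```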
+Efficient training can be performed with TPU acceleration, a batch size of 4096 and the Adam optimizer with a learning rate of 0.001, and is shown in figure \ref{fig:denoise}. Training and validation were performed with **all** available data.
+
+The network is able to achieve a mean average error of 5.3 after 19 epochs. With gradient descent we observed a loss of 5.5 after the same number of epochs. We do not observe evidence of overfitting, as may be expected with such a shallow network. An example of denoising as performed by the network is visible in figure \ref{fig:den3}.
+
+Quick experimentation with a deeper version of U-Net shows it is possible to achieve a validation loss below 5.0 after training for 10 epochs, and a loss equivalent to the shallow network's 5.3 is achievable after only 3 epochs.
+
+## L2 Net
+
+The network used to output the 128-dimension descriptors is an L2-Net trained with triplet loss, as defined in CVPR 17 [@l2net]. L2-Net was designed specifically for producing patch descriptors and is a very suitable choice for this task: it is a robust architecture which was developed with the HPatches dataset.
+
+Training of the L2-Net can be done on the noisy images, but it is beneficial to use the denoised images from the U-Net to improve performance. Training the L2-Net with denoised patches yields the training curves shown in figure \ref{fig:descriptor}.
+
+### Triplet Loss
+
+The loss used to train the siamese L2 Network:
+$\mathcal{L} = \textrm{max}(d(a,p) - d(a,n) + \textrm{margin}, 0)$
+
+There is an intrinsic problem that occurs as the loss approaches 0: training becomes more difficult since we are throwing away loss information, which prevents the network from progressing significantly past that point. Solutions may involve increasing the margin $\alpha$ or adopting a non-linear loss which avoids this truncation.
+
+# Performance & Evaluation
+
+
+# Appendix
+![U-Net Training with TPU](fig/denoise.pdf){\label{fig:denoise}}
+![L2-Net](fig/descriptor.pdf){\label{fig:descriptor}}
+
+![Denoise example - 20th epoch](fig/denoised.png){\label{fig:den3}}