# K-means codebook 

We randomly select 100k descriptors for K-means clustering when building the visual vocabulary (subsampling keeps the memory footprint manageable). Open `main_guideline.m` and select/load the dataset:

```matlab
>> [data_train, data_test] = getData('Caltech'); % Select dataset
```

Set `showImg = 0` in `getData.m` if you want to stop displaying the training and testing images. Complete `getData.m` by writing your own code to obtain the visual vocabulary and the bag-of-words histograms for both the training and testing data. Show, measure and discuss the following (a starter code sketch is included under the vector quantisation subsection):

## Vocabulary size 

## Bag-of-words histograms of example training/testing images

## Vector quantisation process
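
As a starting point, here is a minimal sketch of the vocabulary-building and quantisation steps, assuming `desc_tr` is the pooled matrix of 128-dimensional training descriptors and `desc_img` holds one image's descriptors; the 256-word vocabulary and all variable names are illustrative placeholders, not part of the provided skeleton:

```matlab
% Build the visual vocabulary by K-means on a random subset of descriptors.
numWords = 256;                              % illustrative vocabulary size
idx_sel  = randperm(size(desc_tr, 1), 1e5);  % 100k-descriptor subsample (memory)
[~, centres] = kmeans(desc_tr(idx_sel, :), numWords, 'MaxIter', 200);

% Vector quantisation: map each descriptor of an image to its nearest centre,
% then accumulate a normalised bag-of-words histogram over the visual words.
assign   = knnsearch(centres, desc_img);     % nearest visual word per descriptor
hist_bow = histcounts(assign, 1:numWords+1); % one bin per visual word
hist_bow = hist_bow / sum(hist_bow);         % L1-normalise across descriptor counts
```

The same quantisation is applied to every training and testing image to produce the bag-of-words representations used in Q2.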

# RF classifier 

Train and test a Random Forest using the training and testing sets in the bag-of-words form obtained in Q1. Change the RF parameters (the number of trees, the depth of the trees, the degree-of-randomness parameter, and the type of weak learner, e.g. axis-aligned or two-pixel tests), and show and discuss the results for the points listed below.
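
A minimal sketch of this stage, using the stock `TreeBagger` from the Statistics and Machine Learning Toolbox as a stand-in for a custom forest (it only provides axis-aligned splits, so two-pixel tests would need your own weak-learner code). `X_train`/`X_test` are assumed to be the Q1 bag-of-words matrices, `y_train`/`y_test` cell arrays of category labels, and the parameter values are illustrative starting points:

```matlab
% Random Forest classifier on the bag-of-words histograms from Q1.
numTrees = 100;                                   % illustrative forest size
tic;
rf = TreeBagger(numTrees, X_train, y_train, ...
    'Method', 'classification', ...
    'NumPredictorsToSample', round(sqrt(size(X_train, 2))), ... % degree of randomness
    'MinLeafSize', 5);                            % indirectly limits tree depth
t_train = toc;                                    % training time in seconds

tic;
y_pred = predict(rf, X_test);                     % cell array of predicted labels
t_test = toc;                                     % testing time in seconds

accuracy = mean(strcmp(y_pred, y_test));          % recognition accuracy
conf_mat = confusionmat(y_test, y_pred);          % confusion matrix
```

Sweeping `numTrees`, `MinLeafSize` and `NumPredictorsToSample` over a grid gives the parameter study the question asks for, with `t_train`/`t_test` supplying the time-efficiency measurements.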

## Recognition accuracy and confusion matrix

## Example successes and failures

## Time efficiency of training and testing

## Impact of vocabulary size on classification accuracy

# RF codebook

In Q1, replace K-means with a random-forest codebook: apply an RF to the 128-dimensional descriptor vectors together with their image category labels, and use the RF leaves as the visual vocabulary. With the bag-of-words representations of images obtained from the RF codebook, train and test a Random Forest classifier as in Q2. Try different parameters for the RF codebook and the RF classifier, and show/discuss the results in comparison with those of Q2, including the vector quantisation complexity.
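
Under the same assumptions as the earlier sketches, here is one way to sketch the RF-codebook construction and quantisation, again with `TreeBagger` standing in for a custom forest; `desc_sel`/`lbl_sel` are hypothetical names for the subsampled training descriptors and their image-category labels, and the forest and leaf sizes are illustrative:

```matlab
% RF codebook: fit a forest to descriptor vectors labelled with the category
% of the image they came from; the leaves then act as the visual vocabulary.
numTreesCb = 10;                                  % illustrative codebook forest
rfCb = TreeBagger(numTreesCb, desc_sel, lbl_sel, ...
    'Method', 'classification', 'MinLeafSize', 50);

% Quantise one image's descriptors: per tree, histogram the node index that
% each descriptor reaches (always a leaf), then concatenate across trees.
% Internal-node bins simply stay empty, so the histogram length is fixed.
hist_bow = [];
for t = 1:numTreesCb
    tree = rfCb.Trees{t};                         % CompactClassificationTree
    [~, ~, node] = predict(tree, desc_img);       % leaf index per descriptor
    hist_bow = [hist_bow, histcounts(node, 1:tree.NumNodes + 1)]; %#ok<AGROW>
end
hist_bow = hist_bow / sum(hist_bow);              % L1-normalise
```

Note the quantisation cost here: each descriptor needs only on the order of tree-depth threshold tests per tree, versus `numWords` full 128-dimensional distance computations for the K-means codebook, which is the complexity comparison the question asks you to discuss.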

# References