ImageCLEF 2013 Scalable Concept Image Annotation

Masatoshi Hidaka, Naoyuki Gunji and Tatsuya Harada

The feature vectors used in our experiments can be downloaded below.

downloads

Image feature extraction

We used the following libraries to extract the image features.
SIFT - VLFeat
C-SIFT - ColorDescriptor
LBP - Guoying Zhao's implementation
GIST - Aude Oliva's implementation
Fisher Vector - Yael

Format of the image features

The files FV_{SIFT,CSIFT,GIST,LBP}_64_256_code_{train,devel,test}.mat contain Fisher vectors encoded with Product Quantization (PQ). Each .mat file contains a matrix named 'code', in which each column corresponds to one image. The images are sorted in ASCII order of image ID, which is not the same as the order in train_iids.txt.
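
Because the column order differs from the order of the ID list, a lookup from image ID to column is usually needed. The following Python sketch shows one way to build it. It assumes that "ASCII order" means a byte-wise sort of the ID strings; the IDs and the helper name column_order are hypothetical and used only for illustration.

```python
def column_order(image_ids):
    """Map each image id to its zero-based column in the 'code' matrix,
    assuming columns follow the ASCII (byte-wise) sort of the ids."""
    ordered = sorted(image_ids, key=lambda s: s.encode("ascii"))
    return {iid: col for col, iid in enumerate(ordered)}

# Hypothetical ids in list order, as they might appear in train_iids.txt.
ids_in_list_order = ["b2", "a10", "B1", "a9"]
col = column_order(ids_in_list_order)

# Columns to read, given in the same order as the id list.
columns = [col[iid] for iid in ids_in_list_order]
```

Note that a byte-wise sort places uppercase letters before lowercase ones and "a10" before "a9". The indices here are zero-based; add 1 when using them in MATLAB.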

Because the vectors are PQ-encoded, they must be decoded before use. The files FV_{SIFT,CSIFT,GIST,LBP}_64_pq_codebook_256.mat contain the codebooks required for decoding. Each file contains a matrix named 'codebook', and decoding can be performed as follows (MATLAB code):

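d = size(codebook, 1);   % dimension of the decoded vector
G = d / size(code, 1);   % number of consecutive dimensions sharing one code
image_index = 1;         % 1 to 250000
% Repeat each code G times, then form linear indices into codebook
% (this assumes zero-based codes and a d-by-K codebook).
idx = double(reshape(code(:, image_index*ones(G,1))', d, 1))*d + (1:d)';
result = codebook(idx);  % look up the decoded value for each dimension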
The result is a 262144-dimensional vector.
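
For readers working outside MATLAB, the same decoding can be sketched in Python with NumPy. This is an illustration on tiny synthetic arrays, not the real data. It assumes that the linear index computed above selects one codebook entry per output dimension, that 'codebook' has shape (d, K), and that the codes are zero-based integers. The function name pq_decode is hypothetical. The real arrays could probably be read with scipy.io.loadmat, provided the files are not stored in the v7.3 format.

```python
import numpy as np

def pq_decode(code, codebook, image_index):
    """Decode one column of PQ codes into a d-dimensional vector.

    Assumes codebook has shape (d, K) and code has shape (d // G, N)
    with zero-based integer entries in [0, K).
    """
    d = codebook.shape[0]
    G = d // code.shape[0]  # consecutive dimensions sharing one code
    # Repeat each code G times so there is one code per output dimension.
    per_dim = np.repeat(code[:, image_index].astype(np.intp), G)
    # For output dimension r, pick entry per_dim[r] from row r of the codebook.
    return codebook[np.arange(d), per_dim]

# Tiny synthetic example (not the real data): d = 6, G = 2, K = 4, N = 3.
rng = np.random.default_rng(0)
codebook = rng.standard_normal((6, 4))
code = np.array([[0, 3, 1],
                 [2, 2, 0],
                 [1, 0, 3]], dtype=np.uint8)

# image_index is zero-based here, so 1 selects the second image.
vec = pq_decode(code, codebook, image_index=1)
```

Indexing the codebook with a (row, column) pair avoids building the linear index explicitly, which is what the `*d + (1:d)'` term does in the MATLAB version.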

Textual feature

labelfeat_fvorder_{devel,test}_n0fsh.txt.gz are gzipped text files that contain the label data for each image. Each column corresponds to a training image. Each row corresponds to a label, in the same order as in {devel,test}_concepts.txt in the dataset. A value of 1.0 means that the label is assigned to the image.
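
The following Python sketch shows how such a file might be read. It assumes that the values on each line are separated by whitespace, which is not stated above, and it writes a small stand-in file rather than using the real data. The function name load_labels and the file name labels_example.txt.gz are hypothetical.

```python
import gzip
import os
import tempfile

import numpy as np

def load_labels(path):
    """Read a gzipped text matrix; rows are labels, columns are images."""
    with gzip.open(path, "rt") as f:
        return np.loadtxt(f, ndmin=2)

# Write a tiny stand-in file (2 labels x 3 images); not the real data.
path = os.path.join(tempfile.mkdtemp(), "labels_example.txt.gz")
with gzip.open(path, "wt") as f:
    f.write("1.0 0.0 0.0\n0.0 1.0 1.0\n")

labels = load_labels(path)

# Boolean mask: True where the label (row) is assigned to the image (column).
assigned = labels == 1.0

# Zero-based columns of the images that carry the second label.
images_with_label_1 = np.flatnonzero(assigned[1])
```

For the full files, loading the whole matrix as floating-point values may use a lot of memory; reading line by line and keeping only the positions of the 1.0 entries is one alternative.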

Contact: hidaka (at) mi.t.u-tokyo.ac.jp