Comparison on the ILSVRC 2010 dataset (1.2M images) with SIFT+FV features. For each algorithm, the darker bar shows the accuracy with averaging and the lighter bar shows the accuracy without averaging, for reference.
Three guidelines for online learning for large-scale visual recognition
The perceptron can compete against the latest methods.
This holds provided that the second guideline is observed.
Averaging is necessary for any algorithm.
First-order algorithms without averaging cannot compete against second-order algorithms.
When averaging is used, the accuracies of all algorithms become very close to each other.
Averaging accelerates not only first-order algorithms but also second-order algorithms.
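As an illustration of what averaging means here, the following is a minimal sketch of a multiclass perceptron that also returns the average of all its weight iterates. It is not the implementation behind the comparison above; the function names, the toy data format, and the choice to accumulate the average after every step are assumptions made for this example.

```python
import random


def predict(w, x):
    """Return the index of the class whose weight vector scores x highest."""
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return max(range(len(scores)), key=scores.__getitem__)


def train_averaged_perceptron(samples, num_classes, dim, epochs=5, seed=0):
    """Train on (features, label) pairs.

    Returns (last_weights, averaged_weights). Hypothetical helper for
    illustration only; the source does not specify these details.
    """
    rng = random.Random(seed)
    w = [[0.0] * dim for _ in range(num_classes)]
    w_sum = [[0.0] * dim for _ in range(num_classes)]
    steps = 0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i]
            pred = predict(w, x)
            if pred != y:
                # Perceptron update: pull the true class toward x,
                # push the wrongly predicted class away from x.
                for j in range(dim):
                    w[y][j] += x[j]
                    w[pred][j] -= x[j]
            # Averaging: accumulate the current iterate after every step.
            for c in range(num_classes):
                for j in range(dim):
                    w_sum[c][j] += w[c][j]
            steps += 1
    w_avg = [[v / steps for v in row] for row in w_sum]
    return w, w_avg
```

At test time one would call `predict(w_avg, x)` instead of `predict(w, x)`. The same accumulation step can be attached to other online update rules, which is one way to read the claim that averaging applies to any algorithm.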
Investigate multiclass learning first.
Both one-versus-the-rest learning and multiclass learning achieve similar accuracy.
However, one-versus-the-rest learning takes much more CPU time to converge than multiclass learning does.
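To make the two settings concrete, the sketch below contrasts a single perceptron step under each scheme. It only illustrates the update rules; it does not reproduce the timing experiment, and the function names and the returned count of modified weight vectors are assumptions introduced for this example.

```python
def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def multiclass_update(w, x, y):
    """One multiclass perceptron step; returns how many weight vectors changed."""
    scores = [dot(row, x) for row in w]
    pred = max(range(len(w)), key=scores.__getitem__)
    if pred == y:
        return 0
    # Only the true class and the wrongly predicted class are touched.
    for j, xj in enumerate(x):
        w[y][j] += xj
        w[pred][j] -= xj
    return 2


def one_vs_rest_update(w, x, y):
    """One step of independent binary perceptrons, one per class."""
    changed = 0
    for c, row in enumerate(w):
        target = 1.0 if c == y else -1.0
        # Each binary classifier updates whenever its own margin is violated.
        if target * dot(row, x) <= 0.0:
            for j, xj in enumerate(x):
                row[j] += target * xj
            changed += 1
    return changed
```

In this sketch the multiclass step modifies at most two weight vectors per sample, while the one-versus-the-rest step may modify up to one per class. Whether that accounts for the CPU-time gap reported above is not something this example establishes.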