We present a novel dataset and a novel algorithm for recognizing activities of daily living (ADL) from a first-person wearable camera. Handled objects are crucially important for egocentric ADL recognition. To examine the objects involved in a user's actions separately from other objects in the environment, many previous works have addressed the detection of handled objects in images captured by head-mounted and chest-mounted cameras. Detecting handled objects is not always easy, however, because they tend to appear small in images and can be occluded by the user's body. In this work, we instead mount a camera on the user's wrist. A wrist-mounted camera can capture handled objects at large scale, which allows us to skip the object-detection process entirely. To compare wrist-mounted and head-mounted cameras, we also build a novel, publicly available dataset that includes videos and annotations of daily activities captured simultaneously by both cameras. Additionally, we propose a discriminative video representation that retains spatial and temporal information after encoding the frame descriptors extracted by convolutional neural networks (CNNs).
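The representation itself is detailed in the paper; purely as an illustrative sketch (the function name, the segment count, and the choice of segment-wise max pooling are assumptions here, not the authors' method), one simple way to keep coarse temporal order after CNN encoding is to pool per-frame descriptors within temporal segments and concatenate the segment vectors, rather than pooling over the whole clip:

```python
import numpy as np

def encode_video(frame_features, num_segments=3):
    """Pool per-frame CNN descriptors within temporal segments and
    concatenate the segment vectors, so the final encoding preserves
    coarse temporal order instead of averaging it away.

    frame_features: (T, D) array, one D-dim descriptor per frame.
    Returns: (num_segments * D,) video-level vector.
    """
    T, D = frame_features.shape
    bounds = np.linspace(0, T, num_segments + 1, dtype=int)
    segments = [frame_features[s:e].max(axis=0)   # max-pool each segment
                for s, e in zip(bounds[:-1], bounds[1:])]
    return np.concatenate(segments)

# Toy stand-in for CNN frame descriptors: 30 frames, 512-D each.
rng = np.random.default_rng(0)
feats = rng.standard_normal((30, 512))
video_vec = encode_video(feats, num_segments=3)
print(video_vec.shape)  # (1536,)
```

Because the segment vectors are concatenated in order, reversing the frame sequence yields a different encoding, which a single global pooling step would not distinguish.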
Katsunori Ohnishi, Atsushi Kanehira, Asako Kanezaki, Tatsuya Harada, "Recognizing Activities of Daily Living with a Wrist-mounted Camera," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016. Spotlight presentation (acceptance rate: 9.7%).
Contact: ohnishi (at) mi.t.u-tokyo.ac.jp, harada (at) mi.t.u-tokyo.ac.jp