
Yoshitaka Ushiku

Talk Slides

Keynotes

Frontiers of Vision and Language: Bridging Images and Texts by Deep Learning

News

12/20/2017

Pleased to announce that our paper, "Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture", written with Katsunori Ohnishi and Shohei Yamamoto is accepted as a oral presentation to AAAI Conference on Artificial Intelligence 2018! This paper addresses a challenging task: video generation from scratch. The proposed pipeline comprises two steps, the first step for generating a motion and second step for generating its texture, which enables more realistic videos than the existing methods.

12/20/2017

Pleased to announce that our paper, "Alternating Circulant Random Features for Semigroup Kernels", written with Yusuke Mukuta is accepted to AAAI Conference on Artificial Intelligence 2018! This paper Proposes a feature extraction for approximating Semi-group Kernels. The proposed method mitigates the problems that the existing methods utilizing sign-flipping do not suite for Laplacian features necessary for approximating Semi-group Kernels.

10/27/2017

Pleased to announce that our framework for fast deep learning inference on web browsers, WebDNN, developed with Masatoshi Hidaka and Yuichiro Kikura, received an Honorable Mention in the ACM Multimedia Open Source Software Competition! See the details of WebDNN on the project page on GitHub.

10/23/2017

Pleased to announce that our paper, "Spatio-temporal Person Retrieval via Natural Language Queries", written with Masataka Yamaguchi is accepted to International Conference on Computer Vision (ICCV) 2017! In this paper, we proposed a method to retrieve a 10 second video clip where "a girl in a blue dress playing hopscotch outside" with this natural language query from a video database more than 40 minutes.

5/13/2017

Pleased to announce that our paper, "Asymmetric Tri-training for Unsupervised Domain Adaptation", written with Kuniaki Saito is accepted to International Conference on Machine Learning (ICML) 2017! In this paper, we proposed a neural net architecture for Unsupervised Domain Adaptation, where all samples in target source is unlabeled. We introduce Tri-training [Zhou+Li, 2005], which is a semi-supervised learning method, to associate pseudo labels to unlabeled samples and achieve state-of-the-art performance on several tasks for Unsupervised Domain Adaptation.

2/28/2017

Pleased to announce that our paper, "DualNet: Domain-Invariant Network for Visual Question Answering", written with Kuniaki Saito and Andrew Shin is accepted as a oral presentation to International Conference on Multimedian and Expo (ICME) 2017! In this paper, we proposed a neural net architecture that won the first place at abstract image task in Visual Question Answering (VQA) Challenge, which is held in conjunction with CVPR 2016.

7/15/2016

Pleased to announce that our paper, "Image Captioning with Sentiment Terms via Weakly-Supervised Sentiment Dataset", written with Andrew Shin is accepted to British Machine Vision Conference (BMVC) 2016! In this paper, we proposed a method to generate not a normal caption but a caption with sentiment term. I am so happy since this paper is the first one that I wrote after my assignment to this university.

Moreover, we won first place in the abstract image task of the Visual Question Answering (VQA) Challenge, held in conjunction with CVPR 2016. This challenge compares the accuracy of VQA systems, which answer a natural language question about an associated image (in this track, an abstract illustration).

6/1/2016

Web page renewal.

4/1/2016

Excited to announce that I have been appointed as a Lecturer in the Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo. I will continue my research on cross-media understanding and generation at the Harada-Ushiku Laboratory.

Education

2009
Bachelor of Engineering (The University of Tokyo)
2011
Master of Information Science and Technology (The University of Tokyo)
2014
Ph.D. (The University of Tokyo)

Profession

Apr. 2013 - Mar. 2014
Research Fellow, Japan Society for the Promotion of Science
June 2013 - Aug. 2013
Intern, Microsoft Research Redmond
Apr. 2014 - Mar. 2016
Research Scientist, NTT Communication Science Laboratories
Apr. 2016 -
Lecturer, Department of Mechano-Informatics, Graduate School of Information Science and Technology, The University of Tokyo

Affiliation

Machine Intelligence Lab.
Dept. of Mechano-Informatics
Graduate School of Information Science and Technology, The University of Tokyo (Japan)

Contact

Papers

International Conference (refereed)

  1. Katsunori Ohnishi, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada. Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture. AAAI Conference on Artificial Intelligence (AAAI), 2018. (oral presentation)
  2. Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Alternating Circulant Random Features for Semigroup Kernels. AAAI Conference on Artificial Intelligence (AAAI), 2018.
  3. Masatoshi Hidaka, Yuichiro Kikura, Yoshitaka Ushiku, Tatsuya Harada. WebDNN: Fastest DNN Execution Framework on Web Browser. ACM International Conference on Multimedia (ACMMM), Open Source Software Competition, pp.1213-1216, 2017.
  4. Masataka Yamaguchi, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada. Spatio-temporal Person Retrieval via Natural Language Queries. IEEE International Conference on Computer Vision (ICCV), 2017.
  5. Qishen Ha, Kohei Watanabe, Takumi Karasawa, Yoshitaka Ushiku, Tatsuya Harada. MFNet: Towards Real-Time Semantic Segmentation for Autonomous Vehicles with Multi-Spectral Scenes. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
  6. Kuniaki Saito, Yoshitaka Ushiku, and Tatsuya Harada. Asymmetric Tri-training for Unsupervised Domain Adaptation. International Conference on Machine Learning (ICML), pp.2988-2997, 2017.
  7. Kuniaki Saito, Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. DualNet: Domain-Invariant Network for Visual Question Answering. IEEE International Conference on Multimedia and Expo (ICME), pp.829-834, 2017. (oral presentation)
  8. Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. Image Captioning with Sentiment Terms via Weakly-Supervised Sentiment Dataset. British Machine Vision Conference (BMVC), pp.53.1-53.12, 2016.
  9. Yoshitaka Ushiku, Masataka Yamaguchi, Yusuke Mukuta, and Tatsuya Harada. Common subspace for model and similarity: Phrase learning for caption generation from images. IEEE International Conference on Computer Vision (ICCV), pp.2668-2676, 2015. (acceptance rate: 30.9%)
  10. Yoshitaka Ushiku, Masatoshi Hidaka, and Tatsuya Harada. Three guidelines of online learning for large-scale visual recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3574-3581, 2014. (acceptance rate: 29.9%)
  11. Asako Kanezaki, Shogo Inaba, Yoshitaka Ushiku, Yukihiko Yamashita, Hiroaki Muraoka, Yasuo Kuniyoshi, and Tatsuya Harada. Hard negative classes for multiple object detection. IEEE International Conference on Robotics and Automation (ICRA), pp.3066-3073, 2014.
  12. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Efficient Image Annotation for Automatic Sentence Generation. ACM International Conference on Multimedia (ACMMM), pp.549-558, 2012. (full paper, acceptance rate: 20.2%)
  13. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Understanding Images with Natural Sentences. ACM International Conference on Multimedia (ACMMM), Multimedia Grand Challenge, pp.679-682, 2011. (Special Prize on the Best Application of a Theoretical Framework) [pdf]
  14. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Automatic Sentence Generation from Images. ACM International Conference on Multimedia (ACMMM), pp.1533-1536, 2011. (short, acceptance rate: usually 30%) [pdf]
  15. Tatsuya Harada, Yoshitaka Ushiku, Yuya Yamashita, and Yasuo Kuniyoshi. Discriminative Spatial Pyramid. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1617-1624, 2011. (acceptance rate: 26.4%) [pdf]
  16. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Improvement of Image Similarity Measures for Image Browsing and Retrieval Via Latent Space Learning between Images and Long Texts. IEEE International Conference on Image Processing (ICIP), pp.2365-2368, 2010. [pdf]

International Conference (unrefereed)

  1. Yoshitaka Ushiku, Hiroshi Muraoka, Sho Inaba, Teppei Fujisawa, Koki Yasumoto, Naoyuki Gunji, Takayuki Higuchi, Yuko Hara, Tatsuya Harada, and Yasuo Kuniyoshi. ISI at ImageCLEF 2012: Scalable System for Image Annotation. the 3rd Conference and Labs of the Evaluation Forum (CLEF 2012), pp.1-12, 2012.

Technical Report

  1. Kuniaki Saito, Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Deep Modality Invariant Adversarial Network for Shared Representation Learning. The 16th International Conference on Computer Vision Workshop on Transferring and Adapting Source Knowledge in Computer Vision (ICCV, Workshop), 2017.
  2. Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Spatial-Temporal Weighted Pyramid using Spatial Orthogonal Pooling. The 16th International Conference on Computer Vision Workshop on Compact and Efficient Feature Representation and Learning in Computer Vision (ICCV, Workshop), 2017.
  3. Takumi Karasawa, Kohei Watanabe, Qishen Ha, Antonio Tejero-De-Pablos, Yoshitaka Ushiku, Tatsuya Harada. Multispectral Object Detection for Autonomous Vehicles. The 25th Annual ACM International Conference on Multimedia (ACMMM), 2017 (workshop).
  4. Yoshitaka Ushiku, Hiroshi Muraoka, Sho Inaba, Teppei Fujisawa, Koki Yasumoto, Naoyuki Gunji, Takayuki Higuchi, Yuko Hara, Tatsuya Harada, and Yasuo Kuniyoshi. ISI at ImageCLEF 2012: Scalable System for Image Annotation. the 3rd Conference and Labs of the Evaluation Forum (CLEF 2012), pp.1-12, 2012.

Domestic Journal (refereed, In Japanese)

See the Japanese page for domestic papers.

Domestic Conference (refereed, In Japanese)

See the Japanese page for domestic papers.

Domestic Conference (unrefereed, In Japanese)

See the Japanese page for domestic papers.

Invited Talks

  1. Yoshitaka Ushiku. Frontiers of Vision and Language: Bridging Images and Texts by Deep Learning. Workshop on Machine Learning at the International Conference on Document Analysis and Recognition, 2017/11/11.
  2. Yoshitaka Ushiku. Recognize, Describe, and Generate: Introduction of Recent Work at MIL. GPU Technology Conference, 2017/05/11.
  3. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Efficient Image Annotation for Automatic Sentence Generation. Greater Tokyo Area Multimedia/Vision Workshop, 2012/08/30.

See the Japanese page for domestic talks.

Awards and Competitions

  1. 2017. Honorable Mention. ACM Multimedia Open Source Software Competition.
  2. 2016. First place in the abstract image task. Visual Question Answering Challenge 2016.
  3. 2012. First place in the fine-grained classification task, second place in the classification task. Large Scale Visual Recognition Challenge 2012 (ILSVRC2012).
  4. 2011. Special Prize on the Best Application of a Theoretical Framework. ACM Multimedia Grand Challenge.
  5. 2011. Third place in the classification task, second place in the detection task. Large Scale Visual Recognition Challenge 2011 (ILSVRC2011).
  6. 2010. Third place. Large Scale Visual Recognition Challenge 2010 (ILSVRC2010).

See the Japanese page for domestic awards.