
      Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework

      Proceedings of the AAAI Conference on Artificial Intelligence
      Association for the Advancement of Artificial Intelligence (AAAI)


          Abstract

          Joint video-language modeling has recently attracted growing attention. However, most existing approaches focus on exploring the language model on top of a fixed visual model. In this paper, we propose a unified framework that jointly models video and the corresponding text sentences. The framework consists of three parts: a compositional semantics language model, a deep video model, and a joint embedding model. In our language model, we propose a dependency-tree structure model that embeds a sentence into a continuous vector space, preserving visually grounded meanings and word order. In the visual model, we leverage deep neural networks to capture essential semantic information from videos. In the joint embedding model, we minimize the distance between the outputs of the deep video model and the compositional language model in the joint space, and update the two models jointly. Based on these three parts, our system can accomplish three tasks: 1) natural language generation, 2) video retrieval, and 3) language retrieval. In the experiments, our approach outperforms SVM, CRF, and CCA baselines in predicting Subject-Verb-Object triplets and generating natural sentences, and it outperforms CCA in the video retrieval and language retrieval tasks.
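
          The joint-space idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact objective or model details, so the squared Euclidean distance, the function names, and the toy 2-D vectors below are all assumptions made for illustration. The sketch only shows how paired video and sentence embeddings might be scored in a shared space, and how retrieval in either direction reduces to nearest-neighbor ranking.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def joint_embedding_loss(video_embs, text_embs):
    """Mean squared distance over paired (video, sentence) embeddings.

    Assumed stand-in for the distance the abstract says is minimized;
    the paper's actual loss may include further terms.
    """
    pairs = list(zip(video_embs, text_embs))
    return sum(squared_distance(v, t) for v, t in pairs) / len(pairs)


def retrieve(query, candidates):
    """Return candidate indices ordered from nearest to farthest."""
    return sorted(
        range(len(candidates)),
        key=lambda i: squared_distance(query, candidates[i]),
    )


# Hypothetical joint-space embeddings (made-up numbers, not real data).
video_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embs = [[0.9, 0.1], [0.1, 0.8], [1.2, 0.9]]

loss = joint_embedding_loss(video_embs, text_embs)

# Video retrieval: rank videos for a sentence query.
video_ranking = retrieve(text_embs[0], video_embs)

# Language retrieval: rank sentences for a video query.
sentence_ranking = retrieve(video_embs[1], text_embs)
```

          In a trained system, the two embedding lists would come from the deep video model and the compositional language model, and lowering the loss would pull each matching pair closer together in the joint space.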


          Author and article information

          Journal
          Proceedings of the AAAI Conference on Artificial Intelligence
          AAAI
          Association for the Advancement of Artificial Intelligence (AAAI)
          ISSN: 2374-3468, 2159-5399
          March 01 2015
          February 19 2015
          Volume: 29
          Issue: 1
          Article
          DOI: 10.1609/aaai.v29i1.9512
          © 2015