      Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model

Preprint


          Abstract

          We show how to deploy recurrent neural networks within a hierarchical Bayesian language model. Our generative story combines a standard RNN language model (generating the word tokens in each sentence) with an RNN-based spelling model (generating the letters in each word type). These two RNNs respectively capture sentence structure and word structure, and are kept separate as in linguistics. The model can generate spellings for novel words in context and thus serves as an open-vocabulary language model. For known words, embeddings are naturally inferred by combining evidence from type spelling and token context. We compare to a number of baselines and previous work, establishing state-of-the-art results.
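As a rough illustration of the two-level separation the abstract describes (not the authors' released code), the sketch below pairs a character-level RNN over word types with a word-level RNN over sentence tokens, linked through a per-type embedding. It assumes PyTorch; all class and parameter names (SpellingModel, TokenLM, type_dim, ...) are hypothetical:

```python
import torch
import torch.nn as nn

class SpellingModel(nn.Module):
    """Character-level RNN over word *types*: p(spelling | type embedding)."""
    def __init__(self, n_chars, char_dim=32, type_dim=64, hidden_dim=128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.rnn = nn.GRU(char_dim + type_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_chars)

    def forward(self, chars, type_vec):
        # chars: (batch, len) character ids; type_vec: (batch, type_dim)
        x = self.char_emb(chars)
        cond = type_vec.unsqueeze(1).expand(-1, x.size(1), -1)
        h, _ = self.rnn(torch.cat([x, cond], dim=-1))
        return self.out(h)  # logits over the next character

class TokenLM(nn.Module):
    """Word-level RNN over sentence *tokens*: p(next word | context)."""
    def __init__(self, vocab_size, word_dim=64, hidden_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.rnn = nn.GRU(word_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words):
        # words: (batch, len) token ids for a sentence prefix
        h, _ = self.rnn(self.word_emb(words))
        return self.out(h)  # logits over the next word
```

In the generative story the per-type embedding is the shared piece: the spelling model scores it once per type and the token model scores it at every occurrence, which is what lets evidence from spelling and context combine when inferring embeddings for known words, and lets the spelling model propose forms for novel words in context.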


Most cited references (12)


Neural Machine Translation of Rare Words with Subword Units (Sennrich et al., 2016)


A Bayesian framework for word segmentation: exploring the effects of context (Goldwater et al., 2009)

Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. Science, 274, 1926-1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words, in particular how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child-directed speech. We develop several models within a Bayesian ideal observer framework, and use them to examine the consequences of assuming either that words are independent units, or units that help to predict other units. We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus, with many two- and three-word sequences (e.g. what's that, do you, in the house) misidentified as individual words. In contrast, when the learner assumes that words are predictive, the resulting segmentation is far more accurate. These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful, and raise the possibility that even young infants may be able to exploit more subtle statistical patterns than have usually been considered.
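The undersegmentation effect is easy to see in miniature. Under a unigram ("words are independent units") model, a segmentation is scored as a product of word probabilities with no context, so a frequent collocation can outscore the sequence of its parts. A toy Python sketch with made-up numbers (the probabilities here are hypothetical, not from the paper):

```python
import math

# Hypothetical unigram probabilities; in the paper these would be inferred
# from transcribed child-directed speech under a Bayesian ideal observer.
unigram = {"whatsthat": 0.010, "whats": 0.020, "that": 0.030}

def log_prob(segmentation):
    # Independence assumption: the score of a segmentation is just the
    # product (sum of logs) of its word probabilities, ignoring context.
    return sum(math.log(unigram[w]) for w in segmentation)

print(log_prob(["whatsthat"]))      # ~ -4.61
print(log_prob(["whats", "that"]))  # ~ -7.42: the merged "word" wins
```

A predictive (bigram) learner can instead explain the co-occurrence through a transition probability between the two words, so it no longer needs to merge them, which is why the context-aware models segment far more accurately.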

Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (Ling et al., 2015)


                Author and article information

Published: 22 April 2018
Preprint: arXiv:1804.08205
Open Access: yes
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Subject: cs.CL (Computation and Language)
Category: Theoretical computer science
