Generative Pre-Trained Transformer for Design Concept Generation: An Exploration

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Novel concepts are essential for design innovation and can be generated with the aid of data stimuli and computers. However, current generative design algorithms focus on diagrammatic or spatial concepts that are either too abstract to understand or too detailed for early phase design exploration. This paper explores the uses of generative pre-trained transformers (GPT) for natural language design concept generation. Our experiments involve the use of GPT-2 and GPT-3 for different creative reasonings in design tasks. Both show reasonably good performance for verbal design concept generation.

Related collections

Most cited references 49

Record: found
Abstract: found
Article: not found

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar … (2017)

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data. 15 pages, 5 figures

0 comments Cited 3262 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: not found

Reducing the dimensionality of data with neural networks.

G E Hinton, R R Salakhutdinov (2006)

High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

0 comments Cited 1674 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: not found

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, Nick Ryder … (2020)

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general. 40+32 pages

0 comments Cited 1000 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Journal

Title: Proceedings of the Design Society

Abbreviated Title: Proc. Des. Soc.

Publisher: Cambridge University Press (CUP)

ISSN (Electronic): 2732-527X

Publication date Created: May 2022

Publication date (Electronic): May 26 2022

Publication date (Print): May 2022

Volume: 2

Pages: 1825-1834

Article

DOI: 10.1017/pds.2022.185

SO-VID: d4baaad9-7e75-4b02-9d71-35cf69c611b6

License:

http://creativecommons.org/licenses/by-nc-nd/4.0/

History

Data availability:

Comments

Comment on this article

scite_

Smart Citations

Citing PublicationsSupportingMentioningContrasting

View Citations

See how this article has been cited at scite.ai

scite shows how a scientific paper has been cited by providing the context of the citation, a classification describing whether it supports, mentions, or contrasts the cited claim, and a label indicating in which section the citation was made.