Distributed Representations, Simple Recurrent Networks, And Grammatical Structure

Abstract

In this paper three problems for a connectionist account of language are considered:

1. What is the nature of linguistic representations?

2. How can complex structural relationships such as constituent structure be represented?

3. How can the apparently open-ended nature of language be accommodated by a fixed-resource system?

Using a prediction task, a simple recurrent network (SRN) is trained on multiclausal sentences which contain multiply-embedded relative clauses. Principal component analysis of the hidden unit activation patterns reveals that the network solves the task by developing complex distributed representations which encode the relevant grammatical relations and hierarchical constituent structure. Differences between the SRN state representations and the more traditional pushdown store are discussed in the final section.
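The procedure described above (run an SRN over a sentence word by word, collect the hidden unit activations, then examine them with principal component analysis) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy vocabulary, the layer sizes, the random untrained weights, the logistic hidden units, the softmax output, and the helper names `run_srn` and `first_principal_component`. Only the first principal component is extracted, using power iteration, so that the example needs nothing beyond the standard library.

```python
import math
import random

# Toy lexicon and layer size: illustrative assumptions, not the paper's setup.
VOCAB = ["boy", "boys", "who", "chases", "chase", "dog", "dogs", "."]
N_IN = len(VOCAB)
N_HID = 6

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Untrained random weights; a real model would learn these by backpropagation.
W_in = rand_matrix(N_HID, N_IN)    # input -> hidden
W_ctx = rand_matrix(N_HID, N_HID)  # context (previous hidden) -> hidden
W_out = rand_matrix(N_IN, N_HID)   # hidden -> output


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(z):
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def run_srn(words):
    """Return (hidden_states, next_word_predictions), one entry per input word."""
    context = [0.0] * N_HID
    hiddens, preds = [], []
    for w in words:
        x = [1.0 if v == w else 0.0 for v in VOCAB]  # one-hot input
        net = [a + b for a, b in zip(matvec(W_in, x), matvec(W_ctx, context))]
        hidden = [1.0 / (1.0 + math.exp(-n)) for n in net]  # logistic units
        preds.append(softmax(matvec(W_out, hidden)))
        hiddens.append(hidden)
        context = hidden  # context units copy the hidden layer for the next step
    return hiddens, preds


def first_principal_component(rows, iters=200):
    """Power iteration on the covariance of rows; returns (unit direction, centered rows)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0] * d
    for _ in range(iters):
        # v <- X^T (X v), which is proportional to the covariance matrix times v
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        v = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    return v, centered


sentence = ["boys", "who", "chase", "dogs", "chase", "dog", "."]
hiddens, preds = run_srn(sentence)
pc1, centered = first_principal_component(hiddens)
# Project each hidden state onto the first principal component.
trajectory = [sum(c * p for c, p in zip(row, pc1)) for row in centered]
for word, score in zip(sentence, trajectory):
    print(f"{word:>6s}  PC1 = {score:+.3f}")
```

With untrained weights the printed trajectory carries no linguistic meaning. The sketch only shows the mechanics: the hidden state at each word depends on the current input and on the copied-back previous hidden state, and the sequence of hidden states is what gets projected onto principal components.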

Author information

Authors and Affiliations

Departments of Cognitive Science and Linguistics, University of California, San Diego.
Jeffrey L. Elman

About this article

Cite this article

Elman, J.L. Distributed Representations, Simple Recurrent Networks, And Grammatical Structure. Machine Learning 7, 195–225 (1991). https://doi.org/10.1023/A:1022699029236

Issue date: September 1991
DOI: https://doi.org/10.1023/A:1022699029236

URL: https://link.springer.com/article/10.1023/A:1022699029236
